
AutonomyLens: A Self-Evolving Simulation-Based Testing Loop for Autonomous Systems

2604.11672

Ankit Agrawal, Jithin Garapati, Bohan Zhang

cs.SE

TLDR

AutonomyLens is an early-stage, LLM-driven framework that unifies scenario design, simulation, and telemetry analysis for autonomous system validation, with the aim of improving traceability and reproducibility.

Key contributions

  • Integrates scenario specification, simulation execution, and telemetry analysis into a unified validation workflow.
  • Introduces a structured representation for mission-level autonomous system scenarios.
  • Provides automated execution and context-aware analysis of system behavior.
  • Generates counterfactual scenarios from observed failures to refine and synthesize new test cases.
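
The closed loop described in the contributions above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' design: the `Scenario` fields, the linear `toy_simulator` standing in for a real simulator, and the fixed wind-scaling rule in `counterfactuals` (the paper uses an LLM for this step; the sketch does not).

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Scenario:
    """Hypothetical structured scenario: one environment knob, one requirement."""
    name: str
    wind_speed_mps: float
    max_drift_m: float  # requirement: position drift must stay at or below this


@dataclass
class Result:
    scenario: Scenario
    drift_m: float

    @property
    def passed(self) -> bool:
        return self.drift_m <= self.scenario.max_drift_m


def toy_simulator(s: Scenario) -> Result:
    # Stand-in for a real simulator: drift grows linearly with wind speed.
    return Result(s, drift_m=0.5 * s.wind_speed_mps)


def counterfactuals(failed: Result) -> List[Scenario]:
    # Illustrative rule: retry the failing scenario at reduced wind speeds
    # to probe where the failure boundary lies.
    s = failed.scenario
    return [
        replace(s, name=f"{s.name}/wind={w:g}", wind_speed_mps=w)
        for w in (s.wind_speed_mps * 0.5, s.wind_speed_mps * 0.75)
    ]


def validation_loop(
    seeds: List[Scenario],
    simulate: Callable[[Scenario], Result],
    max_rounds: int = 3,
) -> List[Result]:
    """Run scenarios, then feed counterfactuals of failures into the next round."""
    results: List[Result] = []
    queue = list(seeds)
    for _ in range(max_rounds):
        next_queue: List[Scenario] = []
        for scenario in queue:
            result = simulate(scenario)
            results.append(result)
            if not result.passed:
                next_queue.extend(counterfactuals(result))
        if not next_queue:
            break
        queue = next_queue
    return results


if __name__ == "__main__":
    seed = Scenario("hover", wind_speed_mps=12.0, max_drift_m=4.0)
    for r in validation_loop([seed], toy_simulator):
        print(r.scenario.name, r.drift_m, "PASS" if r.passed else "FAIL")
```

Each result keeps a reference to the scenario that produced it, which is one simple way to preserve the link between a requirement, its test, and the resulting evidence.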

Why it matters

Validation of autonomous systems is currently fragmented across scenario design, simulation execution, and telemetry analysis, which limits reproducibility and slows debugging and iteration. AutonomyLens targets this fragmentation by unifying the workflow. The paper presents an early-stage design and outlines how the approach could improve traceability, reproducibility, and scalability, which matters for systematically assuring autonomous systems under complex, evolving conditions.

Original Abstract

Software engineering practices for validating autonomous cyber-physical systems (e.g., Uncrewed Aerial Vehicles) remain fragmented across scenario design, simulation execution, and telemetry analysis, limiting traceability between requirements, tests, and evidence. This fragmentation reduces reproducibility, slows debugging and iteration, and hinders systematic assurance under complex and evolving environmental conditions. We present AutonomyLens, an LLM-driven framework that integrates scenario specification, simulation execution, and telemetry analysis into a unified validation workflow. AutonomyLens enables developers to translate high-level validation intent into executable, temporally evolving scenarios, automatically run simulations, and perform context-aware analysis of resulting system behavior. The framework introduces (i) a structured representation for mission-level scenarios, (ii) an automated execution pipeline, (iii) analysis mechanisms that align telemetry with scenario context to produce actionable insights, and (iv) counterfactual scenario generation that closes the loop by refining and synthesizing new test cases from observed failures. We describe the early-stage design of AutonomyLens, discuss key challenges in building integrated validation workflows for autonomous systems, and outline how such an approach can improve traceability, reproducibility, and scalability in autonomy validation.
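
One way to picture "aligning telemetry with scenario context" is to tag each telemetry sample with the scenario phase active at its timestamp. The sketch below is an assumption-laden illustration, not the paper's mechanism: the phase labels, the `(time, altitude error)` sample layout, and the helper names `align` and `worst_by_phase` are all hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Phase = Tuple[float, str]     # (start_time_s, label); list must be sorted by start time
Sample = Tuple[float, float]  # (time_s, altitude_error_m)


def align(phases: List[Phase], telemetry: List[Sample]) -> List[Tuple[float, str, float]]:
    """Attach to each sample the label of the phase that was active at its timestamp."""
    starts = [start for start, _ in phases]
    aligned = []
    for t, value in telemetry:
        i = bisect_right(starts, t) - 1  # index of the last phase starting at or before t
        label = phases[i][1] if i >= 0 else "pre-scenario"
        aligned.append((t, label, value))
    return aligned


def worst_by_phase(aligned: List[Tuple[float, str, float]]) -> Dict[str, float]:
    """Largest absolute error observed in each phase."""
    worst: Dict[str, float] = {}
    for _, label, value in aligned:
        worst[label] = max(worst.get(label, 0.0), abs(value))
    return worst


if __name__ == "__main__":
    phases = [(0.0, "takeoff"), (10.0, "gust"), (25.0, "landing")]
    telemetry = [(2.0, 0.1), (12.0, 1.8), (20.0, -2.4), (30.0, 0.3)]
    print(worst_by_phase(align(phases, telemetry)))
```

Grouping by phase turns a raw error trace into a statement such as "the largest altitude error occurred during the gust phase", which is the kind of context a follow-up test case can be built from.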
