arXiv TLDR

Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study

arXiv: 2604.27464

Luyao Xu, Xiang Chen

cs.CR cs.AI

TLDR

This paper provides a layered review of security risks and defense strategies for LLM-based autonomous agent frameworks, using OpenClaw as a case study.

Key contributions

  • Presents a layered review of security risks and defenses in LLM-based autonomous agent frameworks.
  • Analyzes security across four layers (context and instruction, tool and action, state and persistence, and ecosystem and automation), using OpenClaw as a case study.
  • Identifies how threats propagate across layers, from manipulated inputs to unsafe actions and persistent state contamination.
  • Highlights key challenges, including research imbalance across layers, the lack of long-horizon evaluation, and weak ecosystem trust models.

Why it matters

As LLM-based autonomous agents become more complex, understanding their unique security risks beyond traditional prompt-level vulnerabilities is crucial. This paper provides a timely, systematic, and layered analysis, offering a foundational understanding for developing robust defenses.

Original Abstract

Autonomous agent frameworks built upon large language models (LLMs) are evolving into complex, tool-integrated, and continuously operating systems, introducing security risks beyond traditional prompt-level vulnerabilities. As this paradigm is still at an early stage of development, a timely and systematic understanding of its security implications is increasingly important. Although a growing body of work has examined different attack surfaces and defense problems in agent systems, existing studies remain scattered across individual aspects of agent security, and there is still a lack of a layered review on this topic. To address this gap, this survey presents a layered review of security risks and defense strategies in autonomous agent frameworks, with OpenClaw as a case study. We organize the analysis into four security-relevant layers: the context and instruction layer, the tool and action layer, the state and persistence layer, and the ecosystem and automation layer. For each layer, we summarize its functional role, representative security risks, and corresponding defense strategies. Based on this layered analysis, we further identify that threats in autonomous agent frameworks may propagate across layers, from manipulated inputs to unsafe actions, persistent state contamination, and broader ecosystem-level impact. Finally, we highlight potential key challenges, including research imbalance across layers, the lack of long-horizon evaluation, and weak ecosystem trust models, and outline future directions toward more systematic and integrated defenses.
