AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents
Yixiang Zhang, Xinhao Deng, Jiaqing Wu, Yue Xiao, Ke Xu + 1 more
TLDR
AgentWard is a lifecycle security architecture for autonomous AI agents, providing defense-in-depth across five stages to prevent threat propagation.
Key contributions
- Organizes defense-in-depth across five agent lifecycle stages: initialization, input processing, memory, decision-making, and execution.
- Integrates stage-specific, heterogeneous controls with cross-layer coordination to intercept threats.
- Safeguards critical assets and provides a blueprint for runtime security and trust management.
- Implemented a plugin-native prototype on OpenClaw, demonstrating practical feasibility.
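The layered design above can be pictured as a pipeline of stage-specific guards through which every agent event must pass, with trust and threat flags propagating downstream so later layers can react to what earlier ones detected. The sketch below is purely illustrative and not taken from the AgentWard codebase; all names (`AgentEvent`, the `*_guard` functions, the trust-score heuristic) are hypothetical stand-ins for the paper's stage-specific controls and cross-layer coordination.

```python
from dataclasses import dataclass, field

@dataclass
class AgentEvent:
    """An event flowing through the agent lifecycle (hypothetical model)."""
    payload: str
    trust: float = 1.0                      # degraded as risk is detected
    flags: list = field(default_factory=list)

def init_guard(ev: AgentEvent) -> AgentEvent:
    # Stage 1 (initialization): vet loaded skills/plugins before activation.
    if "unsigned_skill" in ev.payload:
        ev.flags.append("init:unsigned-skill")
        ev.trust *= 0.5
    return ev

def input_guard(ev: AgentEvent) -> AgentEvent:
    # Stage 2 (input): screen ingested external content for injection patterns.
    if "ignore previous instructions" in ev.payload.lower():
        ev.flags.append("input:prompt-injection")
        ev.trust *= 0.2
    return ev

def memory_guard(ev: AgentEvent) -> AgentEvent:
    # Stage 3 (memory): taint-tag content so it is not replayed as trusted later.
    if ev.flags:
        ev.flags.append("memory:tainted")
    return ev

def decision_guard(ev: AgentEvent) -> AgentEvent:
    # Stage 4 (decision): require plan review when earlier layers raised concerns.
    if ev.trust < 0.8:
        ev.flags.append("decision:review-required")
    return ev

def execution_guard(ev: AgentEvent) -> AgentEvent:
    # Stage 5 (execution): containment -- block privileged actions below a trust floor.
    if ev.trust < 0.5:
        raise PermissionError(f"blocked at execution: {ev.flags}")
    return ev

PIPELINE = [init_guard, input_guard, memory_guard, decision_guard, execution_guard]

def run_lifecycle(payload: str) -> AgentEvent:
    ev = AgentEvent(payload=payload)
    for guard in PIPELINE:
        ev = guard(ev)   # cross-layer coordination: trust and flags propagate
    return ev
```

Under this toy model, a benign request traverses all five layers untouched, while an injected instruction is flagged at the input layer, tainted in memory, marked for review at decision time, and finally contained at execution, mirroring the paper's idea of intercepting threats along their propagation path rather than at a single interface.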
Why it matters
Autonomous AI agents face complex security challenges where failures propagate across their lifecycle. AgentWard offers a systematic, defense-in-depth architecture to intercept threats and safeguard assets. This is crucial for building robust, secure AI agents and preventing harmful real-world effects.
Original Abstract
Autonomous AI agents extend large language models into full runtime systems that load skills, ingest external content, maintain memory, plan multi-step actions, and invoke privileged tools. In such systems, security failures rarely remain confined to a single interface; instead, they can propagate across initialization, input processing, memory, decision-making, and execution, often becoming apparent only when harmful effects materialize in the environment. This paper presents AgentWard, a lifecycle-oriented, defense-in-depth architecture that systematically organizes protection across these five stages. AgentWard integrates stage-specific, heterogeneous controls with cross-layer coordination, enabling threats to be intercepted along their propagation paths while safeguarding critical assets. We detail the design rationale and architecture of five coordinated protection layers, and implement a plugin-native prototype on OpenClaw to demonstrate practical feasibility. This perspective provides a concrete blueprint for structuring runtime security controls, managing trust propagation, and enforcing execution containment in autonomous AI agents. Our code is available at https://github.com/FIND-Lab/AgentWard.