AgenTEE: Confidential LLM Agent Execution on Edge Devices
Sina Abdollahi, Mohammad M Maheri, Javad Forough, Amir Al Sadi, Josh Millar + 3 more
TLDR
AgenTEE runs LLM agents confidentially on edge devices by isolating the agent runtime, inference engine, and third-party applications in independently attested confidential VMs, with less than 5.15% runtime overhead.
Key contributions
- Deploys confidential LLM agent pipelines securely on edge devices.
- Isolates agent runtime, inference, and third-party apps in independently attested cVMs.
- Mediates cVM interaction via explicit, verifiable communication channels.
- Achieves near-native performance, with less than 5.15% runtime overhead compared to commodity OS multi-process deployments.
Why it matters
LLM agents on edge devices offer privacy and low latency but pose significant security risks to sensitive assets. AgenTEE provides a practical system to secure these complex pipelines using confidential VMs, enabling safer deployment. This advancement supports the broader adoption of powerful LLM agents in privacy-sensitive and resource-constrained environments.
Original Abstract
Large Language Model (LLM) agents provide powerful automation capabilities, but they also create a substantially broader attack surface than traditional applications due to their tight integration with non-deterministic models and third-party services. While current deployments primarily rely on cloud-hosted services, emerging designs increasingly execute agents directly on edge devices to reduce latency and enhance user privacy. However, securely hosting such complex agent pipelines on edge devices remains challenging. These deployments must protect proprietary assets (e.g., system prompts and model weights) and sensitive runtime state on heterogeneous platforms that are vulnerable to software attacks and potentially controlled by malicious users. To address these challenges, we present AgenTEE, a system for deploying confidential agent pipelines on edge devices. AgenTEE places the agent runtime, inference engine, and third-party applications into independently attested confidential virtual machines (cVMs) and mediates their interaction through explicit, verifiable communication channels. Built on Arm Confidential Compute Architecture (CCA), a recent extension to Arm platforms, AgenTEE enforces strong system-level isolation of sensitive assets and runtime state. Our evaluation shows that such a multi-cVM system is practical, achieving near-native performance with less than 5.15% runtime overhead compared to commodity OS multi-process deployments.
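The abstract's central idea, that each component runs in its own attested cVM and that components talk only over explicit channels, can be illustrated with a small simulation. This is a sketch under stated assumptions, not AgenTEE's implementation and not the Arm CCA attestation API. Every name here (`AttestationToken`, `sign_token`, `verify_token`, `open_channel`, `ALLOWED_CHANNELS`, the role strings) is hypothetical, and an HMAC with a demo key stands in for the hardware-rooted signature a real platform would produce.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Stand-in for a hardware-rooted attestation key (assumption: real platforms
# sign tokens with an asymmetric key that never leaves the hardware).
PLATFORM_KEY = b"demo-platform-key"

# Hypothetical policy: only these (source, destination) role pairs may talk.
ALLOWED_CHANNELS = {
    ("agent-runtime", "inference-engine"),
    ("agent-runtime", "third-party-app"),
}


def measure(image: bytes) -> str:
    """Hash of a cVM's initial image, standing in for its measurement."""
    return hashlib.sha256(image).hexdigest()


@dataclass(frozen=True)
class AttestationToken:
    role: str
    measurement: str
    nonce: bytes       # freshness challenge chosen by the verifier
    signature: bytes   # HMAC here; a real token carries a platform signature


def _message(role: str, measurement: str, nonce: bytes) -> bytes:
    return role.encode() + b"|" + measurement.encode() + b"|" + nonce


def sign_token(role: str, image: bytes, nonce: bytes) -> AttestationToken:
    """Simulate the platform issuing a token for a launched cVM."""
    m = measure(image)
    sig = hmac.new(PLATFORM_KEY, _message(role, m, nonce), hashlib.sha256).digest()
    return AttestationToken(role, m, nonce, sig)


def verify_token(token: AttestationToken, expected: dict, nonce: bytes) -> bool:
    """Check signature, freshness, and that the measurement matches the reference."""
    good = hmac.new(
        PLATFORM_KEY,
        _message(token.role, token.measurement, token.nonce),
        hashlib.sha256,
    ).digest()
    return (
        hmac.compare_digest(good, token.signature)
        and hmac.compare_digest(token.nonce, nonce)
        and expected.get(token.role) == token.measurement
    )


def open_channel(src: AttestationToken, dst: AttestationToken,
                 expected: dict, nonce: bytes) -> bool:
    """Allow a channel only if policy permits it and both ends attest correctly."""
    return (
        (src.role, dst.role) in ALLOWED_CHANNELS
        and verify_token(src, expected, nonce)
        and verify_token(dst, expected, nonce)
    )
```

In this sketch a channel is refused in three cases: when the role pair is not in the policy, when either token's measurement differs from the reference value, or when a token was issued for a different nonce. Treating the channel policy as an explicit allow-list mirrors the abstract's "explicit, verifiable communication channels", but the concrete policy format is an assumption.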