Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation
Di Lu, Bo Zhang, Xiyuan Li, Yongzhi Liao, Xuewen Dong + 3 more
TLDR
This paper proposes TEE-backed isolation to constrain host-level abuse in self-hosted computer-use agents, so that unsafe or policy-disallowed operations are blocked before execution.
Key contributions
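- Addresses host-level abuse of self-hosted computer-use agents (SHCUAs) that are steered toward unsafe operations by malicious messages, indirect prompt injection, unsafe skills, or tampering along the host-side control path.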
- Introduces an operation-centric model for risk-based confinement using TEEs.
- Protects critical decisions (classification, authorization) within a TEE-backed trusted plane.
- Instantiates the architecture on OpenClaw using Intel TDX for trusted execution.
Why it matters
Self-hosted agents offer powerful automation, but their direct access to host resources exposes them to host-level abuse. This work moves security-critical decisions into a TEE-backed trusted plane, so that unsafe or policy-disallowed operations can be blocked before execution while allowed workloads keep working, at a deployment-dependent overhead. That matters for the growing ecosystem of AI agents with direct system access.
Original Abstract
Self-hosted computer-use agents (SHCUAs), such as OpenClaw, combine natural-language interaction with direct access to host-side resources, including browsers, files, scripts, system commands, and external communication channels. While useful for automating real tasks, this capability also creates a host-level abuse surface: a legitimately deployed agent may be steered toward unsafe operations through malicious messages, indirect prompt injection, unsafe skills, or tampering along the host-side control path. We argue that such risks cannot be addressed by ad hoc blocking rules alone, because the security criticality of an operation depends jointly on its action type, target object, execution context, and potential effect. This paper presents an operation-centric model for risk-based confinement of SHCUA operations. The proposed design keeps ordinary functionality on the constrained REE path, while protecting security-critical classification, authorization, binding, evidence generation, and selected execution-control decisions inside a cloud-native TEE-backed trusted operation plane. We instantiate the architecture on OpenClaw using Intel TDX as the primary trusted backend, with remote terminal-side trusted components verifying TDX-audited commands before constrained local execution. The evaluation shows that the design can block unsafe or policy-disallowed operations before execution, preserve ordinary functionality for allowed workloads, and provide auditable evidence with deployment-dependent overhead.
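The abstract's central idea, that an operation's criticality depends jointly on action type, target object, execution context, and potential effect, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper or from OpenClaw: the names `Operation`, `Risk`, `classify`, `authorize`, and `verify`, the specific rules, and the field values are all hypothetical. An HMAC over the decision record stands in for the evidence that a TEE-backed plane would produce; a real deployment would presumably rely on attestation-backed keys rather than a shared secret.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass
from enum import IntEnum


class Risk(IntEnum):
    LOW = 0
    HIGH = 1
    FORBIDDEN = 2


@dataclass(frozen=True)
class Operation:
    """Hypothetical description of one agent operation."""
    action: str   # e.g. "read_file", "run_command", "send_message"
    target: str   # object acted upon, e.g. a path or a URL
    context: str  # origin of the request, e.g. "user_prompt", "web_content"
    effect: str   # anticipated effect: "read_only", "modify", "exfiltrate"


def classify(op: Operation) -> Risk:
    """Illustrative rules only: risk is derived from the four fields jointly."""
    if op.effect == "exfiltrate" and op.context != "user_prompt":
        return Risk.FORBIDDEN
    if op.action == "run_command" or op.effect == "modify":
        return Risk.HIGH
    if op.target.startswith("/etc/"):
        return Risk.HIGH
    return Risk.LOW


def authorize(op: Operation, key: bytes, approved_high: bool = False) -> dict:
    """Decide on an operation and bind the decision to it with an HMAC tag."""
    risk = classify(op)
    allowed = risk == Risk.LOW or (risk == Risk.HIGH and approved_high)
    record = {"operation": asdict(op), "risk": risk.name, "allowed": allowed}
    payload = json.dumps(record, sort_keys=True).encode()
    record["tag"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return record


def verify(record: dict, key: bytes) -> bool:
    """Executor-side check that a decision record was not altered."""
    body = {k: v for k, v in record.items() if k != "tag"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])
```

In this sketch the executor would run an operation only if `verify` succeeds and the record's `allowed` field is true, which loosely mirrors the abstract's description of terminal-side components checking audited commands before constrained local execution.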