Agentic Witnessing: Pragmatic and Scalable TEE-Enabled Privacy-Preserving Auditing
TLDR
Agentic Witnessing uses TEEs and LLMs for privacy-preserving, qualitative auditing of proprietary data, enabling verification without disclosure.
Key contributions
- Proposes "Agentic Witnessing" for privacy-preserving, qualitative auditing of proprietary data.
- Uses an LLM-based Auditor within a TEE to enable attested reasoning on private datasets.
- Allows Verifiers to query private data via simple Boolean questions, receiving cryptographic transcripts.
- Successfully demonstrated by automating artifact evaluation for the released codebases of 21 peer-reviewed papers.
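The "cryptographic transcript" in the contributions above is a signed hash chain binding the Auditor's reasoning trace to the dataset. A minimal sketch of that idea follows; the function name `build_transcript` and the use of an HMAC key as a stand-in for the TEE's hardware-rooted attestation key are assumptions for illustration, not the paper's implementation.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def build_transcript(dataset: bytes, question: str,
                     reasoning_steps: list[str], verdict: bool,
                     tee_key: bytes) -> dict:
    """Chain each reasoning step onto a hash anchored at the private dataset,
    then sign (question, verdict, chain head). `tee_key` is a hypothetical
    stand-in for the TEE's hardware root of trust (a real TEE would use an
    attestation key and signature scheme, not a shared-secret HMAC)."""
    chain = sha256_hex(dataset)  # anchor: binds the transcript to the dataset
    for step in reasoning_steps:
        # each link commits to the previous head and the step's content
        chain = sha256_hex((chain + sha256_hex(step.encode())).encode())
    record = f"{question}|{verdict}|{chain}"
    signature = hmac.new(tee_key, record.encode(), hashlib.sha256).hexdigest()
    return {"question": question, "verdict": verdict,
            "chain_head": chain, "signature": signature}
```

A Verifier holding only the transcript can check the signature over the verdict and chain head without ever seeing the dataset; tampering with any reasoning step changes the chain head and invalidates the signature.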
Why it matters
The paper addresses the challenge of auditing qualitative properties of proprietary data without compromising confidentiality, a setting where traditional ZKPs fall short because they are limited to precise algebraic constraints. Agentic Witnessing provides a pragmatic, scalable alternative: TEEs combined with LLMs for attested reasoning. This enables privacy-preserving oversight for complex systems and data governance.
Original Abstract
Auditing the semantic properties of proprietary data creates a fundamental tension: verification requires transparent access, while proprietary rights demand confidentiality. While Zero-Knowledge Proofs (ZKPs) ensure privacy, they are typically limited to precise algebraic constraints and are ill-suited for verifying qualitative, unstructured properties, such as the logic within a codebase. We propose *Agentic Witnessing*, a framework that moves verification from attested execution to *attested reasoning*. The system is composed of three agents: a Verifier (who wants to check properties of a dataset), a Prover (who owns the dataset), and an Auditor (that inspects the dataset). The Verifier is allowed to ask a limited number of simple binary true/false questions to the Auditor. By isolating an LLM-based Auditor within a Trusted Execution Environment (TEE), the system enables the Verifier to query a Prover's private data via simple Boolean queries, without exposing the raw dataset. The Auditor uses the Model Context Protocol (MCP) to dynamically inspect the target dataset, producing a yes/no verdict accompanied by a cryptographic transcript: a signed hash chain binding the reasoning trace to both the original dataset and the TEE's hardware root of trust. We demonstrate this architecture by automating the artifact evaluation process for 21 peer-reviewed computer science papers with released codebases on GitHub (e.g., "Does the codebase implement the system described in the paper?"). We verified five high-level properties of these codebases described in the corresponding publications, treating the source code as private. Our results show that TEE-enabled agentic auditing provides a mechanism for privacy-preserving oversight, effectively decoupling qualitative verification from the need for data disclosure.
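The abstract's Verifier/Prover/Auditor interaction can be sketched as a query-budgeted Boolean oracle. Everything here is a hypothetical illustration: the `Auditor` class and `ask` method are invented names, and a trivial keyword check stands in for the LLM's MCP-driven inspection of the dataset inside the TEE.

```python
class Auditor:
    """Sketch of the TEE-isolated Auditor (assumed interface, not the
    paper's code). The dataset never leaves the object; the Verifier only
    ever receives True/False answers, up to a fixed query budget."""

    def __init__(self, dataset: str, max_queries: int = 5):
        self._dataset = dataset.lower()  # private: stays inside the "TEE"
        self._budget = max_queries       # limited number of binary questions

    def ask(self, question: str) -> bool:
        if self._budget <= 0:
            raise RuntimeError("query budget exhausted")
        self._budget -= 1
        # Stand-in for attested LLM reasoning over the dataset: a naive
        # check that every keyword of the question appears in the data.
        return all(word in self._dataset for word in question.lower().split())
```

Usage: `Auditor("def train(model): ...", max_queries=2).ask("train")` returns `True`, and a third question on the same instance raises `RuntimeError`, mirroring the paper's bound on how much the Verifier can learn.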