Securing the Agent: Vendor-Neutral, Multitenant Enterprise Retrieval and Tool Use

May 6, 2026

arXiv:2605.05287

Francisco Javier Arceo, Varsha Prasad Narsing

cs.CR, cs.AI, cs.IR, cs.SE

TLDR

This paper introduces a layered isolation architecture that secures multitenant enterprise RAG and agentic AI systems against cross-tenant data leakage.

Key contributions

Formalizes the critical gap where RAG systems rank by relevance, not authorization, causing cross-tenant data leakage.
Proposes a layered isolation architecture with policy-aware ingestion and retrieval-time gating for secure multitenancy.
Enforces security through server-side agentic orchestration, centralizing authorization and state isolation.
Validates the architecture with an open-source implementation in OGX; in the authors' evaluation, ABAC gating eliminates cross-tenant leakage with negligible overhead.
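
The gap named in the first contribution, and the retrieval-time gating proposed to close it, can be illustrated with a small sketch. It is not the paper's or OGX's implementation: the `Chunk` and `Principal` types, the attribute names (`tenant`, `roles`), the scores, and the rule in `is_authorized` are all hypothetical and chosen only to show the idea of filtering by policy attributes attached at ingestion before ranking by relevance.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float   # relevance score from any retriever (made-up values here)
    attrs: dict    # policy attributes attached at ingestion time (hypothetical schema)


@dataclass
class Principal:
    tenant: str
    roles: frozenset


def is_authorized(principal: Principal, chunk: Chunk) -> bool:
    """Hypothetical ABAC rule: same tenant, and any required role is held."""
    if chunk.attrs.get("tenant") != principal.tenant:
        return False
    required = set(chunk.attrs.get("roles", ()))
    return not required or bool(required & principal.roles)


def ungated_top_k(chunks, k):
    # Relevance-only ranking: authorization is never consulted.
    return sorted(chunks, key=lambda c: c.score, reverse=True)[:k]


def gated_top_k(principal, chunks, k):
    # Retrieval-time gating: drop unauthorized chunks, then rank what remains.
    allowed = [c for c in chunks if is_authorized(principal, c)]
    return ungated_top_k(allowed, k)


index = [
    Chunk("Acme Q3 pricing memo", 0.93, {"tenant": "acme", "roles": ["finance"]}),
    Chunk("Globex pricing FAQ", 0.71, {"tenant": "globex"}),
    Chunk("Globex payroll export", 0.64, {"tenant": "globex", "roles": ["hr"]}),
]
user = Principal(tenant="globex", roles=frozenset({"support"}))

leaky = ungated_top_k(index, 1)    # top hit belongs to another tenant
safe = gated_top_k(user, index, 1)  # only chunks the caller may see are ranked
```

In this toy index the highest-scoring chunk belongs to a different tenant, so relevance-only ranking surfaces it, while the gated version returns only the chunk the caller is entitled to read.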

Why it matters

Enterprise AI deployments face significant security challenges, especially with multitenant data and strict access controls. This paper offers an architectural approach to preventing cross-tenant data leakage in RAG and agentic systems that run on shared infrastructure. It targets the access-control, compliance, and cost pressures that enterprise deployments face.

Original Abstract

Retrieval-Augmented Generation (RAG) and agentic AI systems are increasingly prevalent in enterprise AI deployments. However, real enterprise environments introduce challenges largely absent from academic treatments and consumer-facing APIs: multiple tenants with heterogeneous data, strict access-control requirements, regulatory compliance, and cost pressures that demand shared infrastructure. A fundamental problem underlies existing RAG architectures in these settings: retrieval systems rank documents by relevance--whether through semantic similarity, keyword matching, or hybrid approaches--not by authorization, so a query from one tenant can surface another tenant's confidential data simply because it scores highest. We formalize this gap and analyze additional shortcomings--including tool-mediated disclosure, context accumulation across turns, and client-side orchestration bypass--that arise when agentic systems conflate relevance with authorization. To address these challenges, we introduce a layered isolation architecture combining policy-aware ingestion, retrieval-time gating, and shared inference, enforced through server-side agentic orchestration. This approach centralizes security-critical operations--tool execution authorization, state isolation, and policy enforcement--on the server, creating natural enforcement points for multitenant isolation while allowing client-side frameworks to retain control over agent composition and latency-sensitive operations. We validate the proposed architecture through an open-source implementation in OGX, a vendor-neutral framework that implements an OpenAI-compatible, open-source Responses API with server-side multi-turn orchestration. We evaluate it empirically and show that ABAC gating eliminates cross-tenant leakage while introducing negligible overhead.
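
As a rough illustration of what the abstract means by centralizing tool execution authorization and state isolation on the server, the sketch below checks a per-tenant tool policy before any tool runs and keys conversation state by tenant. It is an assumption-laden toy, not OGX's Responses API: the `Orchestrator` class, the `ToolDenied` exception, the policy shape, and the tool names are all hypothetical.

```python
from collections import defaultdict


class ToolDenied(Exception):
    """Raised when a tenant requests a tool its policy does not allow."""


class Orchestrator:
    """Hypothetical server-side loop: the server, not the client, runs tools."""

    def __init__(self, tools, policy):
        self._tools = tools                # tool name -> callable
        self._policy = policy              # tenant -> set of allowed tool names
        self._state = defaultdict(list)    # (tenant, conversation_id) -> turn log

    def call_tool(self, tenant, conversation_id, name, **kwargs):
        # Authorization happens before execution, at a single enforcement point.
        if name not in self._policy.get(tenant, set()):
            raise ToolDenied(f"{tenant} may not call {name}")
        result = self._tools[name](**kwargs)
        # State is keyed by tenant, so equal conversation ids never collide.
        self._state[(tenant, conversation_id)].append((name, result))
        return result

    def history(self, tenant, conversation_id):
        # .get avoids creating empty entries for unknown conversations.
        return list(self._state.get((tenant, conversation_id), []))


tools = {
    "search_docs": lambda query: f"results for {query!r}",
    "export_payroll": lambda: "payroll.csv",
}
policy = {"acme": {"search_docs", "export_payroll"}, "globex": {"search_docs"}}
server = Orchestrator(tools, policy)

server.call_tool("acme", "c1", "export_payroll")
server.call_tool("globex", "c1", "search_docs", query="pricing")

try:
    server.call_tool("globex", "c1", "export_payroll")
    denied = False
except ToolDenied:
    denied = True
```

Because the check and the state store both live on the server, a client that composes its own agent loop cannot skip the policy check or read another tenant's accumulated context in this sketch.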

