ArXiv TLDR

Securing the Agent: Vendor-Neutral, Multitenant Enterprise Retrieval and Tool Use

2605.05287

Francisco Javier Arceo, Varsha Prasad Narsing

cs.CR, cs.AI, cs.IR, cs.SE

TLDR

This paper introduces a layered isolation architecture for securing multitenant enterprise RAG and agentic AI systems against cross-tenant data leakage.

Key contributions

  • Formalizes the critical gap where RAG systems rank by relevance, not authorization, causing cross-tenant data leakage.
  • Proposes a layered isolation architecture with policy-aware ingestion and retrieval-time gating for secure multitenancy.
  • Enforces security through server-side agentic orchestration, centralizing authorization and state isolation.
  • Validates the architecture with an open-source implementation in OGX, reporting that ABAC gating eliminates cross-tenant leakage with negligible overhead.
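
The gap named in the first contribution, and the retrieval-time gating that closes it, can be illustrated with a minimal Python sketch. It is not the paper's or OGX's implementation: the `Chunk` type, the `tenant_id` field, the scores, and both function names are hypothetical, and the policy is reduced to a simple tenant match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    tenant_id: str   # hypothetical ownership label attached at ingestion
    score: float     # relevance score from any retriever (made-up values)


def ungated_top_k(candidates, k=3):
    """Rank purely by relevance: authorization is never consulted."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:k]


def gated_top_k(candidates, caller_tenant, k=3):
    """Drop chunks the caller may not see, then rank what remains."""
    allowed = [c for c in candidates if c.tenant_id == caller_tenant]
    return sorted(allowed, key=lambda c: c.score, reverse=True)[:k]


candidates = [
    Chunk("Acme Q3 revenue forecast", "acme", 0.94),
    Chunk("Globex onboarding guide", "globex", 0.71),
    Chunk("Globex pricing notes", "globex", 0.65),
]

# A Globex user's query: without gating, the top hit belongs to Acme.
leaked = ungated_top_k(candidates, k=1)
safe = gated_top_k(candidates, caller_tenant="globex", k=1)
```

In this toy data the ungated top result is the Acme document simply because it scores highest, while the gated call can only return Globex chunks. Filtering before ranking, rather than after truncating to `k`, also keeps the caller from receiving fewer results than are actually available to them.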

Why it matters

Enterprise AI deployments face significant security challenges, especially with multitenant data and strict access controls. This paper proposes an architecture that targets cross-tenant data leakage in RAG and agentic systems while still sharing infrastructure across tenants, which speaks to the compliance requirements and cost pressures such deployments face.

Original Abstract

Retrieval-Augmented Generation (RAG) and agentic AI systems are increasingly prevalent in enterprise AI deployments. However, real enterprise environments introduce challenges largely absent from academic treatments and consumer-facing APIs: multiple tenants with heterogeneous data, strict access-control requirements, regulatory compliance, and cost pressures that demand shared infrastructure. A fundamental problem underlies existing RAG architectures in these settings: retrieval systems rank documents by relevance--whether through semantic similarity, keyword matching, or hybrid approaches--not by authorization, so a query from one tenant can surface another tenant's confidential data simply because it scores highest. We formalize this gap and analyze additional shortcomings--including tool-mediated disclosure, context accumulation across turns, and client-side orchestration bypass--that arise when agentic systems conflate relevance with authorization. To address these challenges, we introduce a layered isolation architecture combining policy-aware ingestion, retrieval-time gating, and shared inference, enforced through server-side agentic orchestration. This approach centralizes security-critical operations--tool execution authorization, state isolation, and policy enforcement--on the server, creating natural enforcement points for multitenant isolation while allowing client-side frameworks to retain control over agent composition and latency-sensitive operations. We validate the proposed architecture through an open-source implementation in OGX, a vendor-neutral framework that implements an OpenAI-compatible, open-source Responses API with server-side multi-turn orchestration. We evaluate it empirically and show that ABAC gating eliminates cross-tenant leakage while introducing negligible overhead.
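
As an illustration of the server-side enforcement point the abstract describes, the following Python sketch shows an attribute-based check run before a tool call executes. It is a sketch under stated assumptions, not the OGX API: `Principal`, `TOOL_POLICY`, `authorize_tool_call`, and the role and tool names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Principal:
    user_id: str
    tenant_id: str
    roles: frozenset = field(default_factory=frozenset)


# Hypothetical policy table: tool name -> roles allowed to invoke it.
TOOL_POLICY = {
    "search_docs": frozenset({"analyst", "admin"}),
    "export_records": frozenset({"admin"}),
}


def authorize_tool_call(principal, tool_name, resource_tenant):
    """Return True only if the caller's attributes satisfy the policy.

    Assumed to run on the server, so a client-side agent framework
    cannot skip it by calling the tool directly.
    """
    # Attribute 1: the resource must belong to the caller's tenant.
    if principal.tenant_id != resource_tenant:
        return False
    # Attribute 2: the caller needs a role the tool accepts.
    # Tools missing from the table are denied by default.
    allowed_roles = TOOL_POLICY.get(tool_name, frozenset())
    return bool(principal.roles & allowed_roles)


alice = Principal("alice", "acme", frozenset({"analyst"}))
```

Here `authorize_tool_call(alice, "search_docs", "acme")` passes, while an export, a call against another tenant's resource, or a call to an unlisted tool is refused. Defaulting to deny for unknown tools is a design choice of this sketch; the paper's actual policy model may differ.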
