PRAG: End-to-End Privacy-Preserving Retrieval-Augmented Generation

April 29, 2026

arXiv:2604.26525

Zhijun Li, Minghui Xu, Huayi Qi, Wenxuan Yu, Tingchuang Zhang + 4 more

cs.CR

TLDR

PRAG offers end-to-end privacy for RAG in the cloud, ensuring high retrieval quality and scalability without exposing sensitive data.

Key contributions

End-to-end privacy for RAG documents and queries in cloud environments.
Dual-mode architecture: PRAG-I for low-latency, PRAG-II for high accuracy.
Operation-Error Estimation (OEE) stabilizes ranking against homomorphic noise.
Delivers competitive recall (72.45%-74.45%), practical retrieval latency, and strong resilience against graph reconstruction attacks.

Why it matters

Cloud-hosted RAG exposes sensitive documents and queries, and existing privacy-preserving approaches often sacrifice retrieval quality or encrypt only part of the pipeline. PRAG keeps both documents and queries confidential end to end while maintaining retrieval quality and scalability, which indicates that secure, high-performance RAG is feasible at scale.

Original Abstract

Retrieval-Augmented Generation (RAG) is essential for enhancing Large Language Models (LLMs) with external knowledge, but its reliance on cloud environments exposes sensitive data to privacy risks. Existing privacy-preserving solutions often sacrifice retrieval quality due to noise injection or only provide partial encryption. We propose PRAG, an end-to-end privacy-preserving RAG system that achieves end-to-end confidentiality for both documents and queries without sacrificing the scalability of cloud-hosted RAG. PRAG features a dual-mode architecture: a non-interactive PRAG-I utilizes homomorphic-friendly approximations for low-latency retrieval, while an interactive PRAG-II leverages client assistance to match the accuracy of non-private RAG. To ensure robust semantic ordering, we introduce Operation-Error Estimation (OEE), a mechanism that stabilizes ranking against homomorphic noise. Experiments on large-scale datasets demonstrate that PRAG achieves competitive recall (72.45%-74.45%), practical retrieval latency, and strong resilience against graph reconstruction attacks while maintaining end-to-end confidentiality. This work confirms the feasibility of secure, high-performance RAG at scale.
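
The abstract's concern that homomorphic noise can disturb semantic ordering can be illustrated with a small plaintext simulation. The sketch below is not PRAG, does not implement OEE, and performs no encryption: it assumes inner-product similarity, models noise as plain Gaussian perturbation of the scores, and measures recall as top-k overlap with the exact ranking. All function names and parameters are hypothetical and chosen only for illustration.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(scores, k):
    """Indices of the k highest scores; ties go to the lower index."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]


def recall_at_k(exact_ids, approx_ids):
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)


def noisy_scores(scores, noise_scale, rng):
    """Assumption: Gaussian noise stands in for approximation/encryption error."""
    return [s + rng.gauss(0.0, noise_scale) for s in scores]


# Synthetic data: random document embeddings and one random query.
rng = random.Random(0)
dim, n_docs, k = 16, 200, 10
docs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_docs)]
query = [rng.gauss(0, 1) for _ in range(dim)]

exact = [dot(query, d) for d in docs]
exact_top = top_k(exact, k)

# Larger noise makes it more likely that near-tied documents swap places.
for noise_scale in (0.0, 0.5, 2.0):
    approx_top = top_k(noisy_scores(exact, noise_scale, rng), k)
    print(f"noise={noise_scale}: recall@{k} = {recall_at_k(exact_top, approx_top)}")
```

With zero noise the approximate ranking matches the exact one, so recall is 1.0. As the noise scale grows relative to the gaps between neighbouring scores, the top-k set tends to drift. That is the kind of ranking instability a mechanism such as OEE is described as stabilizing, although how OEE does so is not specified in this summary.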

