Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions
Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li, Nicole Hu, et al.
TLDR
This paper taxonomizes RAG security, distinguishing LLM-inherent risks from RAG-specific threats, and reviews attacks, defenses, and future directions.
Key contributions
- Defines secure RAG by focusing on the external knowledge-access pipeline, separating LLM-inherent risks.
- Classifies RAG attacks across four surfaces: knowledge corruption, retrieval manipulation, context exploitation, and exfiltration.
- Systematically reviews current RAG attacks, defenses, and benchmarks, highlighting their reactive nature.
- Identifies gaps and proposes future directions for layered, boundary-aware RAG security.
Why it matters
RAG introduces security risks that go beyond inherent LLM flaws. This survey provides a taxonomy of RAG-specific threats, shows that current defenses are largely reactive and fragmented, and outlines directions for proactive, layered security across the knowledge-access lifecycle.
Original Abstract
Retrieval-augmented generation (RAG) significantly enhances large language models (LLMs) but introduces novel security risks through external knowledge access. While existing studies cover various RAG vulnerabilities, they often conflate inherent LLM risks with those specifically introduced by RAG. In this paper, we propose that secure RAG is fundamentally about the security of the external knowledge-access pipeline. We establish an operational boundary to separate inherent LLM flaws from RAG-introduced or RAG-amplified threats. Guided by this perspective, we abstract the RAG workflow into six stages and organize the literature around three trust boundaries and four primary security surfaces, including pre-retrieval knowledge corruption, retrieval-time access manipulation, downstream context exploitation, and knowledge exfiltration. By systematically reviewing the corresponding attacks, defenses, remediation mechanisms, and evaluation benchmarks, we reveal that current defenses remain largely reactive and fragmented. Finally, we discuss these gaps and highlight future directions toward layered, boundary-aware protection across the entire knowledge-access lifecycle.
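The first security surface, pre-retrieval knowledge corruption, can be illustrated with a toy example: an attacker plants a document in the knowledge base that embeds the expected query verbatim, so a similarity-based retriever ranks the poisoned payload above the truthful source. The sketch below is purely illustrative and not from the paper; the bag-of-words cosine retriever and all document contents are assumptions standing in for a real embedding-based pipeline.

```python
# Illustrative sketch of pre-retrieval knowledge corruption in a toy RAG
# pipeline. The retriever here is a simple bag-of-words cosine ranker,
# standing in for a real dense retriever; all names/texts are hypothetical.
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': token-count vector (bag of words)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)          # missing keys count as 0
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    """Return the top-k documents by cosine similarity to the query."""
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

corpus = ["the capital of france is paris"]
query = "what is the capital of france"

# Attack: the poisoned document repeats the anticipated query verbatim
# to maximize retrieval similarity, then attaches a false payload.
poison = "what is the capital of france the capital of france is berlin"
corpus.append(poison)

top = retrieve(query, corpus, k=1)[0]  # the poisoned document wins retrieval
```

Because the poisoned text overlaps the query more strongly than the truthful document does, it is retrieved first and its false claim enters the generator's context, which is exactly the kind of RAG-introduced threat the paper's operational boundary is meant to isolate from inherent LLM flaws.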