S2G-RAG: Structured Sufficiency and Gap Judging for Iterative Retrieval-Augmented QA
Minghan Li, Junjie Zou, Xinxuan Lv, Chao Zhang, Guodong Zhou
TLDR
S2G-RAG improves multi-hop QA by using a judge to identify missing information and guide iterative retrieval, reducing noise and enhancing robustness.
Key contributions
- S2G-Judge predicts whether the current evidence is sufficient to answer and, if not, generates structured "gap items" describing the missing information.
- Maps gap items to retrieval queries, enabling stable multi-turn retrieval trajectories.
- Reduces noise accumulation by maintaining a compact, sentence-level Evidence Context.
- Improves multi-hop QA performance and robustness on TriviaQA, HotpotQA, and 2WikiMultiHopQA.
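The sentence-level Evidence Context can be illustrated with a small sketch. The summary does not say how relevant sentences are selected, so the token-overlap scoring, the naive sentence splitter, and the names `update_evidence_context`, `focus`, and `max_sentences` below are all assumptions made for illustration, not the authors' method.

```python
import re


def split_sentences(text):
    # Naive splitter on ., !, ? followed by whitespace (illustrative stand-in).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text):
    # Lowercased alphanumeric tokens as a set.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def update_evidence_context(evidence, documents, focus, max_sentences=5):
    """Add the most relevant new sentences to a compact evidence list.

    Relevance here is plain token overlap with `focus` (the question or
    the current query). This scoring rule is an assumption.
    """
    focus_tokens = tokens(focus)
    seen = {s.lower() for s in evidence}
    candidates = []
    for doc in documents:
        for sent in split_sentences(doc):
            if sent.lower() in seen:
                continue  # skip sentences already kept or already considered
            seen.add(sent.lower())
            score = len(tokens(sent) & focus_tokens)
            if score > 0:
                candidates.append((score, sent))
    # Stable sort: highest overlap first, ties keep document order.
    candidates.sort(key=lambda pair: -pair[0])
    room = max(0, max_sentences - len(evidence))
    return evidence + [sent for _, sent in candidates[:room]]
```

Keeping only a bounded number of sentences, rather than appending whole documents, is what keeps the context compact across turns in this sketch.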
Why it matters
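Iterative RAG pipelines often answer multi-hop questions from incomplete evidence chains, or accumulate redundant and distractor-heavy text that interferes with later retrieval and reasoning. S2G-RAG addresses both problems: an explicit judge decides when the evidence is sufficient and what to retrieve next, and a sentence-level Evidence Context limits noise accumulation. Because it requires no changes to the search engine and no retraining of the generator, it can be added to existing RAG pipelines as a lightweight component.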
Original Abstract
Retrieval-Augmented Generation (RAG) grounds language models in external evidence, but multi-hop question answering remains difficult because iterative pipelines must control what to retrieve next and when the available evidence is adequate. In practice, systems may answer from incomplete evidence chains, or they may accumulate redundant or distractor-heavy text that interferes with later retrieval and reasoning. We propose S2G-RAG (Structured Sufficiency and Gap-judging RAG), an iterative framework with an explicit controller, S2G-Judge. At each turn, S2G-Judge predicts whether the current evidence memory supports answering and, if not, outputs structured gap items that describe the missing information. These gap items are then mapped into the next retrieval query, producing stable multi-turn retrieval trajectories. To reduce noise accumulation, S2G-RAG maintains a sentence-level Evidence Context by extracting a compact set of relevant sentences from retrieved documents. Experiments on TriviaQA, HotpotQA, and 2WikiMultiHopQA show that S2G-RAG improves multi-hop QA performance and robustness under multi-turn retrieval. Furthermore, S2G-RAG can be integrated into existing RAG pipelines as a lightweight component, without modifying the search engine or retraining the generator.
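The control loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the `GapItem` fields (`entity`, `relation`), the `Judgment` type, the concatenation-based `gaps_to_query` mapping, the turn order, and the `max_turns` cap are all hypothetical, since the abstract does not specify the gap-item schema or the query-mapping rule. The judge, retriever, extractor, and generator are passed in as plain callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class GapItem:
    # Hypothetical schema: what is missing, about which entity.
    entity: str
    relation: str


@dataclass
class Judgment:
    sufficient: bool
    gaps: List[GapItem] = field(default_factory=list)


def gaps_to_query(question: str, gaps: List[GapItem]) -> str:
    # Assumed mapping: concatenate gap items; fall back to the question.
    if not gaps:
        return question
    return " ".join(f"{g.relation} {g.entity}" for g in gaps)


def iterative_rag(
    question: str,
    judge: Callable[[str, List[str]], Judgment],
    retrieve: Callable[[str], List[str]],
    extract: Callable[[List[str], List[str], str], List[str]],
    generate: Callable[[str, List[str]], str],
    max_turns: int = 3,
) -> Tuple[str, List[str]]:
    evidence: List[str] = []
    query = question
    for _ in range(max_turns):
        docs = retrieve(query)                     # search engine is untouched
        evidence = extract(evidence, docs, query)  # keep compact evidence
        verdict = judge(question, evidence)        # sufficiency + gap items
        if verdict.sufficient:
            break
        query = gaps_to_query(question, verdict.gaps)  # next retrieval query
    return generate(question, evidence), evidence
```

In this sketch the judge only steers retrieval and stopping; the generator is called once at the end, which mirrors the claim that the generator needs no retraining.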