NaviRAG: Towards Active Knowledge Navigation for Retrieval-Augmented Generation
Jihao Dai, Dingjun Wu, Yuxuan Chen, Zheni Zeng, Yukun Yan + 2 more
TLDR
NaviRAG enhances RAG by actively navigating hierarchical knowledge with an LLM agent, improving multi-granular retrieval and QA performance.
Key contributions
- Introduces NaviRAG, an active knowledge navigation framework for RAG.
- Structures knowledge into a hierarchical form, preserving semantic relationships.
- Uses an LLM agent that iteratively identifies information gaps and retrieves content at the most appropriate granularity level.
- Consistently improves retrieval recall and end-to-end QA over RAG baselines.
Why it matters
Conventional flat RAG struggles with complex tasks that require conditional retrieval and synthesis of information across levels of granularity. NaviRAG addresses this by letting an LLM agent actively navigate hierarchically organized knowledge. The authors report that this improves long-document QA by localizing evidence at the appropriate granularity, which they present as a step toward more intelligent and autonomous RAG systems.
Original Abstract
Retrieval-augmented generation (RAG) typically relies on a flat retrieval paradigm that maps queries directly to static, isolated text segments. This approach struggles with more complex tasks that require the conditional retrieval and dynamic synthesis of information across different levels of granularity (e.g., from broad concepts to specific evidence). To bridge this gap, we introduce NaviRAG, a novel framework that shifts from passive segment retrieval to active knowledge navigation. NaviRAG first structures the knowledge documents into a hierarchical form, preserving semantic relationships from coarse-grained topics to fine-grained details. Leveraging these reorganized knowledge records, a large language model (LLM) agent actively navigates the records, iteratively identifying information gaps and retrieving relevant content from the most appropriate granularity level. Extensive experiments on long-document QA benchmarks show that NaviRAG consistently improves both retrieval recall and end-to-end answer performance over conventional RAG baselines. Ablation studies confirm that the performance gains stem from our method's capacity for multi-granular evidence localization and dynamic retrieval planning. We further discuss the efficiency, applicable scenarios, and future directions of our method, hoping to make RAG systems more intelligent and autonomous.
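To make the idea of navigating a coarse-to-fine hierarchy concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `Node` class, the `overlap` scorer, and the `navigate` function are hypothetical names introduced for illustration, and a toy word-overlap score stands in for the LLM agent's judgment. The sketch only shows a greedy top-down descent that stops at the level where no finer-grained child looks relevant; it does not model the iterative gap identification the abstract describes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical unit of the hierarchy: a summary plus finer-grained children."""
    title: str
    summary: str
    children: list["Node"] = field(default_factory=list)


def overlap(query: str, text: str) -> int:
    """Toy relevance score (assumption): count of distinct query words in the text.
    A real system would ask an LLM or an embedding model instead."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def navigate(root: Node, query: str, max_steps: int = 10) -> list[Node]:
    """Greedily descend from coarse to fine, returning the path of visited nodes."""
    path = [root]
    current = root
    for _ in range(max_steps):
        if not current.children:
            break  # reached the finest granularity available
        best = max(current.children,
                   key=lambda c: overlap(query, c.title + " " + c.summary))
        if overlap(query, best.title + " " + best.summary) == 0:
            break  # no child looks relevant, so stay at the current level
        path.append(best)
        current = best
    return path


# Tiny made-up document tree, used only to exercise the sketch.
root = Node("Report", "annual report overview", [
    Node("Finance", "revenue and costs for the year", [
        Node("Revenue", "revenue grew in the last quarter"),
        Node("Costs", "operating costs fell slightly"),
    ]),
    Node("Staff", "hiring and retention of employees"),
])

path = navigate(root, "how much did revenue grow")
print(" > ".join(node.title for node in path))
```

The returned path records every granularity level visited, so a caller could pass either the leaf alone or the whole chain of summaries to the generator, depending on how much context the question needs.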