Bayesian Active Learning with Gaussian Processes Guided by LLM Relevance Scoring for Dense Passage Retrieval
Junyoung Kim, Anton Korikov, Jiazhou Liang, Justin Cui, Yifan Simon Liu + 3 more
TLDR
BAGEL uses Bayesian active learning with Gaussian Processes, guided by LLM relevance scores, to efficiently explore the embedding space and retrieve relevant passages, outperforming LLM reranking under the same budget.
Key contributions
- Proposes BAGEL, a novel framework for budget-constrained dense passage retrieval guided by LLM relevance scoring.
- Models multimodal relevance distribution across embedding space with a query-specific Gaussian Process.
- Iteratively selects passages, balancing exploitation of high-confidence regions with exploration of uncertain areas.
- Outperforms LLM reranking methods on four benchmark datasets under the same LLM budget.
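The core loop described above — fit a query-specific GP on the passages scored so far, then pick the next passage by trading off predicted relevance against uncertainty — can be sketched in a few lines. This is a generic GP-regression + UCB-acquisition sketch, not the paper's implementation; the RBF kernel, noise level, and `beta` exploration weight are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel between rows (passage embeddings) of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(X_scored, y, X_pool, noise=1e-2, lengthscale=1.0):
    # GP regression: posterior mean/std of relevance over unscored passages,
    # conditioned on the LLM relevance scores y observed at X_scored.
    K = rbf_kernel(X_scored, X_scored, lengthscale) + noise * np.eye(len(X_scored))
    Ks = rbf_kernel(X_pool, X_scored, lengthscale)
    K_inv = np.linalg.inv(K)
    mu = Ks @ K_inv @ y
    # Prior variance of the RBF kernel is 1 at every point.
    var = 1.0 - np.einsum("ij,jk,ik->i", Ks, K_inv, Ks)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def select_next(X_scored, y, X_pool, beta=2.0):
    # UCB acquisition: exploit high predicted relevance (mu) while
    # exploring uncertain regions of the embedding space (sigma).
    mu, sigma = gp_posterior(X_scored, y, X_pool)
    return int(np.argmax(mu + beta * sigma))
```

With a larger `beta`, selection favors semantically distinct, unexplored clusters; with `beta = 0` it degenerates into greedy exploitation of the current relevance estimate.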
Why it matters
LLMs are excellent zero-shot relevance judges but costly to query at scale. This paper introduces an efficient active learning approach that leverages sparse LLM relevance signals without breaking the bank. By intelligently exploring the embedding space instead of passively relying on a first-stage dense retriever, it makes LLM-powered retrieval more practical and effective.
Original Abstract
While Large Language Models (LLMs) exhibit exceptional zero-shot relevance modeling, their high computational cost necessitates framing passage retrieval as a budget-constrained global optimization problem. Existing approaches passively rely on first-stage dense retrievers, which leads to two limitations: (1) failing to retrieve relevant passages in semantically distinct clusters, and (2) failing to propagate relevance signals to the broader corpus. To address these limitations, we propose Bayesian Active Learning with Gaussian Processes guided by LLM relevance scoring (BAGEL), a novel framework that propagates sparse LLM relevance signals across the embedding space to guide global exploration. BAGEL models the multimodal relevance distribution across the entire embedding space with a query-specific Gaussian Process (GP) based on LLM relevance scores. Subsequently, it iteratively selects passages for scoring by strategically balancing the exploitation of high-confidence regions with the exploration of uncertain areas. Extensive experiments across four benchmark datasets and two LLM backbones demonstrate that BAGEL effectively explores and captures complex relevance distributions and outperforms LLM reranking methods under the same LLM budget on all four datasets.