When More Reformulations Hurt: Avoiding Drift using Ranker Feedback
V Venktesh, Mandeep Rathee, Avishek Anand
TLDR
ReformIR is a budget-aware retrieval framework that uses a teacher reranker to adaptively select query reformulations and documents, improving recall while avoiding drift.
Key contributions
- Introduces ReformIR, a budget-aware framework for adaptive query reformulation selection.
- Uses a strong reranker as a teacher to estimate document utility from reformulation-specific signals.
- Employs a lightweight surrogate model to prioritize reformulations and documents under a fixed reranking budget.
- Improves recall and suppresses drift, outperforming prior methods, especially as the number of reformulations grows.
Why it matters
Modern retrieval pipelines face a tradeoff between recall and query drift when using many reformulations. This paper offers a solution: adaptively selecting reformulations and documents under a fixed reranking budget. It also suggests a shift in system design: the capacity of LLMs is better spent on feedback-driven reformulation than on reranking alone.
Original Abstract
Modern retrieval pipelines increasingly rely on query reformulation and neural reranking to improve effectiveness, but this comes at a significant computational cost and introduces a fundamental tradeoff between recall and query drift. Generating many reformulated queries can substantially increase recall, yet naively merging or exhaustively reranking their results is prohibitively expensive. In this work, we argue that the core challenge is not reformulation generation itself, but the adaptive selection of reformulations and their retrieved documents under a strict inference budget. We propose ReformIR, a budget-aware retrieval framework that treats query reformulations as first-class features and performs online relevance estimation using a strong reranker as a teacher. Given multiple reformulated queries, ReformIR constructs a large candidate pool and learns a lightweight surrogate model that estimates document utility from reformulation-specific retrieval signals. Under a fixed reranking budget, the surrogate adaptively prioritizes both reformulations and documents, selectively querying a teacher reranker anchored to the original query. This process increases recall while actively suppressing drift through online feature selection over reformulations. We conduct extensive experiments on the MSMARCO passage corpora and TREC Deep Learning benchmarks (DL19-DL22). Our results show that ReformIR consistently outperforms existing reformulation strategies, particularly as the number of reformulations increases, where prior methods suffer from severe quality degradation due to drift. Our findings also suggest a shift in retrieval system design: rather than using large language models as rerankers, their capacity is more effectively leveraged in the reformulation stage with feedback-driven optimization.
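To make the selection loop concrete, here is a minimal Python sketch of a budget-aware surrogate-plus-teacher procedure in the spirit of the abstract. This is a hypothetical illustration, not the authors' implementation: the surrogate is simplified to a linear model over per-reformulation retrieval signals, updated online from each teacher call, and all names (`budget_aware_select`, `teacher_score`) are invented for this example.

```python
def budget_aware_select(candidates, teacher_score, budget, lr=0.1):
    """Hypothetical sketch of budget-aware selection with teacher feedback.

    candidates: list of (doc_id, features), where `features` holds one
        reformulation-specific retrieval signal per reformulated query
        (e.g. that query's retrieval score for the doc, 0.0 if absent).
    teacher_score: callable doc_id -> float; the expensive teacher
        reranker, anchored to the original query.
    budget: maximum number of teacher reranker calls.
    """
    n_feats = len(candidates[0][1])
    # Surrogate: uniform linear weights over reformulations to start.
    w = [1.0 / n_feats] * n_feats
    scored = {}  # doc_id -> teacher score

    for _ in range(budget):
        remaining = [c for c in candidates if c[0] not in scored]
        if not remaining:
            break

        # Surrogate utility estimate; pick the most promising candidate.
        def utility(item):
            _, feats = item
            return sum(wi * fi for wi, fi in zip(w, feats))

        doc_id, feats = max(remaining, key=utility)

        # Spend one unit of budget on the teacher reranker.
        y = teacher_score(doc_id)
        scored[doc_id] = y

        # Online update: upweight reformulations whose signals agree with
        # the teacher, downweighting drifting reformulations.
        pred = sum(wi * fi for wi, fi in zip(w, feats))
        err = y - pred
        w = [wi + lr * err * fi for wi, fi in zip(w, feats)]

    # Rank the teacher-scored documents.
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
```

The key design point mirrored here is that the teacher is queried selectively rather than exhaustively, and each teacher call doubles as a training signal that reweights the reformulations themselves, which is how drift suppression emerges from online feature selection.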