RAQG-QPP: Query Performance Prediction with Retrieved Query Variants and Retrieval Augmented Query Generation
Fangzheng Tian, Debasis Ganguly, Craig Macdonald
TLDR
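RAQG-QPP improves Query Performance Prediction (QPP) for neural rankers by using query variants retrieved from a log of past queries and further variants generated by an LLM, outperforming the best existing query-variant-based predictor by up to 30% on TREC DL'19 and DL'20.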
Key contributions
- Proposes using retrieved queries from past logs as more coherent query variants (QVs) for QPP.
- Introduces Retrieval Augmented Query Generation (RAQG) to generate novel QVs using LLMs conditioned on retrieved QVs.
- Improves QPP accuracy by up to 30% over the best-performing existing QV-based method on neural rankers such as MonoT5 (TREC DL'19 and DL'20).
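As a rough illustration of the first contribution, the sketch below retrieves query variants from a log and blends a base QPP score for the query with the scores of its variants. It is a minimal sketch under stated assumptions, not the paper's method: the Jaccard token overlap used for retrieval, the linear interpolation with weight `lam`, and the names `retrieve_variants`, `qv_smoothed_qpp`, and `base_predictor` are all hypothetical choices made for this example.

```python
from typing import Callable, List, Tuple


def tokenize(text: str) -> set:
    """Lowercase whitespace tokenization (an assumption for this sketch)."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two queries, in [0, 1]."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve_variants(
    query: str, query_log: List[str], k: int = 3
) -> List[Tuple[str, float]]:
    """Return up to k logged queries most similar to `query`, with similarities."""
    scored = [(q, jaccard(query, q)) for q in query_log if q != query]
    scored = [(q, s) for q, s in scored if s > 0.0]
    # sorted/sort is stable, so ties keep their order from the log.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def qv_smoothed_qpp(
    query: str,
    variants: List[Tuple[str, float]],
    base_predictor: Callable[[str], float],
    lam: float = 0.5,
) -> float:
    """Interpolate the query's own QPP score with a similarity-weighted
    average of its variants' scores (hypothetical combination rule)."""
    own = base_predictor(query)
    total_weight = sum(sim for _, sim in variants)
    if total_weight == 0:
        return own  # no usable variants: fall back to the base estimate
    variant_avg = sum(sim * base_predictor(q) for q, sim in variants) / total_weight
    return lam * own + (1 - lam) * variant_avg
```

Here `base_predictor` stands in for any unsupervised QPP estimator that maps a query to a score. In practice a stronger retriever than token overlap would likely be used to find variants.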
Why it matters
Unsupervised QPP works well for lexical retrieval models but tends to be weaker for neural rankers, and predictions of retrieval quality feed query-specific decisions that improve overall search effectiveness. By replacing term-expansion variants, which are prone to being incoherent or off-topic, with variants retrieved from a query log and generated by an LLM, the paper makes QV-based unsupervised QPP more accurate for neural rankers.
Original Abstract
Query Performance Prediction (QPP) estimates the retrieval quality of ranking models without the use of any human-assessed relevance judgements, and finds applications in query-specific selective decision making to improve overall retrieval effectiveness. Although unsupervised QPP approaches are effective for lexical retrieval models, they usually perform weaker for neural rankers. Recent work shows that leveraging query variants (QVs), i.e., queries with potentially similar information needs to a given query, can enhance unsupervised QPP accuracy. However, existing QV-based prediction methods rely on query variants generated by term expansion of the input query, which is likely to yield incoherent, hallucinatory and off-topic QVs. In this paper, we propose to make use of queries retrieved from a log of past queries as QVs to be subsequently used for QPP. In addition to directly applying retrieved QVs in QPP, we further propose to leverage large language models (LLMs) to generate QVs conditioned on the retrieved QVs, thus mitigating the limitation of relying only on existing queries in a log. Experiments on TREC DL'19 and DL'20 show that QPP enhanced with RAQG outperforms the best-performing existing QV-based prediction approach by as much as 30% on neural ranking models such as MonoT5.
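The generation step described in the abstract, where an LLM produces new QVs conditioned on retrieved ones, might be sketched as follows. This is only an illustration under assumptions: the prompt wording, the function names `build_raqg_prompt` and `generate_variants`, and the `llm` callable (any function mapping a prompt string to generated text) are hypothetical and not taken from the paper.

```python
from typing import Callable, List


def build_raqg_prompt(query: str, retrieved_variants: List[str], n_new: int = 3) -> str:
    """Build a prompt that conditions generation on retrieved query variants.
    The wording is an assumption made for this sketch."""
    examples = "\n".join(f"- {v}" for v in retrieved_variants)
    return (
        f"Original query: {query}\n"
        f"Related queries from the log:\n{examples}\n"
        f"Write {n_new} new queries expressing the same information need, "
        f"one per line."
    )


def generate_variants(
    query: str,
    retrieved_variants: List[str],
    llm: Callable[[str], str],
    n_new: int = 3,
) -> List[str]:
    """Call the (hypothetical) LLM and keep only novel, non-empty variants."""
    raw = llm(build_raqg_prompt(query, retrieved_variants, n_new))
    # Anything already known (the query itself or a retrieved variant) is not novel.
    seen = {query.strip().lower()} | {v.strip().lower() for v in retrieved_variants}
    novel: List[str] = []
    for line in raw.splitlines():
        candidate = line.strip().lstrip("-* ").strip()  # drop list bullets
        key = candidate.lower()
        if candidate and key not in seen:
            seen.add(key)
            novel.append(candidate)
    return novel[:n_new]
```

Filtering out queries that already appear in the log reflects the stated goal of going beyond existing logged queries. The generated variants could then be scored alongside the retrieved ones by whichever QV-based predictor is in use.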