Dual-View Training for Instruction-Following Information Retrieval
Qingcheng Zeng, Puxuan Yu, Aman Mehta, Fuheng Zhao, Rajhans Samdani
TLDR
This paper introduces a dual-view data synthesis method based on polarity reversal for training instruction-following information retrieval systems, improving a 305M-parameter encoder's FollowIR performance by 45%.
Key contributions
- Introduces dual-view data synthesis with polarity reversal for instruction-following information retrieval.
- Uses LLMs to generate complementary instructions, forcing retrievers to learn instruction sensitivity.
- Improves performance by 45% on the FollowIR benchmark, outperforming larger general-purpose embedding models.
- Shows data diversity preserves general retrieval quality, while instruction supervision boosts instruction awareness.
Why it matters
Current retrieval systems struggle with explicit user instructions. This paper offers a novel data synthesis approach that significantly enhances a retriever's ability to follow complex instructions. It highlights the importance of targeted data generation for building more intelligent and user-aware IR systems.
Original Abstract
Instruction-following information retrieval (IF-IR) studies retrieval systems that must not only find documents relevant to a query, but also obey explicit user constraints such as required attributes, exclusions, or output preferences. However, most retrievers are trained primarily for semantic relevance and often fail to distinguish documents that match the topic from those that satisfy the instruction. We propose a dual-view data synthesis strategy based on polarity reversal: given a query, a document that is relevant under the instruction, and a hard negative that matches the query but violates the instruction, we prompt an LLM to generate a complementary instruction under which the two documents swap relevance labels. By presenting the same document pair under complementary instructions that invert their relevance labels, the training signal forces the retriever to reconsider the same candidate set through the instruction, rather than relying on fixed topical cues. On a 305M-parameter encoder, our method improves performance on the FollowIR benchmark by 45%, surpassing general-purpose embedding models of comparable or larger scale. Through head-to-head comparisons at matched data budgets, we further show that data diversity and instruction supervision play complementary roles: the former preserves general retrieval quality, while the latter improves instruction sensitivity. These results highlight the value of targeted data synthesis for building retrieval systems that are both broadly capable and instruction-aware.
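The abstract's synthesis recipe (take a query, an instruction-relevant document, and a topically matching hard negative; ask an LLM for a complementary instruction under which the two documents swap labels) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function and parameter names (`make_dual_view_pair`, `llm_generate`, the prompt wording) are assumptions for the sake of the example.

```python
def make_dual_view_pair(query, instruction, pos_doc, hard_neg, llm_generate):
    """Build two training examples that show the same document pair
    under complementary instructions with swapped relevance labels.

    `llm_generate` is a hypothetical callable: prompt string -> instruction string.
    """
    # View 1: under the original instruction, pos_doc is relevant
    # and hard_neg (topical match, instruction violation) is not.
    view_1 = {
        "query": query,
        "instruction": instruction,
        "positive": pos_doc,
        "negative": hard_neg,
    }

    # Prompt an LLM for a complementary instruction that reverses polarity:
    # the former hard negative should become the relevant document.
    prompt = (
        f"Query: {query}\n"
        f"Document A (relevant under the current instruction): {pos_doc}\n"
        f"Document B (on-topic but violating the instruction): {hard_neg}\n"
        "Write a new instruction under which Document B is relevant "
        "and Document A is not."
    )
    reversed_instruction = llm_generate(prompt)

    # View 2: under the reversed instruction, the relevance labels swap,
    # so the retriever cannot rely on fixed topical cues for this pair.
    view_2 = {
        "query": query,
        "instruction": reversed_instruction,
        "positive": hard_neg,
        "negative": pos_doc,
    }
    return [view_1, view_2]
```

Training on both views of the same pair is what forces the model to read the instruction: topical similarity alone cannot separate positives from negatives when the labels flip with the instruction.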