ArXiv TLDR

Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA

2604.09019

Andre Bacellar

cs.IR, cs.AI, cs.CL, cs.LG

TLDR

A theory of retrieval regimes in two-hop QA, plus a transferable router, RegimeRouter, that improves R@5 by +5.6 pp on 2WikiMultiHopQA and by +5.3 pp zero-shot on MuSiQue.

Key contributions

  • Formalizes two-hop QA retrieval into two regimes (Q-dominant, B-dominant) with three key theorems.
  • Identifies surface-text predicates that characterize retrieval regimes, crucial for effective routing.
  • Introduces RegimeRouter, a lightweight binary router using five text features for regime selection.
  • RegimeRouter improves R@5 by +5.6 pp on 2WikiMultiHopQA (5-fold cross-fitted) and by +5.3 pp zero-shot on MuSiQue; the +1.1 pp gain on HotpotQA is not significant.
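
The routing idea in the bullets above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the five features in `toy_features`, the linear scoring rule, and the names `LinearRouter` and `build_query` are all hypothetical stand-ins, since the summary does not list the actual features or model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class LinearRouter:
    """Hypothetical binary router: a linear score over text features."""
    weights: Sequence[float]
    bias: float = 0.0
    threshold: float = 0.0

    def use_bridge(self, features: Sequence[float]) -> bool:
        """Return True to retrieve with question + relation sentence."""
        if len(features) != len(self.weights):
            raise ValueError("feature/weight length mismatch")
        score = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return score > self.threshold


def toy_features(question: str) -> list[float]:
    """Five placeholder surface features (assumed, NOT the paper's)."""
    tokens = question.split()
    lowered = question.lower()
    # Capitalized tokens after the first word: a crude proxy for named entities.
    capitalized = sum(1 for t in tokens[1:] if t[:1].isupper())
    return [
        float(len(tokens)),
        float(capitalized),
        float("of the" in lowered),
        float(lowered.count(" who ") + lowered.count(" whose ")),
        float("'s" in question),
    ]


def build_query(
    question: str,
    relation_sentence: str,
    router: LinearRouter,
    featurize: Callable[[str], Sequence[float]],
) -> str:
    """Pick question-only or question-plus-relation-sentence retrieval."""
    if router.use_bridge(featurize(question)):
        return f"{question} {relation_sentence}"
    return question
```

In this toy setup, a question with no named entity after the first word and an "of the" chain is routed to the bridge-augmented query, while a question that names its entities is sent as-is. The weights would in practice be fitted on labeled data rather than set by hand.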

Why it matters

This paper formalizes two-hop QA retrieval into two distinct regimes and turns that theory into a lightweight router. Trained only on 2WikiMultiHopQA, RegimeRouter transfers zero-shot to MuSiQue (+5.3 pp R@5) and does not hurt HotpotQA (+1.1 pp, non-significant).

Original Abstract

Two-hop QA retrieval splits queries into two regimes determined by whether the hop-2 entity is explicitly named in the question (Q-dominant) or only in the bridge passage (B-dominant). We formalize this split with three theorems: (T1) per-query AUC is a monotone function of the cosine separation margin, with R^2 >= 0.90 for six of eight type-encoder pairs; (T2) regime is characterized by two surface-text predicates, with P1 decisive for routing and P2 qualifying the B-dominant case, holding across three encoders and three datasets; and (T3) bridge advantage requires the relation-bearing sentence, not entity name alone, with removal causing an 8.6-14.1 pp performance drop (p < 0.001). Building on this theory, we propose RegimeRouter, a lightweight binary router that selects between question-only and question-plus-relation-sentence retrieval using five text features derived directly from the predicate definitions. Trained on 2WikiMultiHopQA (n = 881, 5-fold cross-fitted) and applied zero-shot to MuSiQue and HotpotQA, RegimeRouter achieves +5.6 pp (p < 0.001), +5.3 pp (p = 0.002), and +1.1 pp (non-significant, no-regret) R@5 improvement, respectively, with artifact-driven.
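
For readers unfamiliar with the metric, the sketch below shows how an R@5 gain in percentage points could be computed. It assumes R@5 is the share of a query's gold passages that appear among the top five retrieved results, which is one common definition; the paper may define it differently. The function names are illustrative.

```python
from typing import Iterable, Sequence


def recall_at_k(ranked_ids: Sequence[str], gold_ids: Iterable[str], k: int = 5) -> float:
    """Fraction of gold passages found in the top-k ranked results."""
    gold = set(gold_ids)
    if not gold:
        raise ValueError("gold_ids must not be empty")
    hits = gold & set(ranked_ids[:k])
    return len(hits) / len(gold)


def mean_recall_at_k(runs: Sequence[tuple[Sequence[str], Iterable[str]]], k: int = 5) -> float:
    """Average recall@k over (ranked_ids, gold_ids) pairs, one per query."""
    if not runs:
        raise ValueError("runs must not be empty")
    return sum(recall_at_k(ranked, gold, k) for ranked, gold in runs) / len(runs)


def gain_pp(baseline: float, routed: float) -> float:
    """Absolute difference between two recall values, in percentage points."""
    return 100.0 * (routed - baseline)
```

Under this definition, a reported "+5.6 pp" means the routed system's mean R@5 is 5.6 percentage points higher in absolute terms than the baseline's, not 5.6 percent higher in relative terms.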
