Caraman at SemEval-2026 Task 8: Three-Stage Multi-Turn Retrieval with Query Rewriting, Hybrid Search, and Cross-Encoder Reranking
David-Maximilian Caraman, Gheorghe Cosmin Silaghi
TLDR
This paper presents a three-stage multi-turn retrieval system using query rewriting, hybrid search, and cross-encoder reranking for SemEval-2026 Task 8.
Key contributions
- Develops a three-stage multi-turn retrieval system for SemEval-2026 Task 8 (MTRAGEval), Task A (Retrieval).
- Uses a LoRA-fine-tuned Qwen 2.5 7B model to rewrite context-dependent follow-up questions into standalone queries.
- Combines BM25 and dense retrieval through Reciprocal Rank Fusion (RRF), then reranks with the BGE-reranker-v2-m3 cross-encoder.
- Achieves nDCG@5 of 0.531 on the official test set, ranking 8th out of 38 systems and 10.7% above the organizer baseline.
Why it matters
The system ranks 8th out of 38 in Task A of SemEval-2026 Task 8 using a three-stage pipeline of query rewriting, hybrid search, and cross-encoder reranking. Its development comparisons show that domain-specific temperature tuning for query generation gives consistent gains, while more complex strategies such as domain-aware prompting and multi-query expansion degrade performance. Both findings are useful guidance for future multi-turn retrieval systems.
Original Abstract
We describe our system for SemEval-2026 Task 8 (MTRAGEval), participating in Task A (Retrieval) across four English-language domains. Our approach employs a three-stage pipeline: (1) query rewriting via a LoRA-fine-tuned Qwen 2.5 7B model that transforms context-dependent follow-up questions into standalone queries, (2) hybrid BM25 and dense retrieval combined through Reciprocal Rank Fusion, and (3) cross-encoder reranking with BGE-reranker-v2-m3. On the official test set, the system achieves nDCG@5 of 0.531, ranking 8th out of 38 participating systems and 10.7% above the organizer baseline. Development comparisons reveal that domain-specific temperature tuning for query generation, where technical domains benefit from deterministic decoding and general domains from controlled randomness, provides consistent gains, while more complex strategies such as domain-aware prompting and multi-query expansion degrade performance.
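The fusion step in stage (2) can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not the authors' code: the function name `reciprocal_rank_fusion`, the document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions made for illustration, and the abstract does not state which value the system uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed default, not a value
    taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical top-3 results from a sparse (BM25) and a dense retriever.
bm25_hits = ["d1", "d2", "d3"]
dense_hits = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Because RRF uses only rank positions, the BM25 and dense scores never need to be normalized to a common scale. In a pipeline like the one described, the fused list would then be passed to the cross-encoder for reranking.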