Efficient Listwise Reranking with Compressed Document Representations
Hervé Déjean, Stéphane Clinchant
TLDR
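RRK is a listwise reranker that compresses each document into a fixed-size, multi-token embedding representation. The 8B-parameter model runs 3x-18x faster than smaller rerankers (0.6-4B parameters) while matching or outperforming them in effectiveness.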
Key contributions
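- Introduces RRK, a listwise reranker that operates on compressed document representations instead of full document text.
- Compresses each document into a multi-token, fixed-size embedding representation, an idea drawn from document compression for retrieval-augmented generation (RAG).
- Trains the model with a simple distillation procedure.
- Reports that the 8B-parameter model runs 3x-18x faster than smaller rerankers (0.6-4B parameters) while matching or outperforming them in effectiveness.
- Shows that the efficiency advantage widens further on long-document benchmarks.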
Why it matters
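Reranking the output of a first-stage retriever with LLMs is computationally expensive, and the usual remedies are to use a smaller LLM or to limit input length. RRK takes a different route: it compresses each document into a fixed-size embedding representation before listwise reranking. With this design the 8B-parameter model runs 3x-18x faster than rerankers with 0.6-4B parameters without losing effectiveness, and the gap grows on long documents. That makes LLM-based reranking more practical to deploy in retrieval systems, particularly when documents are long.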
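One way to see why fixed-size compression helps more on long documents is to count the positions a listwise reranker has to process. The sketch below is a back-of-the-envelope illustration only: the function names, the query length, the candidate count, and the number of embeddings per document are all hypothetical values chosen for the example, not figures from the paper. It also ignores the cost of producing the compressed representations.

```python
def raw_text_input_length(num_docs, tokens_per_doc, query_tokens=32):
    """Positions processed when every candidate is passed as raw text."""
    return query_tokens + num_docs * tokens_per_doc


def compressed_input_length(num_docs, embeddings_per_doc, query_tokens=32):
    """Positions processed when every candidate is a fixed number of embeddings."""
    return query_tokens + num_docs * embeddings_per_doc


NUM_DOCS = 20            # hypothetical number of candidates per list
EMBEDDINGS_PER_DOC = 8   # hypothetical compressed size, not from the paper

for doc_len in (128, 1024):  # short vs. long documents, in tokens
    raw = raw_text_input_length(NUM_DOCS, doc_len)
    compressed = compressed_input_length(NUM_DOCS, EMBEDDINGS_PER_DOC)
    print(f"doc_len={doc_len}: raw={raw}, compressed={compressed}, "
          f"ratio={raw / compressed:.1f}")
```

Under these assumptions the compressed input length stays constant while the raw-text input grows with document length. The ratio printed here is a ratio of input lengths, not a measured speedup, so it should not be compared directly with the 3x-18x figure reported by the authors.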
Original Abstract
Reranking, the process of refining the output from a first-stage retriever, is often considered computationally expensive, especially when using Large Language Models (LLMs). A common approach to mitigate this cost involves utilizing smaller LLMs or controlling input length. Inspired by recent advances in document compression for retrieval-augmented generation (RAG), we introduce RRK, an efficient and effective listwise reranker compressing documents into multi-token fixed-size embedding representations. Our simple training via distillation shows that this combination of rich compressed representations and listwise reranking yields a highly efficient and effective system. In particular, our 8B-parameter model runs 3x-18x faster than smaller rerankers (0.6-4B parameters) while matching or outperforming them in effectiveness. The efficiency gains are even more striking on long-document benchmarks, where RRK widens its advantage further.