ArXiv TLDR

CORAL: Adaptive Retrieval Loop for Culturally-Aligned Multilingual RAG

🐦 Tweet
2604.25676

Nayeon Lee, Jiwoo Song, Byeongcheol Kang

cs.CLcs.AI

TLDR

CORAL improves multilingual RAG for culturally-grounded queries by adaptively refining retrieval space and queries, boosting accuracy.

Key contributions

  • Introduces CORAL, an adaptive retrieval loop for multilingual RAG to handle cultural queries.
  • Iteratively refines retrieval corpora and query probes based on evidence quality and cultural alignment.
  • Critiques retrieved evidence for both relevance and cultural alignment before answering.
  • Boosts accuracy by up to 3.58% on low-resource languages in cultural QA benchmarks.

Why it matters

This paper addresses a critical gap in multilingual RAG, where cultural nuances are often overlooked. By introducing an adaptive retrieval loop, CORAL ensures that answers are not only accurate but also culturally appropriate. This is crucial for deploying RAG systems globally, especially in diverse linguistic and cultural contexts.

Original Abstract

Multilingual retrieval-augmented generation (mRAG) is often implemented within a fixed retrieval space, typically via query or document translation or multilingual embedding vector representations. However, this approach may be inadequate for culturally grounded queries, in which retrieval-condition misalignment may occur. Even strong retrievers and generators may struggle to produce culturally relevant answers when sourcing evidence from inappropriate linguistic or regional contexts. To this end, we introduce CORAL (COntext-aware Retrieval with Agentic Loop, an adaptive retrieval methodology for mRAG that enables iterative refinement of both the retrieval space (corpora) and the retrieval probe (query) based on the quality of the evidence. The overall process includes: (1) selecting corpora, (2) retrieving documents, (3) critiquing evidence for relevance and cultural alignment, and (4) checking sufficiency. If the retrieved documents are insufficient to answer the query correctly, the system (5) reselects corpora and rewrites the query. Across two cultural QA benchmarks, CORAL achieves up to a 3.58%p accuracy improvement on low-resource languages relative to the strongest baselines.

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.