Beyond Factual Grounding: The Case for Opinion-Aware Retrieval-Augmented Generation

arXiv:2604.12138

Aditya Agrawal, Alwarappan Nakkiran, Darshan Fofadiya, Alex Karlsson, Harsha Aduri

cs.AI, cs.CL, cs.IR

TLDR

Opinion-Aware RAG addresses factual bias in RAG by explicitly handling subjective content, improving retrieval diversity and representing diverse perspectives.

Key contributions

  • Formalizes factual bias in RAG, distinguishing epistemic (factual) from aleatoric (opinion) uncertainty.
  • Introduces Opinion-Aware RAG architecture with LLM-based opinion extraction and entity-linked opinion graphs.
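  • Reports substantial improvements in retrieval diversity on e-commerce seller forum data: +26.8% sentiment diversity, +42.7% entity match rate, and +31.6% author demographic coverage on entity-matched documents.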
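
The entropy framing in the first contribution can be made concrete by measuring how spread out the sentiment labels of a retrieved set are. The sketch below is illustrative only: this summary does not give the paper's exact diversity metric, so Shannon entropy is swapped in as a stand-in, and `sentiment_entropy` and the example labels are hypothetical.

```python
import math
from collections import Counter


def sentiment_entropy(labels):
    """Shannon entropy (in bits) of the sentiment labels in a retrieved set.

    Hypothetical helper: a stand-in for a sentiment diversity score,
    not the metric used in the paper.
    """
    if not labels:
        return 0.0
    counts = Counter(labels)
    total = len(labels)
    # Sum the -p * log2(p) terms over each distinct sentiment label.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# A retrieved set that only surfaces one viewpoint has zero entropy.
homogeneous = ["positive"] * 4
# A retrieved set with several viewpoints has higher entropy.
mixed = ["positive", "negative", "neutral", "positive"]

homogeneous_score = sentiment_entropy(homogeneous)
mixed_score = sentiment_entropy(mixed)
```

Under this reading, a retriever tuned for factual queries would tend toward the low-entropy set, while an opinion-aware retriever would aim to keep the score closer to that of the underlying population of opinions.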

Why it matters

Current RAG systems' factual bias limits their real-world utility and risks echo chambers. This paper offers a critical step towards transparent, accountable AI by enabling RAG to effectively process subjective content. It opens new applications in social media analysis and product reviews.

Original Abstract

RAG systems have transformed how LLMs access external knowledge, but we find that current implementations exhibit a bias toward factual, objective content, as evidenced by existing benchmarks and datasets that prioritize objective retrieval. This factual bias - treating opinions and diverse perspectives as noise rather than information to be synthesized - limits RAG systems in real-world scenarios involving subjective content, from social media discussions to product reviews. Beyond technical limitations, this bias poses risks to transparent and accountable AI: echo chamber effects that amplify dominant viewpoints, systematic underrepresentation of minority voices, and potential opinion manipulation through biased information synthesis. We formalize this limitation through the lens of uncertainty: factual queries involve epistemic uncertainty reducible through evidence, while opinion queries involve aleatoric uncertainty reflecting genuine heterogeneity in human perspectives. This distinction implies that factual RAG should minimize posterior entropy, whereas opinion-aware RAG must preserve it. Building on this theoretical foundation, we present an Opinion-Aware RAG architecture featuring LLM-based opinion extraction, entity-linked opinion graphs, and opinion-enriched document indexing. We evaluate our approach on e-commerce seller forum data, comparing an Opinion-Enriched knowledge base against a traditional baseline. Experiments demonstrate substantial improvements in retrieval diversity: +26.8% sentiment diversity, +42.7% entity match rate, and +31.6% author demographic coverage on entity-matched documents. Our results provide empirical evidence that treating subjectivity as a first-class citizen yields measurably more representative retrieval, a first step toward opinion-aware RAG. Future work includes joint optimization of retrieval and generation for distributional fidelity.
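
To illustrate what an entity-linked opinion graph might look like as a data structure, here is a minimal sketch. It is not the authors' implementation: `Opinion`, `OpinionGraph`, and `retrieve` are hypothetical names, the opinions are hand-written rather than extracted by an LLM, and the "one opinion per sentiment" retrieval rule is an assumption chosen only to show how entity linking can support diverse retrieval.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """One extracted opinion, linked to the entity it is about (hypothetical schema)."""
    author: str
    entity: str
    sentiment: str
    text: str


class OpinionGraph:
    """Toy index from entity name to the opinions that mention it."""

    def __init__(self):
        self._by_entity = defaultdict(list)

    def add(self, opinion):
        # Normalize the entity name so lookups are case-insensitive.
        self._by_entity[opinion.entity.lower()].append(opinion)

    def retrieve(self, entity, per_sentiment=1):
        """Return up to `per_sentiment` opinions for each sentiment seen for `entity`."""
        picked = defaultdict(list)
        for op in self._by_entity.get(entity.lower(), []):
            if len(picked[op.sentiment]) < per_sentiment:
                picked[op.sentiment].append(op)
        return [op for ops in picked.values() for op in ops]


graph = OpinionGraph()
graph.add(Opinion("seller_a", "Shipping fees", "negative", "Fees went up again."))
graph.add(Opinion("seller_b", "Shipping fees", "negative", "Margins are squeezed."))
graph.add(Opinion("seller_c", "Shipping fees", "positive", "Faster delivery is worth it."))

results = graph.retrieve("shipping fees")
```

Here `results` contains one negative and one positive opinion, so the minority viewpoint is retained even though negative opinions dominate the stored data.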