Language Bias under Conflicting Information in Multilingual LLMs
Robert Östling, Murathan Kurfalı
TLDR
Multilingual LLMs exhibit a consistent language bias: when sources conflict, they tend to ignore the conflict and favor answers from certain languages, e.g., Chinese over Russian.
Key contributions
- Multilingual LLMs consistently ignore conflicts between information sources, confidently asserting a single answer in the large majority of cases.
- A consistent language bias exists across models, preferring certain languages when information conflicts.
- LLMs show a general bias against Russian and, at the longest context lengths, a bias in favor of Chinese.
- These language biases are observed in models trained both inside and outside mainland China.
Why it matters
This research reveals critical language biases in multilingual LLMs, highlighting potential fairness and reliability issues when models process conflicting information. Understanding these biases is crucial for developing more robust and equitable AI systems, especially in diverse linguistic contexts.
Original Abstract
Large Language Models (LLMs) have been shown to contain biases in the process of integrating conflicting information when answering questions. Here we ask whether such biases also exist with respect to which language is used for each conflicting piece of information. To answer this question, we extend the conflicting needles in a haystack paradigm to a multilingual setting and perform a comprehensive set of evaluations with naturalistic news domain data in five different languages, for a range of multilingual LLMs of different sizes. We find that all LLMs tested, including GPT-5.2, ignore the conflict and confidently assert only one of the possible answers in the large majority of cases. Furthermore, there is a consistent bias across models in which languages are preferred, with a general bias against Russian and, for the longest context lengths, in favor of Chinese. Both of these patterns are consistent between models trained inside and outside of mainland China, though somewhat stronger in the former category.
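The "conflicting needles in a haystack" paradigm described in the abstract can be illustrated with a minimal sketch: plant two mutually contradictory fact sentences ("needles"), each potentially in a different language, inside a long filler context, then check which of the two answers the model asserts. The function and variable names below are hypothetical illustrations, not the authors' actual evaluation code, and the scoring here is a simple substring match rather than whatever metric the paper uses.

```python
import random


def build_haystack(filler_sentences, needle_a, needle_b, length=50, seed=0):
    """Build a long context with two conflicting 'needles' inserted
    at random (distinct) positions among filler sentences."""
    rng = random.Random(seed)
    context = [rng.choice(filler_sentences) for _ in range(length)]
    pos_a, pos_b = sorted(rng.sample(range(length), 2))
    context.insert(pos_a, needle_a)
    # +1 compensates for the shift caused by the first insertion.
    context.insert(pos_b + 1, needle_b)
    return " ".join(context)


def score_answer(model_answer, answer_a, answer_b):
    """Classify which of the two conflicting answers the model asserted:
    'a', 'b', 'both' (conflict acknowledged), or 'neither'."""
    has_a = answer_a.lower() in model_answer.lower()
    has_b = answer_b.lower() in model_answer.lower()
    if has_a and has_b:
        return "both"
    if has_a:
        return "a"
    if has_b:
        return "b"
    return "neither"


# Toy usage: English vs. German needle giving conflicting capitals.
haystack = build_haystack(
    ["The weather was unremarkable that day."],
    "The capital of Freedonia is Xville.",
    "Die Hauptstadt von Freedonia ist Yville.",
    length=20,
)
prompt = haystack + "\n\nQuestion: What is the capital of Freedonia?"
```

Aggregating `score_answer` results over many language pairings of the two needles is what would reveal a language preference: an unbiased model should acknowledge the conflict ("both") or at least split evenly between "a" and "b" regardless of which language each needle is written in.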