Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs

2604.21751

Joseba Fernandez de Landa, Carla Perez-Almendros, Jose Camacho-Collados

cs.CL, cs.AI, cs.CY

TLDR

LLMs show an unexpected regional preference for Japan on culture-related questions, a bias that first clearly appears after supervised fine-tuning and challenges earlier findings of Western-centricity.

Key contributions

  • Introduces CROQ, a new dataset for evaluating LLM cultural and regional biases.
  • Discovers LLMs show a clear, unexpected preference for Japanese culture, contrary to prior Western-bias findings.
  • Shows high-resource language prompts (e.g., English) yield more diverse cultural outputs.
  • Identifies that this cultural bias emerges primarily after supervised fine-tuning, not during pre-training.

Why it matters

This paper challenges the prevailing view of LLM cultural biases, revealing an unexpected preference for Japanese culture. Understanding when this bias emerges (post-fine-tuning) is crucial for developing more culturally competent AI. It highlights the need for diverse training and careful fine-tuning to mitigate regional inclinations.

Original Abstract

LLMs have been showing limitations when it comes to cultural coverage and competence, and in some cases show regional biases such as amplifying Western and Anglocentric viewpoints. While there have been works analysing the cultural capabilities of LLMs, there has not been specific work on highlighting LLM regional preferences when it comes to cultural-related questions. In this work, we propose a new dataset based on a comprehensive taxonomy of Culture-Related Open Questions (CROQ). The results show that, contrary to previous cultural bias work, LLMs show a clear tendency towards countries such as Japan. Moreover, our results show that when prompting in languages such as English or other high-resource ones, LLMs tend to provide more diverse outputs and show less inclination towards answering questions highlighting countries for which the input language is an official language. Finally, we also investigate at which point of LLM training this cultural bias emerges, with our results suggesting that the first clear signs appear after supervised fine-tuning, and not during pre-training.
