Ideological Bias in LLMs' Economic Causal Reasoning
Donggyu Lee, Hyeok Yun, Jungwon Kim, Junsik Min, Sungwon Park, et al.
TLDR
LLMs exhibit systematic ideological bias in economic causal reasoning, performing better on intervention-oriented predictions than market-oriented ones.
Key contributions
- Extended EconCausal benchmark with 1,056 ideology-contested economic causal instances.
- LLMs are less accurate on ideology-contested economic questions than on non-contested ones.
- 18 of 20 LLMs show higher accuracy when the empirically verified causal sign aligns with intervention-oriented expectations than when it aligns with market-oriented ones.
- Incorrect predictions disproportionately lean intervention-oriented, even with one-shot prompting.
Why it matters
LLMs show a systematic ideological bias towards intervention-oriented economic views, making them less reliable for policy analysis. This bias persists even with prompting, underscoring the critical need for direction-aware evaluation in high-stakes economic and policy settings.
Original Abstract
Do large language models (LLMs) exhibit systematic ideological bias when reasoning about economic causal effects? As LLMs are increasingly used in policy analysis and economic reporting, where directionally correct causal judgments are essential, this question has direct practical stakes. We present a systematic evaluation by extending the EconCausal benchmark with ideology-contested cases - instances where intervention-oriented (pro-government) and market-oriented (pro-market) perspectives predict divergent causal signs. From 10,490 causal triplets (treatment-outcome pairs with empirically verified effect directions) derived from top-tier economics and finance journals, we identify 1,056 ideology-contested instances and evaluate 20 state-of-the-art LLMs on their ability to predict empirically supported causal directions. We find that ideology-contested items are consistently harder than non-contested ones, and that across 18 of 20 models, accuracy is systematically higher when the empirically verified causal sign aligns with intervention-oriented expectations than with market-oriented ones. Moreover, when models err, their incorrect predictions disproportionately lean intervention-oriented, and this directional skew is not eliminated by one-shot in-context prompting. These results highlight that LLMs are not only less accurate on ideologically contested economic questions, but systematically less reliable in one ideological direction than the other, underscoring the need for direction-aware evaluation in high-stakes economic and policy settings.
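The paper's core evaluation compares accuracy on items where the verified causal sign matches intervention-oriented expectations against items where it matches market-oriented ones, and measures which direction errors lean. The split described in the abstract can be sketched as follows; this is a minimal illustration, not the authors' code, and the `Triplet` fields and function name are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Triplet:
    true_sign: int    # empirically verified causal sign (+1 or -1)
    interv_sign: int  # sign predicted by an intervention-oriented view
    pred_sign: int    # sign the evaluated LLM predicted

def direction_conditional_accuracy(items):
    """Accuracy split by whether the verified sign aligns with the
    intervention-oriented expectation, plus the share of errors that
    lean in the intervention-oriented direction."""
    aligned = [t for t in items if t.true_sign == t.interv_sign]
    opposed = [t for t in items if t.true_sign != t.interv_sign]

    def acc(xs):
        return (sum(t.pred_sign == t.true_sign for t in xs) / len(xs)
                if xs else float("nan"))

    errors = [t for t in items if t.pred_sign != t.true_sign]
    interv_errors = sum(t.pred_sign == t.interv_sign for t in errors)
    return {
        "acc_intervention_aligned": acc(aligned),
        "acc_market_aligned": acc(opposed),
        "error_intervention_share": (interv_errors / len(errors)
                                     if errors else float("nan")),
    }
```

On a balanced benchmark, a gap between the two accuracies (as reported for 18 of 20 models) or an `error_intervention_share` above 0.5 would both indicate the directional skew the paper describes.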