ArXiv TLDR

The Memory Curse: How Expanded Recall Erodes Cooperative Intent in LLM Agents

2605.08060

Jiayuan Liu, Tianqin Li, Shiyi Du, Xin Luo, Haoxuan Zeng + 5 more

cs.CL, cs.AI, cs.GT, cs.MA

TLDR

Expanded recall in LLMs can paradoxically degrade cooperation in multi-agent social dilemmas, a phenomenon termed the "memory curse."

Key contributions

  • Expanded recall degrades cooperation in 18/28 LLM-game settings across 7 LLMs and 4 games.
  • Lexical analysis links the breakdown to eroding forward-looking intent rather than rising paranoia; a LoRA adapter fine-tuned on forward-looking traces mitigates the decay.
  • Memory content, not just length, triggers decay; sanitizing memory restores cooperation.
  • Removing explicit Chain-of-Thought reasoning often reduces the collapse, indicating that deliberation amplifies the "memory curse."

Why it matters

This paper challenges the assumption that larger context windows are always beneficial for LLMs, especially in multi-agent settings. It reveals that memory content and reasoning patterns critically influence cooperative behavior. Understanding this "memory curse" is vital for designing more robust and cooperative AI agents.

Original Abstract

Context window expansion is often treated as a straightforward capability upgrade for LLMs, but we find it systematically fails in multi-agent social dilemmas. Across 7 LLMs and 4 games over 500 rounds, expanding accessible history degrades cooperation in 18 of 28 model-game settings, a pattern we term the memory curse. We isolate the underlying mechanism through three analyses. First, lexical analysis of 378,000 reasoning traces associates this breakdown with eroding forward-looking intent rather than rising paranoia. We validate this using targeted fine-tuning as a cognitive probe: a LoRA adapter trained exclusively on forward-looking traces mitigates the decay and transfers zero-shot to distinct games. Second, memory sanitization holds prompt length fixed while replacing visible history with synthetic cooperative records, which restores cooperation substantially, proving the trigger is memory content, not length alone. Finally, ablating explicit Chain-of-Thought reasoning often reduces the collapse, showing that deliberation paradoxically amplifies the memory curse. Together, these results recast memory as an active determinant of multi-agent behavior: longer recall can either destabilize or support cooperation depending on the reasoning patterns it elicits.
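Illustrative sketch (not from the paper)

The memory-sanitization idea in the abstract can be pictured with a small Python sketch. It is not the authors' code: the `Round` record, the "C"/"D" move labels, the `visible_history`, `sanitize`, and `render` helpers, and the prompt line format are all hypothetical, and character count stands in for prompt length. It only shows how the visible history could be swapped for synthetic cooperative records while the amount of recalled text stays the same.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Round:
    """One round of a two-player game (hypothetical record format)."""
    my_move: str      # "C" = cooperate, "D" = defect
    their_move: str


def visible_history(history: List[Round], window: int) -> List[Round]:
    """Return the last `window` rounds the agent is allowed to recall."""
    # Guard window == 0 explicitly: history[-0:] would return the whole list.
    return history[-window:] if window > 0 else []


def sanitize(history: List[Round]) -> List[Round]:
    """Replace every visible round with a synthetic mutual-cooperation record.

    The number of records is unchanged, so only the content differs.
    """
    return [Round("C", "C") for _ in history]


def render(history: List[Round]) -> str:
    """Render the history as prompt text (assumed format, one line per round)."""
    return "\n".join(
        f"Round {i + 1}: you={r.my_move} opponent={r.their_move}"
        for i, r in enumerate(history)
    )


if __name__ == "__main__":
    full = [Round("C", "D"), Round("D", "D"), Round("C", "C"), Round("D", "C")]
    seen = visible_history(full, window=3)
    clean = sanitize(seen)
    print(render(seen))
    print("---")
    print(render(clean))
```

Because every move is a single character, the sanitized rendering has the same character length as the original one, which is the property the abstract relies on to separate memory content from memory length. A real experiment would need to control token count rather than character count.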
