CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and LLM Agents in Social Dilemmas
Emanuel Tewolde, Xiao Zhang, David Guzman Piedrahita, Vincent Conitzer, Zhijing Jin
TLDR
CoopEval benchmarks game-theoretic mechanisms to foster cooperation in LLM agents, finding contracts and mediation most effective in social dilemmas.
Key contributions
- LLM agents, even with strong reasoning, consistently defect in single-shot social dilemmas.
- CoopEval evaluates four cooperation-sustaining mechanisms: repetition, reputation, mediation, and contracts.
- Contracting and third-party mediation are the most effective mechanisms for achieving cooperation between capable LLMs.
- Repetition-induced cooperation significantly deteriorates when co-players vary.
Why it matters
LLM agents that defect in social dilemmas pose a safety concern for multi-agent systems. This paper compares four game-theoretic mechanisms designed to sustain cooperation among such agents and finds that contracts and mediation work best, while cooperation induced by repetition breaks down when co-players vary. These results can inform the design of safer and more reliable multi-agent LLM interactions.
Original Abstract
It is increasingly important that LLM agents interact effectively and safely with other goal-pursuing agents, yet, recent works report the opposite trend: LLMs with stronger reasoning capabilities behave _less_ cooperatively in mixed-motive games such as the prisoner's dilemma and public goods settings. Indeed, our experiments show that recent models -- with or without reasoning enabled -- consistently defect in single-shot social dilemmas. To tackle this safety concern, we present the first comparative study of game-theoretic mechanisms that are designed to enable cooperative outcomes between rational agents _in equilibrium_. Across four social dilemmas testing distinct components of robust cooperation, we evaluate the following mechanisms: (1) repeating the game for many rounds, (2) reputation systems, (3) third-party mediators to delegate decision making to, and (4) contract agreements for outcome-conditional payments between players. Among our findings, we establish that contracting and mediation are most effective in achieving cooperative outcomes between capable LLM models, and that repetition-induced cooperation deteriorates drastically when co-players vary. Moreover, we demonstrate that these cooperation mechanisms become _more effective_ under evolutionary pressures to maximize individual payoffs.
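To make the contract mechanism in item (4) concrete, here is a minimal sketch of how an outcome-conditional payment can change the equilibrium of a one-shot prisoner's dilemma. It is an illustration only, not the paper's implementation: the payoff values, the transfer size, and the helper names `with_contract` and `pure_nash` are all assumptions chosen for the example.

```python
from itertools import product

ACTIONS = ("C", "D")  # cooperate, defect

# Hypothetical prisoner's dilemma payoffs as (row player, column player).
# These numbers are illustrative and are not taken from the paper.
BASE = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def with_contract(payoffs, transfer):
    """Return payoffs under an assumed contract: a lone defector pays
    `transfer` to the player who cooperated."""
    out = dict(payoffs)
    r, c = payoffs[("C", "D")]
    out[("C", "D")] = (r + transfer, c - transfer)
    r, c = payoffs[("D", "C")]
    out[("D", "C")] = (r - transfer, c + transfer)
    return out

def pure_nash(payoffs):
    """List action profiles where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        row_ok = all(payoffs[(a, b)][0] >= payoffs[(x, b)][0] for x in ACTIONS)
        col_ok = all(payoffs[(a, b)][1] >= payoffs[(a, y)][1] for y in ACTIONS)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria

print(pure_nash(BASE))                    # without a contract
print(pure_nash(with_contract(BASE, 3)))  # with a transfer of 3
```

With these assumed numbers, mutual defection is the only pure equilibrium of the base game, and a transfer of 3 makes mutual cooperation the only one. Whether LLM agents actually propose and accept such contracts is the empirical question the benchmark studies.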