ArXiv TLDR

Sustaining Cooperation in Populations Guided by AI: A Folk Theorem for LLMs

arXiv:2605.06525

Jonathan Shaki, Eden Hartman, Sarit Kraus, Yonatan Aumann

cs.GT cs.MA econ.TH

TLDR

This paper proves a folk theorem for LLMs, showing how shared AI guidance can sustain cooperation among agents with misaligned incentives.

Key contributions

  • When multiple LLMs each advise populations of interacting clients, a strategic "meta-game" arises among the LLMs themselves, mediated through their clients.
  • In one-shot games, shared instructions change equilibrium behavior only when the same LLM may influence more than one role in an interaction; cooperation can then emerge, and the effect of client share can be beneficial, harmful, or non-monotone (see the sketch after this list).
  • Proves a "folk theorem for LLMs" in the repeated setting, where observation is indirect and clients cannot identify which LLM advised their opponents.
  • Shows that all feasible and individually rational outcomes can be sustained as ε-equilibria, via new proof techniques beyond the standard folk theorem.
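
To make the one-shot logic concrete, here is a minimal sketch assuming a prisoner's dilemma as the base game, with payoff values chosen purely for illustration (the paper's analysis covers general base games). An LLM whose client share is p sometimes advises both sides of the same interaction, so instructing cooperation can outperform defection once p is large enough:

```python
# Hypothetical sketch of the one-shot "meta-game" between two LLMs, each
# advising a share of a client population that plays a prisoner's dilemma.
# Payoffs (T, R, P, S) are assumed for illustration.

T, R, P, S = 5, 3, 1, 0  # temptation > reward > punishment > sucker

def pd(a, b):
    """Row player's payoff in the prisoner's dilemma."""
    return {("C", "C"): R, ("C", "D"): S,
            ("D", "C"): T, ("D", "D"): P}[(a, b)]

def expected_client_payoff(my_action, rival_action, my_share):
    """Expected payoff to one of my clients: with probability `my_share`
    the opponent is also my client and follows my instruction; otherwise
    the opponent follows the rival LLM's instruction."""
    return (my_share * pd(my_action, my_action)
            + (1 - my_share) * pd(my_action, rival_action))

# If the rival instructs "C", instructing "C" beats deviating to "D"
# exactly when R > p*P + (1-p)*T, i.e. p > 1/2 with these payoffs.
for p in (0.25, 0.5, 0.75):
    coop = expected_client_payoff("C", "C", p)
    dev = expected_client_payoff("D", "C", p)
    print(f"share={p}: cooperate={coop:.2f}, deviate={dev:.2f}")
```

With these particular payoffs, mutual cooperation between the LLMs is self-enforcing only once each one's client share exceeds 1/2, which illustrates how the effect of client share can depend on the base game.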

Why it matters

As LLMs increasingly mediate interactions among many agents, shared guidance couples agents who appear to act independently. This paper shows that such coupling can foster cooperation even when the agents' incentives conflict, with direct implications for designing robust, cooperative AI-driven ecosystems.

Original Abstract

Large language models (LLMs) are increasingly used to provide instructions to many agents who interact with one another. Such shared reliance couples agents who appear to act independently: they may in fact be guided by a common model. This coupling can change the prospects for cooperation among agents with misaligned incentives. We study settings in which multiple LLMs each advise a population of clients who participate in instances of an underlying game, creating strategic interaction at the level of the LLMs themselves. This induces a meta-game among the LLMs, mediated through clients. We first analyze the one-shot setting, where shared instructions can change equilibrium behavior only when an LLM may influence more than one role in the same interaction; in such cases, cooperation may emerge, and the effect of client share can be beneficial, harmful, or non-monotone, depending on the base game. Our main result concerns the repeated setting. We prove a folk theorem for LLMs: despite indirect observation and the clients' inability to identify which LLM advised their opponents, all feasible and individually rational outcomes can be sustained as $\varepsilon$-equilibria. The result does not follow from the standard folk theorem and requires new proof techniques. Together, these results show that shared LLM guidance can sustain cooperation among populations of agents even when the underlying incentives are misaligned.
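
For readers outside game theory, the $\varepsilon$-equilibrium notion in the main result is the standard one (stated here informally, not in the paper's exact formalism): a strategy profile $\sigma$ is an $\varepsilon$-equilibrium if no player can gain more than $\varepsilon$ by unilaterally deviating:

$$u_i(\sigma_i', \sigma_{-i}) \le u_i(\sigma) + \varepsilon \quad \text{for every player } i \text{ and every deviation } \sigma_i'.$$

The classical folk-theorem mechanism sustains such outcomes with trigger strategies. Below is a minimal sketch of that textbook intuition, assuming prisoner's-dilemma payoffs and a discount factor δ. It is only the baseline idea: the paper stresses that indirect observation and clients' inability to identify their opponents' advisors mean this standard argument does not transfer, which is why new proof techniques are required.

```python
# Textbook grim-trigger intuition behind folk theorems, with assumed
# prisoner's-dilemma payoffs. NOT the paper's construction, which must
# handle populations, anonymous matching, and indirect observation.

T, R, P = 5, 3, 1  # temptation, mutual cooperation, mutual defection

def value_of_cooperating(delta):
    """Receive the cooperation payoff R in every round, discounted."""
    return R / (1 - delta)

def value_of_deviating(delta):
    """Grab T once, then receive the punishment payoff P forever."""
    return T + delta * P / (1 - delta)

# Cooperation is self-enforcing when R/(1-delta) >= T + delta*P/(1-delta),
# i.e. delta >= (T - R)/(T - P) = 0.5 with these payoffs.
for delta in (0.3, 0.5, 0.9):
    print(f"delta={delta}: cooperate={value_of_cooperating(delta):.1f}, "
          f"deviate={value_of_deviating(delta):.1f}")
```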
