ArXiv TLDR

Measuring Successful Cooperation in Human-AI Teamwork: Development and Validation of the Perceived Cooperativity and Teaming Perception Scales

2604.24461

Christiane Attig, Christiane Wiebel-Herboth, Patricia Wollstadt, Tim Schrills, Mourad Zoubir + 1 more

cs.HC cs.AI

TLDR

This paper introduces and validates two new instruments, the Perceived Cooperativity Scale (PCS) and the Teaming Perception Scale (TPS), for reliably measuring subjective cooperation quality in both human-AI and human-human teams.

Key contributions

  • Developed the Perceived Cooperativity Scale (PCS) based on joint activity theory.
  • Developed the Teaming Perception Scale (TPS) grounded in evolutionary cooperation theory.
  • Validated both scales across three studies (N = 409): a cooperative card game, LLM interaction, and a decision-support system.
  • Scales successfully differentiate cooperative quality in human-AI and human-human interactions.

Why it matters

As human-AI collaboration grows, robust tools are essential for evaluating interaction quality. These validated scales provide a critical foundation for empirical research and system design, enabling better assessment of cooperative success. This advances our ability to build more effective human-AI teams.

Original Abstract

As human-AI cooperation becomes increasingly prevalent, reliable instruments for assessing the subjective quality of cooperative human-AI interaction are needed. We introduce two theoretically grounded scales: the Perceived Cooperativity Scale (PCS), grounded in joint activity theory, and the Teaming Perception Scale (TPS), grounded in evolutionary cooperation theory. The PCS captures an agent's perceived cooperative capability and practice within a single interaction sequence; the TPS captures the emergent sense of teaming arising from mutual contribution and support. Both scales were adapted for human-human cooperation to enable cross-agent comparisons. Across three studies (N = 409) encompassing a cooperative card game, LLM interaction, and a decision-support system, analyses of dimensionality, reliability, and validity indicated that both scales successfully differentiated between cooperation partners of varying cooperative quality and showed construct validity in line with expectations. The scales provide a basis for empirical investigation and system evaluation across a wide range of human-AI cooperation contexts.
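The abstract reports analyses of dimensionality, reliability, and validity for both scales. As an illustration of the kind of reliability analysis such scale validation involves, here is a minimal sketch of Cronbach's alpha, a standard internal-consistency statistic for Likert-type scales. The response data below is hypothetical, and the paper may report additional or different reliability coefficients.

```python
import numpy as np

def cronbach_alpha(responses: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) response matrix."""
    responses = np.asarray(responses, dtype=float)
    n_items = responses.shape[1]
    item_vars = responses.var(axis=0, ddof=1)       # per-item sample variances
    total_var = responses.sum(axis=1).var(ddof=1)   # variance of the sum scores
    return (n_items / (n_items - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-point Likert responses of 5 respondents to a 4-item scale
ratings = np.array([
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
])
print(round(cronbach_alpha(ratings), 2))  # → 0.93
```

Values around 0.8 or above are conventionally read as good internal consistency for an established scale.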
