MTI: A Behavior-Based Temperament Profiling System for AI Agents

April 2, 2026 | arXiv:2604.02145

cs.AI, cs.CL

TLDR

MTI is a new behavior-based system for profiling AI agent temperament across four axes: Reactivity, Compliance, Sociality, and Resilience.

Key contributions

Introduces MTI, a behavior-based system measuring AI temperament across Reactivity, Compliance, Sociality, and Resilience.
Uses structured examination protocols to separate AI capability from disposition, avoiding self-report bias.
Profiles 10 small language models (SLMs), finding largely independent axes, within-axis facet dissociations, and a Compliance-Resilience paradox.
Shows RLHF reshapes temperament and that temperament is independent of model size (1.7B-9B).

Why it matters

This paper introduces what the authors describe as the first standardized, behavior-based instrument for measuring AI agent temperament, addressing the lack of tools for comparing dispositional differences between models of similar capability. It shows how training paradigms such as RLHF reshape these traits, and it offers a framework for designing more predictable and robust AI agents.

Original Abstract

AI models of equivalent capability can exhibit fundamentally different behavioral patterns, yet no standardized instrument exists to measure these dispositional differences. Existing approaches either borrow human personality dimensions and rely on self-report (which diverges from actual behavior in LLMs) or treat behavioral variation as a defect rather than a trait. We introduce the Model Temperament Index (MTI), a behavior-based profiling system that measures AI agent temperament across four axes: Reactivity (environmental sensitivity), Compliance (instruction-behavior alignment), Sociality (relational resource allocation), and Resilience (stress resistance). Grounded in the Four Shell Model from Model Medicine, MTI measures what agents do, not what they say about themselves, using structured examination protocols with a two-stage design that separates capability from disposition. We profile 10 small language models (1.7B-9B parameters, 6 organizations, 3 training paradigms) and report five principal findings: (1) the four axes are largely independent among instruction-tuned models (all |r| < 0.42); (2) within-axis facet dissociations are empirically confirmed -- Compliance decomposes into fully independent formal and stance facets (r = 0.002), while Resilience decomposes into inversely related cognitive and adversarial facets; (3) a Compliance-Resilience paradox reveals that opinion-yielding and fact-vulnerability operate through independent channels; (4) RLHF reshapes temperament not only by shifting axis scores but by creating within-axis facet differentiation absent in the unaligned base model; and (5) temperament is independent of model size (1.7B-9B), confirming that MTI measures disposition rather than capability.
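The axis-independence finding (all |r| < 0.42) rests on pairwise correlations between axis scores across the profiled models. The sketch below shows one way such a check could be computed, assuming Pearson correlation and one numeric score per axis per model; the abstract does not state which correlation coefficient or score scale the authors use. The function names, the dictionary layout, and the example scores are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt

# The four MTI axes named in the abstract.
AXES = ("reactivity", "compliance", "sociality", "resilience")


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def pairwise_axis_correlations(profiles):
    """Return {(axis_a, axis_b): r} for every pair of axes.

    `profiles` is a list of dicts, one per model, mapping axis name to score.
    This layout is an assumption made for illustration.
    """
    return {
        (a, b): pearson_r([p[a] for p in profiles], [p[b] for p in profiles])
        for a, b in combinations(AXES, 2)
    }


def axes_look_independent(profiles, threshold=0.42):
    """True if every pairwise |r| is below the threshold.

    The default of 0.42 mirrors the bound reported in the abstract; it is the
    observed maximum there, not a general cutoff for independence.
    """
    return all(abs(r) < threshold
               for r in pairwise_axis_correlations(profiles).values())


if __name__ == "__main__":
    # Made-up placeholder scores, NOT results from the paper.
    example_profiles = [
        {"reactivity": 0.2, "compliance": 0.7, "sociality": 0.5, "resilience": 0.4},
        {"reactivity": 0.8, "compliance": 0.3, "sociality": 0.6, "resilience": 0.9},
        {"reactivity": 0.5, "compliance": 0.9, "sociality": 0.1, "resilience": 0.6},
        {"reactivity": 0.4, "compliance": 0.2, "sociality": 0.8, "resilience": 0.3},
    ]
    for pair, r in pairwise_axis_correlations(example_profiles).items():
        print(pair, round(r, 3))
```

With four axes there are six axis pairs to inspect. With only a handful of models, any single correlation estimate is noisy, so a check like this is better read as a descriptive summary than as a statistical test.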

