Response Time Enhances Alignment with Heterogeneous Preferences
Federico Echenique, Alireza Fallah, Baihe Huang, Michael I. Jordan
TLDR
This paper shows that augmenting preference data with user response times lets LLMs be accurately aligned with diverse human preferences, overcoming a fundamental limitation of standard choice-only methods.
Key contributions
- Standard LLM alignment fails to capture diverse human preferences due to data aggregation.
- Augmenting preference data with user response times restores identifiability of average preferences.
- Introduces a novel Drift-Diffusion Model (DDM) based estimator for heterogeneous preferences.
- Empirically outperforms baselines, converging to the true average preference even when each labeler contributes only a single choice.
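To make the role of response times concrete, here is a minimal sketch of the drift-diffusion idea, not the paper's actual estimator: a decision is modeled as noisy evidence drifting toward one of two boundaries, so the choice reveals the sign of the preference while the response time reveals its strength. The function names, boundary parameterization, and the moment-based recovery formula below are illustrative assumptions (the closed-form identities hold for a symmetric DDM with unit noise).

```python
import math
import random

def simulate_ddm(drift, boundary=1.0, sigma=1.0, dt=1e-3, rng=None):
    """Euler-Maruyama simulation of one drift-diffusion decision.

    Evidence x starts at 0 and accumulates with mean rate `drift` plus
    Gaussian noise until it hits +boundary (choice 1) or -boundary
    (choice 0). Returns (choice, response_time)."""
    rng = rng or random
    x, t = 0.0, 0.0
    while abs(x) < boundary:
        x += drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    return (1 if x > 0 else 0), t

def estimate_drift(choices, times, boundary=1.0):
    """Moment-based drift estimate that uses choices AND response times.

    For a symmetric DDM with sigma = 1:
        P(up) - P(down) = tanh(boundary * drift)
        E[T] = (boundary / drift) * tanh(boundary * drift)
    Dividing the two identities gives
        drift = boundary * (2 * P(up) - 1) / E[T],
    so mean response time turns the choice fraction into a drift estimate."""
    p_up = sum(choices) / len(choices)
    mean_t = sum(times) / len(times)
    return boundary * (2.0 * p_up - 1.0) / mean_t
```

A choice-only estimator sees just the fraction `p_up` and must invert a nonlinear (logistic-like) map, which is where aggregation over heterogeneous labelers introduces bias; adding the essentially free mean response time supplies the second moment needed to pin down the drift directly.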
Why it matters
This paper advances the alignment of LLMs with real-world human diversity by leveraging an easily obtainable signal: response time. It enables more accurate and socially beneficial LLM policies without user-level identifiers or repeated elicitations, pointing the way toward improved data-collection pipelines.
Original Abstract
Aligning large language models (LLMs) to human preferences typically relies on aggregating pooled feedback into a single reward model. However, this standard approach assumes that all labelers share the same underlying preferences, ignoring the fact that real-world labelers are highly heterogeneous and usually anonymous. Consequently, relying solely on binary choice data fundamentally distorts the learned policy, making the true population-average preference unidentifiable. To overcome this critical limitation, we demonstrate that augmenting preference datasets with a simple, secondary signal -- the user's response time -- can restore the identifiability of the population's average preference. By modeling each decision as a Drift-Diffusion Model (DDM), we introduce a novel, consistent estimator of heterogeneous preferences that successfully corrects the distortions of standard choice-only labels. We prove that our estimator asymptotically converges to the true average preference even in extreme cases where each anonymous labeler contributes only a single choice. Empirically, across both synthetic and real-world datasets, our method consistently outperforms standard baselines that otherwise fail and plateau at a bias floor. Because response times are essentially free to record and require zero user tracking or identification, our results bring promises and open up new opportunities for future data-collection pipelines to improve the social benefit without requiring user-level identifiers or repeated elicitations.