ArXiv TLDR

Stayin' Aligned Over Time: Towards Longitudinal Human-LLM Alignment via Contextual Reflection and Privacy-Preserving Behavioral Data

arXiv: 2605.04029

Simret Araya Gebreegziabher, Allison E. Sproul, Yinuo Yang, Chaoran Chen, Diego Gómez-Zará, et al.

cs.HC

TLDR

A new framework for longitudinal human-LLM alignment reveals that user preferences change over time, challenging static evaluation methods.

Key contributions

  • Proposes a methodological shift from single-moment to longitudinal LLM alignment evaluation.
  • Introduces a framework combining in-situ preference capture, follow-up reflection, and behavioral traces.
  • Presents BITE, a browser-based system for detecting consequential LLM interactions and prompting reflection.
  • A two-week deployment study with 8 participants surfaced differences between immediate and later user preferences across dimensions of LLM output such as accuracy and relevance.
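The three-part signal the framework collects — an in-situ rating, a context-triggered delayed reflection, and consented behavioral traces — could be modeled roughly as below. This is a minimal sketch under assumed names; the paper does not publish BITE's actual schema, so every field and method here is illustrative.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PreferenceRecord:
    """Hypothetical record pairing an immediate rating with a later reflection.

    Field names are illustrative assumptions, not BITE's real data model.
    """
    interaction_id: str
    immediate_rating: dict                      # captured in situ, e.g. {"accuracy": 4}
    delayed_rating: Optional[dict] = None       # filled at a later, context-triggered reflection
    behavioral_trace: list = field(default_factory=list)  # privacy-preserving events, shared by consent

    def preference_drift(self) -> dict:
        """Per-dimension change between the delayed and the immediate rating."""
        if self.delayed_rating is None:
            return {}
        return {
            dim: self.delayed_rating[dim] - self.immediate_rating[dim]
            for dim in self.immediate_rating
            if dim in self.delayed_rating
        }

# A user rates an answer highly at first, then revises "accuracy" downward
# after seeing real-world consequences of acting on it.
rec = PreferenceRecord("chat-001", {"accuracy": 4, "relevance": 5})
rec.delayed_rating = {"accuracy": 2, "relevance": 5}
print(rec.preference_drift())  # {'accuracy': -2, 'relevance': 0}
```

The nonzero drift on one dimension and zero on another is exactly the kind of signal a single-moment preference dataset would miss.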

Why it matters

Most current LLM alignment methods implicitly assume user preferences are static. This paper challenges that assumption with a longitudinal framework and an accompanying system (BITE) that capture how preferences evolve after real-world consequences are observed. Moving beyond immediate feedback is a step toward alignment evaluation that reflects how LLM-mediated decisions actually play out in everyday use.

Original Abstract

Current human-AI alignment and evaluation methods for large language models (LLMs) often rely on preference signals collected immediately after an interaction. This practice implicitly treats preference as static, even though many LLM-mediated decisions unfold over time and may be re-evaluated differently after real-world consequences and observed outcomes. Therefore, we argue for a methodological shift from single-moment preference elicitation to longitudinal, context-situated alignment measurement. We present a methodological framework for collecting temporally grounded alignment signals by combining (1) in-situ preference capture, (2) context-triggered follow-up preference reflection, and (3) privacy-preserving behavioral traces that help interpret preference change. As an instantiation of this methodology, we introduce BITE, a browser-based system that detects consequential LLM interactions, prompts reflection across later decision points, and supports progressive, user-controlled consent for sharing behavioral data. Through a two-week longitudinal deployment study with 8 participants, our approach surfaced differences between immediate and later user preferences in accuracy, relevance, and other dimensions of the LLM output. Our findings highlight the limitations of single-moment preference datasets and underscore the importance of longitudinal methods for alignment evaluation in everyday use.
