Alignment has a Fantasia Problem
Nathanael Jo, Zoe De Simone, Mitchell Gordon, Ashia Wilson
TLDR
AI often fails when users' goals are not yet fully formed; this paper rethinks alignment so that AI helps users refine their intent, bridging machine learning, interface design, and behavioral science.
Key contributions
- Identifies "Fantasia interactions" where AI assumes fully formed user intent, leading to misalignment.
- Argues for rethinking AI alignment to provide cognitive support for users to refine their goals.
- Proposes an interdisciplinary approach combining ML, interface design, and behavioral science.
- Outlines a research agenda for designing AI that helps users navigate task uncertainty.
Why it matters
This paper challenges the assumption, implicit in AI alignment research, that users arrive with fully formed intent. It proposes a paradigm in which AI actively assists users in forming and refining their goals, which the authors argue is essential for building genuinely helpful, aligned systems and for effective human-AI collaboration.
Original Abstract
Modern AI assistants are trained to follow instructions, implicitly assuming that users can clearly articulate their goals and the kind of assistance they need. Decades of behavioral research, however, show that people often engage with AI systems before their goals are fully formed. When AI systems treat prompts as complete expressions of intent, they can appear to be useful or convenient, but not necessarily aligned with the users' needs. We call these failures Fantasia interactions. We argue that Fantasia interactions demand a rethinking of alignment research: rather than treating users as rational oracles, AI should provide cognitive support by actively helping users form and refine their intent through time. This requires an interdisciplinary approach that bridges machine learning, interface design, and behavioral science. We synthesize insights from these fields to characterize the mechanisms and failures of Fantasia interactions. We then show why existing interventions are insufficient, and propose a research agenda for designing and evaluating AI systems that better help humans navigate uncertainty in their tasks.