Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space
Eric Bigelow, Raphaël Sarfati, Daniel Wurgaft, Owen Lewis, Thomas McGrath + 3 more
TLDR
LLMs update beliefs over a low-dimensional conceptual space: in-context learning traces trajectories through this space, grounding Bayesian interpretations in structured representations.
Key contributions
- Belief updates in LLMs form trajectories on low-dimensional, structured conceptual manifolds.
- This conceptual structure is consistently reflected in both model behavior and internal representations.
- Simple linear probes can decode internal representations to predict LLM behavior.
- Interventions on representations causally steer belief trajectories, predictable from conceptual space geometry.
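As a toy illustration of contribution (3) on linear probes, the sketch below fits a ridge-regression probe from stand-in hidden states to a scalar "belief" signal. Everything here (dimensions, the synthetic data, the ridge penalty) is an assumption for illustration, not the paper's actual setup or model.

```python
import numpy as np

# Hypothetical setup: hidden states from one LLM layer (n examples x d_model)
# and a scalar "belief" target per example (e.g., probability assigned to a
# hypothesis). Both are synthetic stand-ins here.
rng = np.random.default_rng(0)
d_model, n = 64, 200
w_true = rng.normal(size=d_model)            # assumed ground-truth direction
H = rng.normal(size=(n, d_model))            # stand-in hidden states
y = H @ w_true + 0.1 * rng.normal(size=n)    # stand-in belief signal + noise

# A "simple linear probe": closed-form ridge regression from states to beliefs.
lam = 1.0
w_probe = np.linalg.solve(H.T @ H + lam * np.eye(d_model), H.T @ y)

preds = H @ w_probe
r2 = 1.0 - np.sum((y - preds) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"probe R^2: {r2:.3f}")
```

If the belief signal really is linearly decodable from the representations, a probe like this recovers it with high R^2; on this synthetic data the fit is near-perfect by construction.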
Why it matters
This paper offers a geometric framework for understanding how LLMs update beliefs during in-context learning. It reveals that this process occurs within a structured, low-dimensional conceptual space, providing a concrete basis for Bayesian interpretations.
Original Abstract
Large Language Models (LLMs) update their behavior in context, which can be viewed as a form of Bayesian inference. However, the structure of the latent hypothesis space over which this inference operates remains unclear. In this work, we propose that LLMs assign beliefs over a low-dimensional geometric space - a conceptual belief space - and that in-context learning corresponds to a trajectory through this space as beliefs are updated over time. Using story understanding as a natural setting for dynamic belief updating, we combine behavioral and representational analyses to study these trajectories. We find that (1) belief updates are well-described as trajectories on low-dimensional, structured manifolds; (2) this structure is reflected consistently in both model behavior and internal representations and can be decoded with simple linear probes to predict behavior; and (3) interventions on these representations causally steer belief trajectories, with effects that can be predicted from the geometry of the conceptual space. Together, our results provide a geometric account of belief dynamics in LLMs, grounding Bayesian interpretations of in-context learning in structured conceptual representations.
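The abstract's third finding, that interventions steer beliefs predictably from the geometry, can be illustrated with a minimal sketch: if beliefs are read out linearly along a probe direction, adding that direction to a hidden state shifts the decoded belief by an amount fixed by the intervention strength. The direction, readout, and dimensions below are assumptions for illustration, not the authors' method.

```python
import numpy as np

# Assumed setup: a unit "belief" direction w and a linear readout along it.
rng = np.random.default_rng(1)
d_model = 64
w = rng.normal(size=d_model)
w /= np.linalg.norm(w)              # unit-norm belief direction (assumed)

def belief(x):
    """Decode a scalar belief as the projection onto w."""
    return x @ w

h = rng.normal(size=d_model)        # a stand-in hidden state
alpha = 2.0                         # intervention strength
h_steered = h + alpha * w           # intervene: push the state along w

# Because the readout is linear and w has unit norm, the decoded belief
# shifts by exactly alpha, regardless of the original state h.
shift = belief(h_steered) - belief(h)
print(f"decoded belief shift: {shift:.3f}")
```

The point of the sketch is the predictability: under a linear-readout assumption, the effect of an intervention is determined by its projection onto the belief direction, which is the kind of geometric prediction the abstract describes.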