Dual Alignment Between Language Model Layers and Human Sentence Processing
Tatsuki Kuribayashi, Alex Warstadt, Yohei Oseki, Ethan Gotlieb Wilcox
TLDR
This paper shows that later LLM layers better model the human cognitive effort observed on syntactically challenging constructions, while earlier layers suffice for naturalistic reading.
Key contributions
- Investigates which LLM layers' surprisal best aligns with human cognitive effort during syntactic ambiguity processing (see the layer-wise sketch after this list).
- Shows that later LLM layers better estimate human effort on syntactically challenging constructions, though they still underestimate the human data.
- Proposes a dual alignment: early LLM layers for naturalistic reading, later layers for challenging syntax.
- Explores new probability-update measures that combine shallow and deep layers, showing a complementary advantage over single-layer surprisal in reading-time modeling (sketched after the abstract below).
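
To make the layer-wise setup concrete, here is a minimal sketch of computing per-layer surprisal with a logit-lens style readout, i.e., decoding each intermediate hidden state through the model's final layer norm and unembedding. The model choice (`gpt2`) and this particular readout are illustrative assumptions, not the paper's exact probing setup.

```python
# A minimal sketch of per-layer surprisal via a logit-lens readout:
# decode each intermediate hidden state with the model's final layer
# norm and unembedding. "gpt2" and this readout are illustrative
# assumptions; the paper's exact setup may differ.
import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# A classic garden-path sentence (syntactic ambiguity).
ids = tokenizer("The horse raced past the barn fell.",
                return_tensors="pt").input_ids

with torch.no_grad():
    out = model(ids, output_hidden_states=True)

# out.hidden_states holds the embedding output plus all 12 block outputs.
for layer, h in enumerate(out.hidden_states):
    logits = model.lm_head(model.transformer.ln_f(h))
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    # Surprisal of each token given its prefix, converted to bits.
    nll = -log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    print(f"layer {layer:2d}  mean surprisal: {nll.mean() / math.log(2):.2f} bits")
```

These per-layer surprisals are the quantities one would regress against reading times to reproduce the kind of layer-by-layer comparison the paper reports.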
Why it matters
This paper refines our understanding of how LLMs align with human sentence processing across syntactic constructions of varying difficulty. It highlights that different LLM layers mimic distinct modes of human processing, paving the way for more cognitively plausible language models.
Original Abstract
A recent study (Kuribayashi et al., 2025) has shown that human sentence processing behavior, typically measured on syntactically unchallenging constructions, can be effectively modeled using surprisal from early layers of large language models (LLMs). This raises the question of whether such advantages of internal layers extend to more syntactically challenging constructions, where surprisal has been reported to underestimate human cognitive effort. In this paper, we begin by exploring internal layers that better estimate human cognitive effort observed in syntactic ambiguity processing in English. Our experiments show that, in contrast to naturalistic reading, later layers better estimate such a cognitive effort, but still underestimate the human data. This dual alignment sheds light on different modes of sentence processing in humans and LMs: naturalistic reading employs a somewhat weak prediction akin to earlier layers of LMs, while syntactically challenging processing requires more fully-contextualized representations, better modeled by later layers of LMs. Motivated by these findings, we also explore several probability-update measures using shallow and deep layers of LMs, showing a complementary advantage to single-layer's surprisal in reading time modeling.
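
The abstract's probability-update measures are not fully specified here; one natural instantiation, shown below as an assumption rather than the authors' definition, is the KL divergence from a shallow layer's next-token distribution to a deep layer's, which quantifies how much further contextualization shifts the model's predictions.

```python
# A hedged sketch of one possible probability-update measure: KL divergence
# from an early layer's next-token distribution to a late layer's. The
# choice of KL, and of which layers to compare, is an illustrative
# assumption, not the paper's exact definition.
import torch

def probability_update(log_p_shallow: torch.Tensor,
                       log_p_deep: torch.Tensor) -> torch.Tensor:
    """Per-position KL(deep || shallow) for log-prob tensors of shape
    [..., seq, vocab], e.g. the logit-lens outputs from the sketch above."""
    p_deep = log_p_deep.exp()
    return (p_deep * (log_p_deep - log_p_shallow)).sum(dim=-1)

# Example with random distributions (placeholders for real layer readouts).
log_p_shallow = torch.randn(1, 8, 50257).log_softmax(dim=-1)
log_p_deep = torch.randn(1, 8, 50257).log_softmax(dim=-1)
print(probability_update(log_p_shallow, log_p_deep).shape)  # torch.Size([1, 8])
```

A measure like this can enter a reading-time regression alongside single-layer surprisal, which is one way to realize the complementary advantage the abstract describes.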