ArXiv TLDR

Understanding In-Context Learning for Nonlinear Regression with Transformers: Attention as Featurizer

arXiv: 2605.05176

Alexander Hsu, Zhaiming Shen, Wenjing Liao, Rongjie Lai

cs.LG, math.NA

TLDR

This paper gives a theoretical analysis of in-context learning in transformers for nonlinear regression, showing that attention acts as a featurizer that realizes nonlinear features such as polynomial or spline bases.

Key contributions

  • Explicitly constructs transformer networks that use the attention mechanism to realize specific nonlinear features, such as polynomial or spline bases (see the toy sketch after this list).
  • Develops a theoretical framework to analyze end-to-end in-context nonlinear regression with the constructed features.
  • Provides finite-sample generalization error bounds in terms of context length and training set size.
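
The following is a minimal, hypothetical sketch of the intuition behind the first bullet, not the paper's actual construction: with un-normalized linear attention and identity projections, the query-key interaction multiplies token coordinates together, so each output entry is already a degree-3 polynomial in the input coordinates. All shapes and projection choices here are illustrative assumptions.

```python
import numpy as np

def linear_attention(X, W_Q, W_K, W_V):
    """Un-normalized linear attention: out_i = sum_j <x_i W_Q, x_j W_K> * (x_j W_V).

    The query-key inner product multiplies token coordinates together, so the
    output entries are polynomials (here degree 3) in the input coordinates.
    """
    Q, K, V = X @ W_Q, X @ W_K, X @ W_V
    return (Q @ K.T) @ V

rng = np.random.default_rng(0)
d = 2                                  # input dimension (illustrative)
X = rng.normal(size=(5, d))            # 5 tokens, each a point in R^2

I = np.eye(d)                          # identity projections for illustration
out = linear_attention(X, I, I, I)     # out[i] = sum_j <x_i, x_j> * x_j

# Check one row against the explicit cubic-monomial expansion.
manual = sum(np.dot(X[0], X[j]) * X[j] for j in range(5))
assert np.allclose(out[0], manual)
print(out.shape)                       # (5, 2)
```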

Why it matters

This paper bridges a gap in the theoretical understanding of in-context learning by extending the analysis from linear models to nonlinear regression. By showing how attention can form complex nonlinear features, it provides a foundation for further theory and for practical applications of ICL beyond linear models.

Original Abstract

Pre-trained transformers are able to learn from examples provided as part of the prompt without any weight updates, a remarkable ability known as in-context learning (ICL). Despite its demonstrated efficacy across various domains, the theoretical understanding of ICL is still developing. Whereas most existing theory has focused on linear models, we study ICL in the nonlinear regression setting. Through the interaction mechanism in attention, we explicitly construct transformer networks to realize nonlinear features, such as polynomial or spline bases, which span a wide class of functions. Based on this construction, we establish a framework to analyze end-to-end in-context nonlinear regression with the constructed features. Our theory provides finite-sample generalization error bounds in terms of context length and training set size. We numerically validate the theory on synthetic regression tasks.
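
To make the setting concrete, here is a small, hypothetical sketch of the kind of synthetic in-context regression task the abstract describes: a context of (x_i, y_i) pairs drawn from a random polynomial target is fit by least squares over a hand-coded polynomial feature map, which stands in for the transformer's learned featurizer, and the fit is used to predict at a query point. The basis degree, context length, and noise level are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def poly_features(x, degree=3):
    """Polynomial basis [1, x, x^2, ..., x^degree] for scalar inputs."""
    return np.stack([x**k for k in range(degree + 1)], axis=-1)

# Synthetic task: context pairs (x_i, y_i) from a random cubic target plus noise.
coeffs = rng.normal(size=4)                       # random target function
x_ctx = rng.uniform(-1.0, 1.0, size=32)           # context length n = 32 (arbitrary)
y_ctx = poly_features(x_ctx) @ coeffs + 0.01 * rng.normal(size=32)

# "In-context" fit: least squares over the polynomial basis, then predict at a query.
Phi = poly_features(x_ctx)
w_hat, *_ = np.linalg.lstsq(Phi, y_ctx, rcond=None)
x_query = np.array([0.5])
y_pred = poly_features(x_query) @ w_hat
y_true = poly_features(x_query) @ coeffs
print(y_pred[0], y_true[0])                       # prediction vs. noiseless target
```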
