Superposition Is Not Necessary: A Mechanistic Interpretability Analysis of Transformer Representations for Time Series Forecasting
TLDR
Transformers for time series forecasting don't rely on superposition: their FFN representations stay sparse, unlike in NLP models, which helps explain why simple linear models remain competitive.
Key contributions
- A single-layer, narrow transformer matches deeper models' forecasting performance.
- Sparse Autoencoders (SAEs) reveal sparse and stable FFN representations in PatchTST.
- No empirical evidence of strong superposition in time series transformer representations.
- Targeted causal interventions on latent features minimally perturb forecasts.
Why it matters
This paper shows that transformer success in time series forecasting doesn't require the superposition thought to underpin transformer power in NLP. That absence helps explain why simple linear models remain competitive: standard time series tasks may not demand rich compositional representations.
Original Abstract
Transformer architectures have been widely adopted for time series forecasting, yet whether the representational mechanisms that make them powerful in NLP actually engage on time series data remains unexplored. The persistent competitiveness of simple linear models such as DLinear has fueled ongoing debate, but no mechanistic explanation for this phenomenon has been offered. We address this gap by applying sparse autoencoders (SAEs), a tool from mechanistic interpretability, to probe the internal representations of PatchTST. We first establish that a single-layer, narrow-dimensional transformer matches the forecasting performance of deeper configurations across commonly used benchmarks. We then train SAEs on the post-GELU intermediate FFN activations with dictionary sizes ranging from 0.5x to 4.0x the native dimensionality. Expanding the dictionary yields negligible downstream performance change (average 0.214%), with large portions of overcomplete dictionaries remaining inactive. Targeted causal interventions on dominant latent features produce minimal forecast perturbation. Across all evaluated settings, we observe no empirical evidence that the analyzed FFN representations rely on strong superposition. Instead, the representations remain sparse, stable under aggressive dictionary expansion, and largely insensitive to latent interventions. These results demonstrate that superposition is not necessary for competitive performance on standard forecasting benchmarks, suggesting these benchmarks may not demand the rich compositional representations that drive transformer success in language modeling, and helping explain the persistent competitiveness of simple linear models.
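To make the SAE setup concrete, here is a minimal sketch of the standard ReLU-SAE-with-L1-penalty recipe applied to FFN activations, including the dead-feature check the abstract alludes to. The architecture, hyperparameters, and the random stand-in for PatchTST's post-GELU activations are assumptions for illustration, not the paper's exact configuration; in practice the activations would be captured with a forward hook on the GELU inside PatchTST's FFN.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """ReLU SAE with an L1 sparsity penalty: the common dictionary-learning
    setup in mechanistic interpretability. The paper's exact recipe may differ."""

    def __init__(self, d_ffn: int, expansion: float):
        super().__init__()
        d_dict = int(d_ffn * expansion)   # dictionary size: 0.5x to 4.0x of d_ffn
        self.encoder = nn.Linear(d_ffn, d_dict)
        self.decoder = nn.Linear(d_dict, d_ffn)

    def forward(self, x):
        z = torch.relu(self.encoder(x))   # sparse latent code
        return self.decoder(z), z         # reconstruction and code

def sae_loss(x, x_hat, z, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty that pushes codes toward sparsity.
    return (x - x_hat).pow(2).mean() + l1_coeff * z.abs().mean()

# Stand-in for post-GELU intermediate FFN activations (illustrative only).
acts = torch.randn(4096, 128)

sae = SparseAutoencoder(d_ffn=128, expansion=4.0)  # 4.0x overcomplete dictionary
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)

for step in range(200):
    x_hat, z = sae(acts)
    loss = sae_loss(acts, x_hat, z)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Fraction of dictionary features that never fire: inactive ("dead") features
# in an overcomplete dictionary are one sign that superposition is absent.
with torch.no_grad():
    _, z = sae(acts)
    dead = ((z > 0).float().mean(dim=0) == 0).float().mean()
print(f"loss={loss.item():.4f}  dead-feature fraction={dead.item():.2%}")
```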
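The causal interventions can be sketched the same way, continuing from the code above: ablate the most active dictionary feature, decode the edited code back into activation space, and compare against the unedited reconstruction. The relative-drift metric below is an illustrative proxy of my own; the paper measures the perturbation of the downstream forecast, which requires patching the edited activation back into PatchTST's forward pass (for example via a forward hook), a step omitted here because it depends on the specific implementation.

```python
@torch.no_grad()
def ablate_latent(sae, acts, feature_idx):
    """Zero out one SAE latent and decode back to activation space."""
    z = torch.relu(sae.encoder(acts))
    z[..., feature_idx] = 0.0                 # knock out the targeted feature
    return sae.decoder(z)

with torch.no_grad():
    z = torch.relu(sae.encoder(acts))
    recon = sae.decoder(z)                    # unedited reconstruction
    dominant = z.mean(dim=0).argmax().item()  # most active ("dominant") feature
    patched = ablate_latent(sae, acts, dominant)
    # Small drift here is consistent with the paper's finding that forecasts
    # barely change when dominant latent features are ablated.
    drift = (patched - recon).norm() / recon.norm()
print(f"ablated feature #{dominant}, relative drift = {drift:.4f}")
```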