ArXiv TLDR

Sharp Capacity Thresholds in Linear Associative Memory: From Winner-Take-All to Listwise Retrieval

arXiv:2605.05189

Nicholas Barnfield, Juno Kim, Eshaan Nichani, Jason D. Lee, Yue M. Lu

stat.ML · cs.IT · cs.LG

TLDR

This paper establishes sharp capacity thresholds for linear associative memories: $d^2 \asymp n \log n$ for winner-take-all (top-1) retrieval and $d^2 \asymp n$ for listwise retrieval.

Key contributions

  • Proves linear associative memory capacity scales as $d^2 \asymp n \log n$ for winner-take-all retrieval.
  • Introduces the Tail-Average Margin (TAM) for listwise retrieval, where the correct target need only rank among the strongest candidates (see the sketch after this list).
  • Demonstrates listwise retrieval with TAM achieves a higher capacity of $d^2 \asymp n$.
  • Develops an exact asymptotic theory for TAM, predicting critical loads and score distributions.
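
The paper's precise TAM definition is in the full text; as a rough illustration, here is a minimal sketch assuming TAM measures the gap between the true score and the average of the top $q$-fraction (the upper tail) of competitor scores. That upper-tail average is convex in the scores, consistent with the abstract's description of TAM as a convex upper-tail criterion; the function name and the parameter `q` are illustrative choices, not the paper's notation.

```python
import numpy as np

def tail_average_margin(true_score, competitor_scores, q=0.05):
    """Hypothetical TAM: true score minus the mean of the top q-fraction
    (the upper tail) of competitor scores. A positive value certifies
    that the true target scores above the average of its strongest
    competitors, i.e. it sits inside a candidate list controlled by q.
    This is an illustrative reading, not the paper's exact formula."""
    k = max(1, int(np.ceil(q * len(competitor_scores))))
    upper_tail = np.sort(competitor_scores)[-k:]  # k largest competitor scores
    return true_score - upper_tail.mean()

# Example: a true score of 1.0 against 99 standard-normal competitors
rng = np.random.default_rng(0)
print(tail_average_margin(1.0, rng.normal(size=99), q=0.05))
```

Unlike the top-1 margin (true score minus the single largest competitor), this averaged criterion does not pay the extreme-value price of beating the maximum, which is the intuition behind the milder $d^2 \asymp n$ scale.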

Why it matters

This work clarifies the fundamental limits of linear associative memories, linking capacity directly to retrieval criteria. It provides a new theoretical framework for understanding memory performance beyond simple winner-take-all, with implications for neural networks and machine learning.

Original Abstract

How many key-value associations can a $d\times d$ linear memory store? We show that the answer depends not only on the $d^2$ degrees of freedom in the memory matrix, but also on the retrieval criterion. In an isotropic Gaussian model for the stored pairs, we show that top-1 retrieval, where every signal must beat its largest distractor, requires the logarithmic model-size scale $d^2\asymp n\log n$. We prove that the correlation matrix memory construction, which stores associations by superposing key-target outer products, achieves this scale through a sharp phase transition, and that the same scaling is necessary for any linear memory. Thus the logarithm is the intrinsic extreme-value price of winner-take-all decoding. We next consider listwise retrieval, where the correct target need not be the unique top-scoring item but should remain among the strongest candidates. To formalize this regime, we propose the Tail-Average Margin (TAM), a convex upper-tail criterion that certifies inclusion of the correct target in a controlled candidate list. Under this listwise retrieval criterion, the capacity follows the quadratic scale $d^2\asymp n$. At load $n/d^2\to\alpha$, we develop an exact asymptotic theory for the TAM empirical-risk minimizer through a two-parameter scalar variational principle. The theory has a rich phenomenology: in the ridgeless limit it yields a closed-form critical load separating satisfiable and unsatisfiable phases, and it predicts the limiting laws of true scores, competitor scores, margins, and percentile profiles. Finally, a small-tail extrapolation further leads to the conjectural sharp top-1 threshold $d^2\sim 2n\log n$.
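
The correlation matrix memory construction in the abstract is concrete enough to simulate directly. Below is a minimal numpy sketch under the isotropic Gaussian model: store $M = \sum_i v_i k_i^\top$, query with each key, and check winner-take-all retrieval. The dimensions `d`, `n` and the $1/\sqrt{d}$ normalization are illustrative choices, not the paper's exact experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 100, 500                            # memory dimension, stored pairs

# Isotropic Gaussian keys and targets with unit expected norm
K = rng.normal(size=(n, d)) / np.sqrt(d)   # keys k_1, ..., k_n (rows)
V = rng.normal(size=(n, d)) / np.sqrt(d)   # targets v_1, ..., v_n (rows)

# Correlation matrix memory: M = sum_i v_i k_i^T (superposed outer products)
M = V.T @ K                                # shape (d, d)

# Winner-take-all retrieval: query with k_i, score every stored target
scores = (K @ M.T) @ V.T                   # scores[i, j] = v_j^T M k_i
top1 = np.mean(scores.argmax(axis=1) == np.arange(n))
print(f"load n/d^2 = {n / d**2:.3f}, top-1 accuracy = {top1:.3f}")
```

Sweeping `n` at fixed `d` is a quick way to see the crosstalk grow with the load $n/d^2$ and top-1 accuracy degrade, in line with the paper's claim that reliable winner-take-all retrieval needs the stricter scale $d^2 \asymp n \log n$.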
