ArXiv TLDR

Linear equivalence of nonlinear recurrent neural networks

arXiv: 2604.23489

David G. Clark

cond-mat.dis-nn, q-bio.NC

TLDR

This paper derives, via the cavity method, that at large $N$ the covariance matrix of a nonlinear recurrent neural network takes the same form as that of a linear network with the same couplings driven by independent noise, extending linear equivalence from feedforward to recurrent systems.

Key contributions

  • Derives the linear-equivalence ansatz for the RNN covariance matrix using the two-site cavity method, via two complementary derivations.
  • Shows that cross-covariances between nonlinear residuals at distinct sites are strongly suppressed, so the residuals act as independent noise within a linear-network decomposition.
  • Writes a self-consistent matrix equation for the covariance matrix, showing that a naive Gaussian closure gives the wrong equation and that separating Gaussian and non-Gaussian contributions, which enter at the same order, yields the correct one.
  • Numerically verifies the predictions across a range of network sizes (a simulation sketch in this spirit follows this list).
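
The check below is a minimal sketch, not the paper's procedure: it assumes a standard rate model $\dot{x}_i = -x_i + \sum_j J_{ij}\tanh(x_j)$ with Gaussian couplings, estimates the effective gain empirically rather than from mean-field order parameters, and tests the residuals-as-independent-noise picture by permuting the recorded residual traces across units before driving the corresponding linear network.

```python
import numpy as np

# Sketch parameters (all choices here are illustrative assumptions, not the paper's).
rng = np.random.default_rng(0)
N, g, dt = 400, 1.5, 0.05            # network size, coupling gain, Euler step
n_burn, n_keep, n_skip = 2000, 8000, 2000

# Random Gaussian couplings J_ij ~ N(0, g^2 / N).
J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))

# --- simulate the nonlinear rate network dx/dt = -x + J tanh(x) ----------------
x = rng.normal(size=N)
X = np.empty((n_keep, N))
for t in range(n_burn + n_keep):
    x = x + dt * (-x + J @ np.tanh(x))
    if t >= n_burn:
        X[t - n_burn] = x
C_nonlin = np.cov(X[n_skip:].T)      # empirical N x N covariance of the nonlinear net

# --- decompose the drive: tanh(x) = a*x + r, with a an empirical effective gain
a = np.mean(1.0 - np.tanh(X) ** 2)   # crude stand-in for the mean-field gain
R = np.tanh(X) - a * X               # residual traces, shape (n_keep, N)
R = R[:, rng.permutation(N)]         # permute units so residuals act as independent noise

# --- simulate the linear surrogate dy/dt = -y + a*J y + J r --------------------
y = X[0].copy()
Y = np.empty((n_keep, N))
for t in range(n_keep):
    y = y + dt * (-y + a * (J @ y) + J @ R[t])
    Y[t] = y
C_lin = np.cov(Y[n_skip:].T)         # covariance of the linear surrogate

# --- compare the two covariance matrices ----------------------------------------
rel_err = np.linalg.norm(C_nonlin - C_lin) / np.linalg.norm(C_nonlin)
iu = np.triu_indices(N, k=1)
offdiag_corr = np.corrcoef(C_nonlin[iu], C_lin[iu])[0, 1]
print(f"relative Frobenius error:  {rel_err:.3f}")
print(f"off-diagonal correlation:  {offdiag_corr:.3f}")
```

If the linear-equivalence picture holds, the off-diagonal correlation should approach 1 and the Frobenius error shrink as $N$ grows; the paper's own verification uses mean-field order parameters rather than these empirical proxies.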

Why it matters

Understanding the collective activity of large nonlinear RNNs matters in neuroscience, machine learning, ecology, and beyond. This paper provides an analytical framework, linear equivalence, that characterizes the full high-dimensional covariance structure for a specific realization of the couplings, not just low-dimensional summary statistics. It extends a tool previously established for feedforward systems to recurrent networks, where the activities depend on the same couplings that generate them.

Original Abstract

Large nonlinear recurrent neural networks with random couplings generate high-dimensional, potentially chaotic activity whose structure is of interest in neuroscience, machine learning, ecology, and other fields. A fundamental object encoding the collective structure of this activity is the $N \times N$ covariance matrix. Prior analytical work on the covariance matrix has been limited to low-dimensional summary statistics, not the full high-dimensional object for a specific realization of the couplings. Recent work proposed an ansatz in which, at large $N$, the covariance matrix for a typical quenched realization takes the same form as that of a linear network with the same couplings, driven by independent noise, with mean-field order parameters setting the effective transfer function and the noise spectrum. Here, we derive this ansatz using the two-site cavity method, providing two distinct derivations that offer complementary perspectives. The first decomposes each unit's activity into a linear component and a nonlinear residual, and shows that cross-covariances between residuals at distinct sites are strongly suppressed, so that residuals act as independent noise within a linear network. The second writes a self-consistent matrix equation for the covariance matrix. A naive Gaussian closure for the joint statistics of activity at distinct sites gives the wrong equation; the cavity method separates Gaussian and non-Gaussian contributions, which enter at the same order, and produces the correct one. We verify the predictions numerically across a range of network sizes. These results extend linear equivalence from feedforward high-dimensional nonlinear systems, where the weights being analyzed are independent of their inputs, to recurrent networks, where the activities depend on the same couplings that generate them.
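
To make the ansatz concrete, here is a schematic reading (not the paper's notation: the dynamics $\dot{x}_i = -x_i + \sum_j J_{ij}\phi(x_j)$, the effective gain $a$, and the residual power spectrum $S_r(\omega)$ are stand-ins for the quantities fixed by the mean-field order parameters). Writing $\phi(x_j) = a\,x_j + r_j$ and treating the residuals $r_j$ as independent noise, the linear network $\dot{x} = -x + aJx + Jr$ has covariance

$$
C \;\approx\; \int \frac{d\omega}{2\pi}\,\bigl[(1+i\omega)I - aJ\bigr]^{-1} J\, S_r(\omega)\, J^{\top}\,\bigl[(1-i\omega)I - aJ^{\top}\bigr]^{-1},
$$

which, by the linear-equivalence claim, matches the $N \times N$ covariance matrix of the nonlinear network for a typical realization of $J$ at large $N$.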

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.