ArXiv TLDR

Lecture Notes on Statistical Physics and Neural Networks

arXiv:2605.06394

Olaf Hohm

cond-mat.dis-nn, hep-th

TLDR

These lecture notes introduce statistical physics concepts relevant to neural networks and deep learning, including phase transitions and the renormalization group.

Key contributions

  • Presents statistical physics as a branch of probability theory, making phase transitions and the renormalization group accessible without prior physics background.
  • Introduces the Boltzmann-Gibbs distribution, thermodynamic potentials, Ising spins, and spin-glass models (the key formulas are sketched after this list).
  • Explores Hopfield networks, Boltzmann machines, and learning algorithms for restricted Boltzmann machines.
  • Connects these concepts to modern deep learning and provides a description of large language models.
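For quick reference, the central objects in these bullets can be written compactly. The notation below is the standard one and may differ from the notes' own conventions:

```latex
% Boltzmann-Gibbs distribution on a finite configuration space,
% with energy E(s) and inverse temperature \beta:
\[
  P(s) = \frac{e^{-\beta E(s)}}{Z}, \qquad Z = \sum_{s} e^{-\beta E(s)} .
\]

% Ising / spin-glass energy on a lattice of spins s_i \in \{-1,+1\}
% (uniform couplings J_{ij} give the Ising model; random J_{ij}, a spin glass):
\[
  E(s) = -\sum_{\langle ij \rangle} J_{ij}\, s_i s_j - \sum_i h_i\, s_i .
\]

% Free energy per lattice site; phase transitions show up as
% non-analyticities of f in the limit of infinitely many sites N:
\[
  f = -\lim_{N \to \infty} \frac{1}{\beta N} \log Z_N .
\]
```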

Why it matters

These notes bridge classical statistical physics and modern neural networks, making concepts such as phase transitions accessible to readers without a physics background. They offer researchers and students a foundation for exploring deep learning's theoretical underpinnings from an interdisciplinary, physics-informed perspective.

Original Abstract

These lecture notes introduce some topics of classical statistical physics, particularly those that are relevant for neural networks and deep learning. Statistical physics is treated as a branch of probability theory or statistics, with the goal of making concepts such as phase transitions and the renormalization group accessible to readers without prior knowledge of physics. We introduce the Boltzmann-Gibbs distribution and the thermodynamic potentials on a finite configuration space, notably for Ising spins and spin-glass models on a lattice, and then define phase transitions as discontinuities that arise in the limit that the number of lattice points goes to infinity. We further introduce Hopfield networks and Boltzmann machines, which are governed by the same energy function as spin-glass models, and discuss the learning algorithm for restricted Boltzmann machines. In this algorithm hidden neurons are integrated out as in the renormalization group. Finally, modern deep learning is introduced, whose early developments were in part motivated by restricted Boltzmann machines in that they carry many layers of hidden neurons. A description of large language models is given.
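The learning algorithm the abstract mentions for restricted Boltzmann machines is, in its standard form, contrastive divergence. The sketch below is a minimal NumPy illustration of one CD-1 update for a binary RBM, assuming the usual bilinear energy E(v, h) = -v·W·h - b_v·v - b_h·h; the function and variable names are ours, not the notes':

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b_v, b_h, lr=0.1):
    """One contrastive-divergence (CD-1) step for a binary RBM (illustrative)."""
    # Positive phase: hidden probabilities conditioned on the data.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)

    # Negative phase: one Gibbs step (reconstruct visibles, re-infer hiddens).
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)

    # Gradient estimate: <v h>_data minus <v h>_model (one-step approximation).
    batch = v0.shape[0]
    W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / batch
    b_v += lr * (v0 - v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h

# Toy usage: fit a 6-visible / 3-hidden RBM to random binary data.
n_vis, n_hid = 6, 3
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)
data = (rng.random((8, n_vis)) < 0.5).astype(float)
for _ in range(100):
    W, b_v, b_h = cd1_update(data, W, b_v, b_h)
```

The positive and negative phases estimate data and model averages of v·h; the abstract's point is that summing over the hidden units here plays a role analogous to integrating out degrees of freedom in the renormalization group.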
