ArXiv TLDR

Understanding Self-Supervised Learning via Latent Distribution Matching

arXiv:2605.03517

Fabian A Mikulasch, Friedemann Zenke

cs.LG, stat.ML

TLDR

This paper introduces Latent Distribution Matching (LDM) as a unifying theoretical framework for Self-Supervised Learning (SSL), explaining diverse methods and guiding new designs.

Key contributions

  • Introduces Latent Distribution Matching (LDM) as a unifying theoretical framework for SSL.
  • Casts SSL as maximizing latent log-probability under an assumed latent model (alignment) while maximizing latent entropy to prevent collapse (uniformity).
  • Unifies diverse SSL methods, including contrastive, non-contrastive, predictive, and ICA approaches, under LDM.
  • Derives a nonlinear, sampling-free Bayesian filtering model for high-dimensional timeseries.
  • Proves that predictive LDM yields identifiable latent representations under mild assumptions, even with nonlinear predictors.
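To make the alignment-plus-uniformity idea concrete, here is a minimal toy sketch of an LDM-style objective on a batch of latents. It assumes a standard-normal latent prior for the alignment term and a Gaussian fit to the batch covariance as the entropy estimate; both are illustrative choices for this sketch, not the paper's exact model or estimator.

```python
import numpy as np

def ldm_objective(z, eps=1e-6):
    """Toy LDM-style objective on a batch of latents z, shape [n, d].

    Alignment: mean log-density of z under an assumed standard-normal
    latent model. Uniformity: entropy of a Gaussian fit to the batch
    covariance, which penalizes collapse. Both terms are maximized.
    """
    n, d = z.shape
    # Alignment: mean log N(z; 0, I) over the batch.
    alignment = np.mean(-0.5 * np.sum(z**2, axis=1) - 0.5 * d * np.log(2 * np.pi))
    # Uniformity: Gaussian entropy estimate from the batch covariance
    # (eps regularizes the log-determinant against singular covariances).
    cov = np.cov(z, rowvar=False) + eps * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    entropy = 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)
    return alignment + entropy

# Collapsed latents lose far more on the entropy term than they gain
# on alignment, so the objective prefers spread-out representations.
rng = np.random.default_rng(0)
spread = rng.normal(size=(256, 8))
collapsed = 0.01 * spread
print(ldm_objective(spread) > ldm_objective(collapsed))
```

The entropy term is what distinguishes this from plain likelihood maximization: without it, mapping every input to the prior's mode would be optimal, which is exactly the collapse that uniformity prevents.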

Why it matters

Self-supervised learning (SSL) excels at extracting general-purpose representations from complex data, but it has lacked a unifying theoretical framework. Latent Distribution Matching (LDM) fills this gap: it clarifies the assumptions behind established SSL methods, offers principled guidance for designing new approaches, and yields practical models such as a novel Bayesian filter for high-dimensional timeseries.

Original Abstract

Self-supervised learning (SSL) excels at finding general-purpose latent representations from complex data, yet lacks a unifying theoretical framework that explains the diverse existing methods and guides the design of new ones. We cast SSL as latent distribution matching (LDM): learning representations that maximize their log-probability under an assumed latent model (alignment), while maximizing latent entropy to prevent collapse (uniformity). This view unifies independent component analysis with contrastive, non-contrastive, and predictive SSL methods, including stop gradient approaches. Leveraging LDM, we derive a nonlinear, sampling-free Bayesian filtering model with a Kalman-based predictor for high-dimensional timeseries. We further prove that predictive LDM yields identifiable latent representations under mild assumptions, even with nonlinear predictors. Overall, LDM clarifies the assumptions behind established SSL methods and provides principled guidance for developing new approaches.
