ArXiv TLDR

Self-Supervised Representation Learning via Hyperspherical Density Shaping

arXiv:2604.24498

Esteban Rodríguez-Betancourt, Edgar Casasola-Murillo

cs.CV

TLDR

HyDeS is a theoretically grounded self-supervised learning method that maximizes multi-view mutual information on the hypersphere via non-parametric von Mises-Fisher density estimation.

Key contributions

  • Proposes HyDeS, a theoretically grounded self-supervised learning method.
  • Maximizes multi-view mutual information in hyperspherical space using von Mises-Fisher density.
  • Performs well on segmentation tasks such as PASCAL VOC by biasing the model toward foreground features.
  • Analyzes latent space geometry and learning dynamics to inform future SSL designs.

Why it matters

This paper introduces HyDeS, a theoretically grounded approach to self-supervised learning, addressing the purely empirical nature of many existing methods. It offers a principled way to maximize mutual information in hyperspherical space, performing well on segmentation while lagging in fine-grained classification. The detailed analysis of its latent space can inform the design of future, more robust SSL techniques.

Original Abstract

Modern self-supervised representation learning methods often rely on empirical heuristics that are not theoretically grounded. In this study we propose HyDeS, a theoretically grounded method based on multi-view mutual information maximization within a hyperspherical space, using Shannon differential entropy with a non-parametric von Mises-Fisher density estimator. We show that HyDeS biases the trained model toward foreground image features and performs well on segmentation tasks such as PASCAL VOC, while it lags in fine-grained classification. We provide a detailed analysis of the induced latent-space geometry and learning dynamics, which can inform the design of other theoretically grounded self-supervised learning methods.
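To make the core idea concrete, here is a minimal sketch of what a non-parametric von Mises-Fisher entropy estimator on the hypersphere could look like: a leave-one-out kernel density estimate of Shannon differential entropy over L2-normalized embeddings, where higher entropy corresponds to more uniformly spread representations. The function names, the concentration parameter `kappa`, and the leave-one-out choice are illustrative assumptions; the actual HyDeS objective may differ.

```python
# Hypothetical sketch of a non-parametric vMF entropy estimator,
# illustrating the idea in the abstract; not the authors' implementation.
import numpy as np
from scipy.special import ive, logsumexp  # ive: exponentially scaled Bessel I

def log_vmf_normalizer(d, kappa):
    """log C_d(kappa) for the vMF density on the unit sphere S^{d-1}."""
    nu = d / 2.0 - 1.0
    # log I_nu(kappa) = log(ive(nu, kappa)) + kappa  (overflow-safe)
    return (nu * np.log(kappa)
            - (d / 2.0) * np.log(2 * np.pi)
            - (np.log(ive(nu, kappa)) + kappa))

def vmf_entropy_estimate(z, kappa=10.0):
    """Leave-one-out estimate of the Shannon differential entropy of
    embeddings z (shape [N, d]) under a vMF kernel density estimator.
    Embeddings are L2-normalized onto the hypersphere first."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    n, d = z.shape
    log_c = log_vmf_normalizer(d, kappa)
    sim = kappa * (z @ z.T)             # pairwise kernel log-weights
    np.fill_diagonal(sim, -np.inf)      # exclude each point from its own density
    log_density = log_c + logsumexp(sim, axis=1) - np.log(n - 1)
    return -log_density.mean()          # higher = embeddings more spread out
```

Maximizing such an entropy term pushes embeddings toward uniformity on the sphere, which, combined with alignment between augmented views, is one standard route to multi-view mutual information maximization.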

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.