Joint Representation Learning and Clustering via Gradient-Based Manifold Optimization
Sida Liu, Yangzi Guo, Mingyuan Wang
TLDR
A new Manifold Learning Framework simultaneously learns data representations and clusters high-dimensional data using gradient-based optimization.
Key contributions
- Introduces a Manifold Learning Framework for simultaneously learning data representations and clusters.
- Employs Gradient Manifold Optimization to jointly optimize dimension reduction parameters and cluster assignments.
- Exemplified with Gaussian Mixture Models, outperforming popular clustering algorithms on benchmark datasets.
Why it matters
Clustering high-dimensional data is a long-standing challenge due to the curse of dimensionality. This paper offers a novel framework that jointly learns data representations and clusters, providing a more robust and effective solution. It improves performance over existing methods, making it valuable for various machine learning applications.
Original Abstract
Clustering and dimensionality reduction have been crucial topics in machine learning and computer vision. Clustering high-dimensional data has been challenging for a long time due to the curse of dimensionality. For that reason, a more promising direction is the joint learning of dimension reduction and clustering. In this work, we propose a Manifold Learning Framework that learns dimensionality reduction and clustering simultaneously. The proposed framework is able to jointly learn the parameters of a dimension reduction technique (e.g. linear projection or a neural network) and cluster the data based on the resulting features (e.g. under a Gaussian Mixture Model framework). The framework searches for the dimension reduction parameters and the optimal clusters by traversing a manifold,using Gradient Manifold Optimization. The obtained The proposed framework is exemplified with a Gaussian Mixture Model as one simple but efficient example, in a process that is somehow similar to unsupervised Linear Discriminant Analysis (LDA). We apply the proposed method to the unsupervised training of simulated data as well as a benchmark image dataset (i.e. MNIST). The experimental results indicate that our algorithm has better performance than popular clustering algorithms from the literature.
📬 Weekly AI Paper Digest
Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.