Bringing Clustering to MLL: Weakly-Supervised Clustering for Partial Multi-Label Learning

April 10, 2026 | arXiv:2604.09359

Yu Chen, Weijun Lv, Yue Huang, Xuhuan Zhu, Fang Li

cs.LG

TLDR

WSC-PML is a weakly-supervised clustering method for partial multi-label learning that handles label noise by decomposing the clustering membership matrix.

Key contributions

Proposes WSC-PML, a weakly-supervised clustering approach to address label noise in partial multi-label learning.
Introduces a novel membership matrix decomposition (A = Π ⊙ F) to bridge clustering and multi-label assignments.
Integrates unsupervised clustering with multi-label supervision for robust noise handling in PML.
Employs a three-stage process: prototype learning, adaptive weak supervision, and iterative joint optimization.
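The element-wise decomposition A = Π ⊙ F listed above can be illustrated with a small NumPy sketch. This is only a toy illustration under stated assumptions, not the authors' implementation: it assumes one cluster per label, builds Π from random scores with a row-wise softmax so that each row sums to one, and uses a hand-written binary matrix F as a stand-in for the multi-label component. The names `softmax_rows`, `Pi`, `F`, and `A`, and all numeric values, are hypothetical.

```python
import numpy as np


def softmax_rows(scores):
    """Row-wise softmax: every row of the result is non-negative and sums to one."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
n_instances, n_labels = 4, 3

# Pi: soft memberships that satisfy the clustering constraint (rows sum to one).
# Assumption: random scores stand in for whatever the method actually learns.
Pi = softmax_rows(rng.normal(size=(n_instances, n_labels)))

# F: binary multi-label indicator (toy values, assumed for illustration).
# Each row may contain any number of ones.
F = np.array(
    [
        [1, 0, 1],
        [1, 1, 0],
        [0, 1, 0],
        [1, 1, 1],
    ],
    dtype=float,
)

# A = Pi ⊙ F: the Hadamard (element-wise) product of the two components.
A = Pi * F

print("row sums of Pi:", Pi.sum(axis=1))
print("row sums of A: ", A.sum(axis=1))
```

In this toy setup the rows of `Pi` always sum to one, while the rows of `A` generally do not, because every entry where `F` is zero is zeroed out in the product.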

Why it matters

This paper tackles label noise in partial multi-label learning, a setting where traditional clustering cannot be applied directly: cluster memberships sum to one per instance, whereas multi-label assignments are binary and can sum to any number. WSC-PML bridges the two by decomposing the clustering membership matrix into a component that keeps the clustering constraints and a component that keeps the multi-label structure. The authors report that, on 24 datasets, it outperforms six state-of-the-art methods across all evaluation metrics.

Original Abstract

Label noise in multi-label learning (MLL) poses significant challenges for model training, particularly in partial multi-label learning (PML) where candidate labels contain both relevant and irrelevant labels. While clustering offers a natural approach to exploit data structure for noise identification, traditional clustering methods cannot be directly applied to multi-label scenarios due to a fundamental incompatibility: clustering produces membership values that sum to one per instance, whereas multi-label assignments require binary values that can sum to any number. We propose a novel weakly-supervised clustering approach for PML (WSC-PML) that bridges clustering and multi-label learning through membership matrix decomposition. Our key innovation decomposes the clustering membership matrix $\mathbf{A}$ into two components: $\mathbf{A} = \mathbf{\Pi} \odot \mathbf{F}$, where $\mathbf{\Pi}$ maintains clustering constraints while $\mathbf{F}$ preserves multi-label characteristics. This decomposition enables seamless integration of unsupervised clustering with multi-label supervision for effective label noise handling. WSC-PML employs a three-stage process: initial prototype learning from noisy labels, adaptive confidence-based weak supervision construction, and joint optimization via iterative clustering refinement. Extensive experiments on 24 datasets demonstrate that our approach outperforms six state-of-the-art methods across all evaluation metrics.
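The incompatibility described in the abstract can be written out as two sets of row constraints. The notation here is an assumption introduced for illustration and may differ from the paper's: $a_{ik}$ is the membership of instance $i$ in cluster $k$, $y_{ik}$ is the indicator that label $k$ applies to instance $i$, and $K$ is the number of clusters, taken equal to the number of labels.

```latex
\begin{aligned}
\text{clustering:}  \quad & a_{ik} \in [0, 1],   & \sum_{k=1}^{K} a_{ik} &= 1, \\
\text{multi-label:} \quad & y_{ik} \in \{0, 1\}, & \sum_{k=1}^{K} y_{ik} &\in \{0, 1, \dots, K\}.
\end{aligned}
```

A clustering row is a point on the probability simplex, while a multi-label row is an arbitrary binary vector, so neither set of constraints can simply be imposed on the other.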
