ArXiv TLDR

Balanced Co-Clustering of Users and Items for Embedding Table Compression in Recommender Systems

arXiv: 2604.18351

Runhao Jiang, Renchi Yang, Donghao Wu

cs.IR · cs.LG

TLDR

BACO compresses recommender system embedding tables by over 75% with at most a 1.85% drop in recall, using balanced co-clustering, while running up to 346X faster than the strongest baselines.

Key contributions

  • Introduces BACO, a novel framework for compressing recommender system embedding tables.
  • Uses balanced co-clustering to group similar users/items, sharing embeddings for efficiency.
  • Achieves over 75% parameter reduction with at most a 1.85% recall drop, while being up to 346X faster than the strongest baselines.
  • Employs a principled weighting scheme and efficient label propagation to prevent codebook collapse.
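The core idea above — mapping many users/items to a small set of shared codebook rows — can be sketched as follows. All names and sizes here are illustrative, not BACO's actual API, and the cluster assignment is random rather than produced by balanced co-clustering:

```python
import numpy as np

# Illustrative sizes: 1,000,000 users compressed into 4,096 shared embeddings.
num_users, num_clusters, dim = 1_000_000, 4_096, 64

rng = np.random.default_rng(0)
# Assignment produced by (balanced) co-clustering: user id -> cluster id.
# Here it is random, purely for illustration.
user_to_cluster = rng.integers(0, num_clusters, size=num_users)
# The small codebook replaces the full per-user embedding table.
codebook = rng.standard_normal((num_clusters, dim)).astype(np.float32)

def embed(user_ids):
    """Look up shared embeddings for a batch of user ids."""
    return codebook[user_to_cluster[user_ids]]

batch = embed(np.array([3, 42, 999_999]))

full_params = num_users * dim
compressed_params = num_clusters * dim
print(f"float-parameter reduction: {1 - compressed_params / full_params:.2%}")
```

Note that the assignment vector still stores one integer per user, but the float parameters — which dominate memory and training cost — shrink to the codebook size.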

Why it matters

Recommender systems rely on large embedding tables, which are expensive to train and serve. BACO offers a practical way to deploy these systems under resource constraints: it significantly reduces memory and computation while maintaining high recommendation accuracy, outperforming existing compression methods.

Original Abstract

Recommender systems have advanced markedly over the past decade by transforming each user/item into a dense embedding vector with deep learning models. At industrial scale, embedding tables constituted by such vectors of all users/items demand a vast amount of parameters and impose heavy compute and memory overhead during training and inference, hindering model deployment under resource constraints. Existing solutions towards embedding compression either suffer from severely compromised recommendation accuracy or incur considerable computational costs. To mitigate these issues, this paper presents BACO, a fast and effective framework for compressing embedding tables. Unlike traditional ID hashing, BACO is built on the idea of exploiting collaborative signals in user-item interactions for user and item groupings, such that similar users/items share the same embeddings in the codebook. Specifically, we formulate a balanced co-clustering objective that maximizes intra-cluster connectivity while enforcing cluster-volume balance, and unify canonical graph clustering techniques into the framework through rigorous theoretical analyses. To produce effective groupings while averting codebook collapse, BACO instantiates this framework with a principled weighting scheme for users and items, an efficient label propagation solver, as well as secondary user clusters. Our extensive experiments comparing BACO against full models and 18 baselines over benchmark datasets demonstrate that BACO cuts embedding parameters by over 75% with a drop of at most 1.85% in recall, while surpassing the strongest baselines by being up to 346X faster.
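The label-propagation idea mentioned in the abstract can be illustrated with a minimal, unweighted variant on a toy bipartite interaction graph. This is only a sketch: BACO's actual solver adds a principled weighting scheme, balance constraints, and secondary user clusters that are not shown here, and the data is hypothetical:

```python
from collections import Counter

# Toy bipartite interactions: user -> set of items (hypothetical data).
interactions = {
    "u1": {"i1", "i2"}, "u2": {"i1", "i2"},
    "u3": {"i3", "i4"}, "u4": {"i3", "i4"},
}
items = {i for its in interactions.values() for i in its}
item_users = {i: {u for u, its in interactions.items() if i in its} for i in items}

# Initialize each user with its own label; items start unlabeled.
user_label = {u: k for k, u in enumerate(interactions)}

for _ in range(5):  # a few sweeps suffice on this toy graph
    # Items adopt the majority label among their interacting users.
    item_label = {i: Counter(user_label[u] for u in us).most_common(1)[0][0]
                  for i, us in item_users.items()}
    # Users adopt the majority label among their interacted items.
    user_label = {u: Counter(item_label[i] for i in its).most_common(1)[0][0]
                  for u, its in interactions.items()}

print(user_label)  # u1/u2 converge to one cluster, u3/u4 to another
```

On this graph, users with identical interaction patterns end up in the same cluster, so they can share one codebook row; the balance constraints in the real objective prevent all users from collapsing into a single label.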

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.