ArXiv TLDR

Spectral bandits

2604.25272

Tomáš Kocák, Rémi Munos, Branislav Kveton, Shipra Agrawal, Michal Valko

stat.ML · cs.AI · cs.LG

TLDR

This paper introduces "spectral bandits," an online learning framework for graph-based problems such as content recommendation, in which arm payoffs are smooth on a graph and regret scales with a small "effective dimension" rather than with the number of nodes.

Key contributions

  • Studies a bandit problem in which arm payoffs are a smooth function on an underlying graph.
  • Introduces an 'effective dimension' for graph-based bandit problems, which is small in real-world graphs.
  • Proposes three algorithms whose regret scales linearly or sublinearly with the effective dimension.
  • Demonstrates, on content recommendation, that user preferences can be learned efficiently from only a few node evaluations.
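To make the effective dimension concrete, here is a minimal Python sketch, assuming the definition commonly used in the spectral bandits literature: the largest d such that (d − 1)·λ_d ≤ T / log(1 + T/λ), where λ_1 ≤ … ≤ λ_n are the eigenvalues of the graph Laplacian, T is the horizon, and λ is a regularization parameter. The path graph and all constants below are illustrative choices, not from the paper.

```python
import numpy as np

def path_laplacian(n):
    """Combinatorial Laplacian L = D - A of an undirected path graph on n nodes."""
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def effective_dimension(eigvals, T, lam=1.0):
    """Largest d with (d - 1) * lambda_d <= T / log(1 + T / lam).

    eigvals must be sorted in increasing order (as np.linalg.eigvalsh returns);
    eigvals[d - 1] is the d-th smallest eigenvalue lambda_d.
    """
    budget = T / np.log(1.0 + T / lam)
    d = 1
    for k in range(2, len(eigvals) + 1):
        # (k - 1) * lambda_k is nondecreasing in k, so the last k that
        # satisfies the inequality is the effective dimension.
        if (k - 1) * eigvals[k - 1] <= budget:
            d = k
    return d

n, T = 500, 100
eigvals = np.linalg.eigvalsh(path_laplacian(n))  # ascending eigenvalues
d_eff = effective_dimension(eigvals, T)
print(d_eff, n)
```

For this 500-node path graph, d_eff comes out far below n, which is exactly the gap the regret bounds exploit: guarantees depend on d_eff, not on the number of nodes.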

Why it matters

This work addresses online learning challenges on graphs, crucial for applications like content recommendation. By introducing the effective dimension and efficient algorithms, it enables scalable solutions for problems where traditional methods struggle with large graphs. The ability to learn preferences from minimal data is a significant practical advantage.

Original Abstract

Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this work, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as content-based recommendation. In this problem, each item we can recommend is a node of an undirected graph and its expected rating is similar to the one of its neighbors. The goal is to recommend items that have high expected ratings. We aim for the algorithms where the cumulative regret with respect to the optimal policy would not scale poorly with the number of nodes. In particular, we introduce the notion of an effective dimension, which is small in real-world graphs, and propose three algorithms for solving our problem that scale linearly and sublinearly in this dimension. Our experiments on content recommendation problem show that a good estimator of user preferences for thousands of items can be learned from just tens of node evaluations.
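To illustrate how smoothness on the graph can be exploited, below is a hedged sketch in the spirit of a spectral UCB algorithm: a LinUCB-style learner whose arm features are rows of the Laplacian eigenbasis and whose regularizer penalizes high-frequency (non-smooth) components. The toy path graph, the noise level, and the constant exploration weight `alpha` are all assumptions for illustration; the paper's actual confidence widths are more refined.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: a path on n nodes, with a smooth expected payoff over positions.
n, T, lam = 50, 200, 1.0  # assumed toy sizes
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
eigvals, Q = np.linalg.eigh(L)           # Laplacian eigenbasis (columns of Q)
mu = np.sin(np.linspace(0.0, np.pi, n))  # smooth expected payoffs on the path

# LinUCB-style estimation in the spectral basis, regularized by the
# eigenvalues so that high-frequency components are shrunk toward zero.
V = np.diag(eigvals + lam)  # regularizer Lambda + lam * I
b = np.zeros(n)
alpha = 1.0                 # exploration weight (assumed constant)
rewards = []
for t in range(T):
    Vinv = np.linalg.inv(V)
    theta = Vinv @ b                       # ridge estimate in spectral coords
    widths = np.sqrt(np.einsum('ij,jk,ik->i', Q, Vinv, Q))  # x_i^T Vinv x_i
    ucb = Q @ theta + alpha * widths
    arm = int(np.argmax(ucb))
    x = Q[arm]                             # arm's features: its eigenbasis row
    r = mu[arm] + 0.1 * rng.standard_normal()
    V += np.outer(x, x)                    # rank-one design update
    b += r * x
    rewards.append(r)
```

Because the true payoff vector is smooth, its spectral coefficients concentrate on the low-eigenvalue coordinates, so the learner effectively estimates only about d_eff parameters instead of n.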
