ArXiv TLDR

Bandits on graphs and structures

2605.03493

Michal Valko

cs.LG, stat.ML

TLDR

This thesis explores graph and structured bandit problems, addressing practical challenges in sequential decision-making with large action spaces.

Key contributions

  • Investigates graph bandits, covering spectral bandits, side observations, and influence maximization.
  • Studies large action spaces, including kernel, polymatroid, and function optimization bandits.
  • Addresses harder settings such as infinitely many-armed bandits and function optimization when the smoothness of the function is unknown.

Why it matters

This thesis surveys the author's work on exploiting structure in sequential decision problems, such as graphs over actions, to bring bandit algorithms closer to practical use. It targets settings where the action space is exponentially large in the number of base actions, or even infinite.

Original Abstract

The goal of this thesis is to investigate the structural properties of certain sequential problems in order to bring the solutions closer to a practical use. In the first part, we put a special emphasis on structures that can be represented as graphs on actions. In the second part, we study the large action spaces that can be of exponential size in the number of base actions or even infinite. For graph bandits, we consider the settings of smoothness of rewards (spectral bandits), side observations, and influence maximization. For large structured domains, we cover kernel bandits, polymatroid bandits, bandits for function optimization (including unknown smoothness), and infinitely many-arms bandits. The thesis aspires to be a survey of the author's contributions on graph and structured bandits.
