A Finite Time Analysis of Thompson Sampling for Bayesian Optimization with Preferential Feedback

arXiv:2604.25025

Joseph Lazzaro, Davide Buffelli, Da-shan Shiu, Sattar Vakili

stat.ML, cs.LG

TLDR

This paper proposes a Thompson Sampling (TS) method for Bayesian optimization with pairwise preferential feedback and gives a finite-time analysis showing that its performance matches that of standard TS with scalar feedback.

Key contributions

  • Proposes a Thompson Sampling approach for Bayesian optimization using pairwise preferential feedback.
  • Models comparisons with a monotone link on latent utility differences and leverages the dueling kernel induced by a base kernel.
  • Provides a finite-time analysis showing that performance matches that of standard TS for conventional Bayesian optimization with scalar feedback.
  • Exploits the anchor invariance of TS for challenger selection and introduces a double-TS pairing variant.
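
To make the modeling bullet concrete, here is a minimal Python sketch of a dueling kernel and a link function. It is an illustration under stated assumptions, not the paper's implementation: the squared-exponential base kernel, the logistic link, and the names `rbf`, `dueling_kernel`, and `preference_prob` are all chosen here for the example. The dueling kernel shown is the covariance of the utility differences f(x) - f(x') and f(y) - f(y') when f is a zero-mean Gaussian process with the base kernel.

```python
import math

def rbf(x, y, lengthscale=1.0):
    """Squared-exponential base kernel on real vectors (an assumed choice)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq / lengthscale ** 2)

def dueling_kernel(pair_a, pair_b, base=rbf):
    """Covariance of f(x) - f(x') and f(y) - f(y') for f ~ GP(0, base)."""
    (x, xp), (y, yp) = pair_a, pair_b
    return base(x, y) - base(x, yp) - base(xp, y) + base(xp, yp)

def preference_prob(u_x, u_xp):
    """P(x preferred over x') under a logistic link (an assumed monotone link)."""
    return 1.0 / (1.0 + math.exp(-(u_x - u_xp)))
```

Swapping the two points inside one pair flips the sign of the kernel value, which mirrors the fact that reversing a comparison negates the utility difference.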

Why it matters

Pairwise preference feedback is increasingly used in human-, laboratory-, and expert-in-the-loop design and in scientific discovery. This work gives Thompson Sampling a finite-time analysis in that setting, showing that its performance matches that of standard TS with scalar feedback even though no direct scalar scores are observed.

Original Abstract

Preference feedback, in the form of pairwise comparisons rather than scalar scores, has seen increasing use in applications such as human-, laboratory-, and expert-in-the-loop design, as well as scientific discovery. We propose a Thompson Sampling (TS) approach to Bayesian optimization with preferential feedback that models comparisons using a monotone link on latent utility differences and leverages the dueling kernel induced by a base kernel. We provide a finite-time analysis showing that the performance of the proposed method matches that of standard TS for conventional Bayesian optimization with scalar feedback. The analysis exploits the anchor invariance of TS for challenger selection and introduces a double-TS pairing variant. We also demonstrate the performance of the method on both synthetic and real-world examples.
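
The abstract does not spell out the pairing rule, so the following Python sketch is only one plausible reading of "double-TS pairing" and "anchor invariance", written for a finite candidate set. Everything here is an assumption made for illustration: the independent Gaussian posterior over utilities and the hypothetical helpers `sample_utilities`, `double_ts_pair`, and `challenger`. The idea sketched is that each member of a duel is the maximizer of its own independent posterior sample, and that maximizing a sampled utility difference against a fixed anchor picks the same point whichever anchor is used.

```python
import random

def sample_utilities(mean, std, rng):
    """Draw one sample of latent utilities (independent Gaussians: an assumption)."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]

def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)

def double_ts_pair(mean, std, rng):
    """Pick a duel from the maximizers of two independent posterior samples."""
    first = argmax(sample_utilities(mean, std, rng))
    second = argmax(sample_utilities(mean, std, rng))
    return first, second

def challenger(sample, anchor):
    """Maximize the sampled utility difference against a fixed anchor index."""
    return argmax([u - sample[anchor] for u in sample])

rng = random.Random(0)
pair = double_ts_pair([0.1, 0.5, 0.3], [0.2, 0.2, 0.2], rng)
```

Subtracting the anchor's sampled utility shifts every candidate by the same constant, so `challenger` returns the same index for any anchor; this is the sense of invariance illustrated here.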
