Regret Analysis of Guided Diffusion for Black-Box Optimization over Structured Inputs
Masaki Adachi, Anita Yang, Yakun Wang, Song Liu
TLDR
This paper introduces a novel regret analysis framework for guided-diffusion black-box optimization, explaining its strong performance on structured inputs.
Key contributions
- Introduces the first certificate-based expected simple-regret framework for guided-diffusion black-box optimization (BO).
- Avoids traditional BO regret-analysis assumptions such as maximum-information-gain bounds, RKHS assumptions, and exact acquisition maximization.
- Defines 'mass lift', the increase in probability mass assigned to near-optimal designs relative to the pretrained generator, and uses it to explain both exponential-looking finite-budget convergence and polynomial acceleration in diffusion-BO.
- Offers practical diagnostics for estimating search exponents from finite candidate pools, plus a proposal-corrected resampling construction that yields a fully certified sampler.
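The "mass lift" idea above can be illustrated with a toy Monte Carlo estimate. This is a minimal sketch, not the paper's estimator: it assumes a simple 1-D objective and two Gaussian samplers standing in for the pretrained generator and the guided sampler, and compares the fraction of samples each places in a near-optimal set.

```python
import numpy as np

rng = np.random.default_rng(0)

def near_optimal_mass(samples, objective, threshold):
    """Fraction of samples whose objective value reaches the threshold."""
    return np.mean(objective(samples) >= threshold)

# Toy stand-ins (assumed, for illustration only): a 1-D objective with
# its optimum at x = 2, a "pretrained" prior centered at 0, and a
# "guided" sampler whose mass has been shifted toward the optimum.
objective = lambda x: -np.abs(x - 2.0)
prior = rng.normal(0.0, 1.0, size=100_000)
guided = rng.normal(1.5, 1.0, size=100_000)

threshold = -0.5  # defines the near-optimal set: |x - 2| <= 0.5

# Mass lift: probability of the near-optimal set under the guided
# sampler relative to the same probability under the pretrained prior.
lift = (near_optimal_mass(guided, objective, threshold)
        / near_optimal_mass(prior, objective, threshold))
print(f"estimated mass lift: {lift:.2f}")
```

A lift greater than 1 means guidance concentrates probability on near-optimal designs faster than the prior alone, which is the mechanism the paper's regret bounds are built on.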
Why it matters
This paper fills a critical gap: guided-diffusion black-box optimization has shown strong empirical results on structured designs, yet its theoretical performance has been poorly understood. The proposed regret framework aligns with how modern diffusion-BO pipelines actually work, explaining their efficiency and bridging theory and practice for more robust structured design optimization.
Original Abstract
Guided-diffusion black-box optimization (BO) has shown strong empirical performance on structured design problems such as molecules and crystals, but its regret behavior remains poorly understood. Existing BO regret analyses typically rely on maximum information gain, non-pretrained surrogate models, or exact acquisition maximization -- assumptions that break down in modern diffusion-BO pipelines, where pretrained diffusion models serve as powerful priors over valid structures and acquisition maximization is replaced by approximate sampling over astronomically large discrete spaces. We develop a first certificate-based expected simple-regret framework for guided-diffusion BO that avoids maximum-information-gain bounds, RKHS assumptions, and exact acquisition maximization. The central quantity in our analysis is mass lift: the increase in probability mass assigned to near-optimal designs relative to the pretrained generator. This view explains how exponential-looking finite-budget convergence and polynomial acceleration can all arise from the same mechanism. We also give practical diagnostics for estimating search exponents from finite candidate pools and a proposal-corrected resampling construction that provides a fully certified sampler instance.