A unified perspective on fine-tuning and sampling with diffusion and flow models
Carles Domingo-Enrich, Yuanqi Du, Michael S. Albergo
TLDR
This paper unifies fine-tuning and sampling for diffusion/flow models under a single exponential-tilting formulation, analyzes the gradient variance of competing training losses, and gives theoretical support for adjoint-based methods.
Key contributions
- Bias-variance decompositions show that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching do not.
- Derives norm bounds on the lean adjoint ODE that theoretically support the effectiveness of adjoint-based methods.
- Adapts the CMCD and NETS loss functions to the exponential tilting setting and derives novel Crooks and Jarzynski identities for it.
Why it matters
This paper offers a unified theoretical framework for fine-tuning and sampling in diffusion models, which is central to understanding and improving methods such as reward fine-tuning. Its analysis of gradient variance and adjoint-based methods points toward more stable and efficient training.
Original Abstract
We study the problem of training diffusion and flow generative models to sample from target distributions defined by an exponential tilting of a base density; a formulation that subsumes both sampling from unnormalized densities and reward fine-tuning of pre-trained models. This problem can be approached from a stochastic optimal control (SOC) perspective, using adjoint-based or score matching methods, or from a non-equilibrium thermodynamics perspective. We provide a unified framework encompassing these approaches and make three main contributions: (i) bias-variance decompositions revealing that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching do not; (ii) norm bounds on the lean adjoint ODE that theoretically support the effectiveness of adjoint-based methods; and (iii) adaptations of the CMCD and NETS loss functions, along with novel Crooks and Jarzynski identities, to the exponential tilting setting. We validate our analysis with reward fine-tuning experiments on Stable Diffusion 1.5 and 3.
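The exponential tilting in the abstract defines a target density p*(x) ∝ p_base(x) · exp(r(x)/λ), where r is a reward and λ a temperature. As a minimal illustration (not the paper's training method; the function names and Gaussian setup below are purely illustrative), expectations under the tilted target can be estimated from base-model samples by self-normalized importance sampling:

```python
import numpy as np

def tilted_expectation(samples, reward_fn, lam=1.0):
    """Estimate E_{p*}[x] where p*(x) ∝ p_base(x) · exp(r(x)/lam),
    using self-normalized importance weights w_i ∝ exp(r(x_i)/lam)."""
    r = np.array([reward_fn(x) for x in samples])
    logw = r / lam
    logw -= logw.max()            # stabilize before exponentiating
    w = np.exp(logw)
    w /= w.sum()                  # self-normalize the weights
    return np.sum(w[:, None] * samples, axis=0)

rng = np.random.default_rng(0)
base_samples = rng.normal(size=(10_000, 1))   # stand-in for base-model samples
mean_est = tilted_expectation(base_samples, lambda x: float(x[0]), lam=1.0)
# Tilting N(0, 1) by exp(x) yields N(1, 1), so the estimate is near 1.
```

Importance weighting of this kind scales poorly when the tilted target is far from the base density, which is one motivation for the training-based approaches (adjoint matching, score matching, CMCD/NETS) compared in the paper.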