A unified perspective on fine-tuning and sampling with diffusion and flow models

arXiv stat.ML / 5/4/2026


Key Points

  • The paper studies how to train diffusion and flow generative models to sample from target distributions formed by exponentially tilting a base density (the formulation is written out after this list), covering both sampling from unnormalized targets and reward fine-tuning of pretrained models.
  • It proposes a unified framework connecting two viewpoints: stochastic optimal control (SOC), approached via adjoint-based or score-matching methods, and non-equilibrium thermodynamics.
  • The authors provide bias–variance decompositions showing that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching do not.
  • They also derive norm bounds on the lean adjoint ODE that theoretically support the effectiveness of adjoint-based methods, and they adapt the CMCD and NETS loss functions, together with novel Crooks and Jarzynski identities (classical forms are shown below), to the exponential-tilting setting.
  • Experimental validation is provided via reward fine-tuning on Stable Diffusion 1.5 and Stable Diffusion 3, demonstrating the practical relevance of the theoretical results.
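
For concreteness, the exponential-tilting target referenced above is standardly written as follows; the notation here ($p_{\mathrm{base}}$ for the base density, $r$ for the reward, $\lambda > 0$ for a temperature) is a common convention and may differ from the paper's:

$$
p^{*}(x) \;=\; \frac{1}{Z}\, p_{\mathrm{base}}(x)\, \exp\!\left(\frac{r(x)}{\lambda}\right),
\qquad
Z \;=\; \int p_{\mathrm{base}}(x)\, \exp\!\left(\frac{r(x)}{\lambda}\right) dx .
$$

Taking $p_{\mathrm{base}}$ to be a pretrained diffusion or flow model and $r$ a reward model gives the fine-tuning problem; taking $r(x) = \log \tilde{p}(x) - \log p_{\mathrm{base}}(x)$ for an unnormalized density $\tilde{p}$ recovers sampling from $\tilde{p}$.

The Crooks and Jarzynski identities that the paper extends read, in their classical non-equilibrium forms (with work $W$, free-energy difference $\Delta F$, and inverse temperature $\beta$):

$$
\mathbb{E}\!\left[e^{-\beta W}\right] \;=\; e^{-\beta \Delta F}
\quad \text{(Jarzynski)},
\qquad
\frac{p_{F}(W)}{p_{R}(-W)} \;=\; e^{\beta (W - \Delta F)}
\quad \text{(Crooks)}.
$$

The tilted-setting versions are stated in the paper itself; the displays above are only the standard identities they generalize.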

Abstract

We study the problem of training diffusion and flow generative models to sample from target distributions defined by an exponential tilting of a base density, a formulation that subsumes both sampling from unnormalized densities and reward fine-tuning of pre-trained models. This problem can be approached from a stochastic optimal control (SOC) perspective, using adjoint-based or score matching methods, or from a non-equilibrium thermodynamics perspective. We provide a unified framework encompassing these approaches and make three main contributions: (i) bias-variance decompositions revealing that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching do not; (ii) norm bounds on the lean adjoint ODE that theoretically support the effectiveness of adjoint-based methods; and (iii) adaptations of the CMCD and NETS loss functions, along with novel Crooks and Jarzynski identities, to the exponential tilting setting. We validate our analysis with reward fine-tuning experiments on Stable Diffusion 1.5 and 3.
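
As a minimal numerical illustration of what "sampling from the tilted target" means (not the paper's training method), the following self-contained Python sketch estimates a mean under $p^{*}$ by self-normalized importance sampling from the base model. The Gaussian base and quadratic reward are hypothetical choices for which the tilted target has a closed form:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (not from the paper): base density N(0, 1),
# reward r(x) = -(x - 2)^2, temperature lambda = 1. The tilted target
# p*(x) ∝ p_base(x) exp(r(x)) is then the Gaussian N(4/3, 1/3).
def reward(x):
    return -(x - 2.0) ** 2

# Sample from the base model, then weight by exp(r(x)): self-normalized
# importance weights give expectations under the tilted target.
x = rng.standard_normal(200_000)
log_w = reward(x)
w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
w /= w.sum()

print("SNIS estimate of E[x] under the tilted target:", np.sum(w * x))
print("Closed-form tilted mean:", 4.0 / 3.0)
```

The methods studied in the paper instead train the model so that its samples follow $p^{*}$ directly, avoiding reweighting; the sketch only grounds the target that those methods aim at.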