A unified perspective on fine-tuning and sampling with diffusion and flow models
arXiv stat.ML / 5/4/2026
Key Points
- The paper studies how to train diffusion and flow generative models to sample from target distributions formed via exponential tilting of a base density, covering both sampling from unnormalized targets and reward fine-tuning of pretrained models.
- It proposes a unified framework connecting two viewpoints: stochastic optimal control (SOC), approached via adjoint and score-matching methods, and non-equilibrium thermodynamics.
- The authors provide bias–variance decompositions showing that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching can have non-finite (unbounded) gradient variance.
- They also derive norm bounds for the lean adjoint ODE that support the theoretical effectiveness of adjoint-based methods, and extend the CMCD/NETS losses as well as the Crooks and Jarzynski identities to the exponential-tilting setting.
- Experimental validation is provided via reward fine-tuning on Stable Diffusion 1.5 and Stable Diffusion 3, demonstrating the practical relevance of the theoretical results.
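To make the central object concrete: exponential tilting reweights a base density by an exponentiated reward, which is the standard form such targets take in reward fine-tuning. A minimal sketch of the target (notation is ours, not the paper's: $p_{\text{base}}$ the pretrained model's density, $r$ a reward function, $\lambda > 0$ a temperature):

```latex
% Exponentially tilted target distribution (illustrative notation):
% the base density is reweighted by the exponentiated reward.
p^{*}(x) \;=\; \frac{1}{Z}\, p_{\text{base}}(x)\, \exp\!\big(r(x)/\lambda\big),
\qquad
Z \;=\; \int p_{\text{base}}(x)\, \exp\!\big(r(x)/\lambda\big)\, dx .
```

Sampling from an unnormalized target corresponds to the case where only the product $p_{\text{base}}(x)\exp(r(x)/\lambda)$ is known and the normalizing constant $Z$ is intractable; reward fine-tuning corresponds to starting from a pretrained $p_{\text{base}}$ and tilting it toward high-reward samples.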