Neural Galerkin Normalizing Flow for Transition Probability Density Functions of Diffusion Models

arXiv cs.LG / 3/20/2026

📰 News · Models & Research

Key Points

  • The paper introduces a Neural Galerkin Normalizing Flow framework to approximate the transition probability density function of a diffusion process by solving the Fokker-Planck equation with an atomic initial distribution, parameterized by the initial mass location.
  • Normalizing Flows are used to express the solution as a transformation of the transition density of a reference stochastic process, ensuring positivity and mass conservation constraints.
  • The approach extends Neural Galerkin methods to Normalizing Flows and derives an ordinary differential equation (ODE) system for the time evolution of the flow parameters.
  • Adaptive sampling concentrates evaluation of the Fokker-Planck residual in informative regions, which is essential for high-dimensional PDEs, enabling accurate capture of key solution features and of the causal relation between initial data and future densities.
  • After offline training, online evaluation becomes significantly cheaper than solving the PDE from scratch, positioning the method as a promising surrogate for many-query problems like Bayesian inference, simulation, and diffusion-bridge generation.
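The structure-preservation claim in the second bullet rests on the change-of-variables formula: a density expressed as the pushforward of a reference density through an invertible map is automatically nonnegative and integrates to one. The snippet below is a minimal sketch of that mechanism only, not the paper's method; the 1-D affine map and Gaussian reference density are illustrative assumptions standing in for the learned flow and the reference process.

```python
import numpy as np

def reference_density(z):
    """Standard normal reference density (stand-in for the reference process)."""
    return np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)

def pushforward_density(x, scale, shift):
    """Density of T(Z) for the invertible map T(z) = scale * z + shift,
    via the change-of-variables formula p(x) = p_ref(T^{-1}(x)) |dT^{-1}/dx|."""
    z = (x - shift) / scale   # T^{-1}(x)
    jac = 1.0 / scale         # |d T^{-1} / dx|
    return reference_density(z) * abs(jac)

# Positivity and unit mass hold by construction, with no extra constraints.
x = np.linspace(-10.0, 10.0, 20001)
p = pushforward_density(x, scale=2.0, shift=1.0)
mass = np.sum(p) * (x[1] - x[0])   # Riemann-sum approximation of the integral
print(f"min density: {p.min():.3e}, total mass: {mass:.6f}")
```

A learned flow replaces the affine map with a deep invertible network, but the positivity and mass-conservation argument is identical.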

Abstract

We propose a new Neural Galerkin Normalizing Flow framework to approximate the transition probability density function of a diffusion process by solving the corresponding Fokker-Planck equation with an atomic initial distribution, parametrically with respect to the location of the initial mass. By using Normalizing Flows, we look for the solution as a transformation of the transition probability density function of a reference stochastic process, ensuring that our approximation is structure-preserving and automatically satisfies positivity and mass conservation constraints. By extending Neural Galerkin schemes to the context of Normalizing Flows, we derive a system of ODEs for the time evolution of the Normalizing Flow's parameters. Adaptive sampling routines are used to evaluate the Fokker-Planck residual in meaningful locations, which is of vital importance to address high-dimensional PDEs. Numerical results show that this strategy captures key features of the true solution and enforces the causal relationship between the initial datum and the density function at subsequent times. After completing an offline training phase, online evaluation becomes significantly more cost-effective than solving the PDE from scratch. The proposed method serves as a promising surrogate model, which could be deployed in many-query problems associated with stochastic differential equations, like Bayesian inference, simulation, and diffusion bridge generation.
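The Neural Galerkin idea described in the abstract — projecting the Fokker-Planck dynamics onto the tangent space of a parametric family to obtain an ODE for the parameters — can be sketched in a toy setting. Everything below is an illustrative assumption rather than the paper's implementation: a 1-D Ornstein-Uhlenbeck process, a Gaussian ansatz in place of a normalizing flow, finite-difference derivatives, and sampling points from the current ansatz as a crude stand-in for the paper's adaptive residual sampling.

```python
import numpy as np

def gaussian(x, theta):
    """Parametric ansatz p(x; theta) with theta = (mean, variance)."""
    m, v = theta
    return np.exp(-0.5 * (x - m) ** 2 / v) / np.sqrt(2 * np.pi * v)

def fp_rhs(x, theta, h=1e-4):
    """Fokker-Planck right-hand side for dX = -X dt + sqrt(2) dW:
    L p = d/dx (x p) + d^2 p / dx^2, via central finite differences."""
    p = lambda y: gaussian(y, theta)
    drift = ((x + h) * p(x + h) - (x - h) * p(x - h)) / (2 * h)
    diff = (p(x + h) - 2 * p(x) + p(x - h)) / h**2
    return drift + diff

def theta_dot(theta, xs, eps=1e-5):
    """Least-squares Galerkin projection: solve J @ theta_dot ~= L p at the
    sample points xs, where J[:, k] = dp/dtheta_k."""
    J = np.empty((len(xs), len(theta)))
    for k in range(len(theta)):
        tp = np.array(theta, float)
        tm = np.array(theta, float)
        tp[k] += eps
        tm[k] -= eps
        J[:, k] = (gaussian(xs, tp) - gaussian(xs, tm)) / (2 * eps)
    return np.linalg.lstsq(J, fp_rhs(xs, theta), rcond=None)[0]

# Time-march the parameter ODE with explicit Euler, resampling each step from
# the current ansatz so the residual is evaluated where the density has mass.
rng = np.random.default_rng(0)
theta = np.array([1.0, 0.5])   # initial mean and variance of the ansatz
dt, steps = 1e-3, 1000
for _ in range(steps):
    xs = theta[0] + np.sqrt(theta[1]) * rng.standard_normal(400)
    theta = theta + dt * theta_dot(theta, xs)
print(f"mean ~ {theta[0]:.3f}, variance ~ {theta[1]:.3f}")
```

Because the Gaussian family is closed under the OU Fokker-Planck flow, the recovered parameter trajectory tracks the analytic solution (mean e^{-t} m0, variance 1 + (v0 - 1) e^{-2t}); in the paper's setting the ansatz is a normalizing flow and the same least-squares projection evolves its weights.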