Learning Discrete Diffusion of Graphs via Free-Energy Gradient Flows

arXiv stat.ML / April 14, 2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper addresses a theoretical gap by extending gradient-flow frameworks (Wasserstein-2/JKO) to diffusion models on discrete spaces like graphs, where translating continuous Wasserstein metrics is difficult.
  • It introduces a computational approach based on a new metric W_K on the probability simplex, showing that common discrete diffusion dynamics (e.g., the discrete heat equation) can be interpreted as gradient flows of specific free-energy functionals.
  • Using this interpretation, the authors propose a learning method that recovers the underlying diffusion functional directly from the first-order optimality conditions of the JKO scheme.
  • The training objective is a simple quadratic loss; the method is reported to train extremely fast and requires no individual sample trajectories, only a preprocessing step that computes W_K-geodesics.
  • Extensive synthetic experiments across multiple graph classes demonstrate the method’s ability to recover the underlying functional, supporting its practical viability for learning discrete diffusion dynamics.
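To make the gradient-flow interpretation concrete, here is a minimal sketch (not the paper's method) of the discrete heat equation on a small graph: the dynamics dp/dt = -Lp, with L the graph Laplacian, conserve probability mass and relax toward the uniform distribution, which is the minimizer of an entropy-type free energy on the graph. The 5-node cycle graph and forward-Euler discretization are illustrative choices, not from the paper.

```python
import numpy as np

# Laplacian of a 5-node cycle graph (illustrative choice).
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Discrete heat equation dp/dt = -L p, integrated by forward Euler
# from a point mass at node 0.
p = np.zeros(n)
p[0] = 1.0
dt, steps = 0.01, 5000
for _ in range(steps):
    p = p - dt * (L @ p)

# Mass is conserved exactly (row sums of L are zero), and p relaxes
# to the uniform distribution, the free-energy minimizer.
print(p.sum())  # ≈ 1.0
print(p)        # ≈ [0.2, 0.2, 0.2, 0.2, 0.2]
```

The same trajectory can equivalently be viewed as a steepest-descent curve of a free-energy functional under a suitable metric on the simplex, which is the reinterpretation the paper develops with W_K.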

Abstract

Diffusion-based models on continuous spaces have seen substantial recent progress through the mathematical framework of gradient flows, leveraging the Wasserstein-2 (W_2) metric via the Jordan-Kinderlehrer-Otto (JKO) scheme. Despite the increasing popularity of diffusion models on discrete spaces using continuous-time Markov chains, a parallel theoretical framework based on gradient flows has remained elusive due to intrinsic challenges in translating the W_2 distance directly into these settings. In this work, we propose the first computational approach addressing these challenges, leveraging an appropriate metric W_K on the simplex of probability distributions, which enables us to interpret widely used discrete diffusion paths, such as the discrete heat equation, as gradient flows of specific free-energy functionals. Through this theoretical insight, we introduce a novel methodology for learning diffusion dynamics over discrete spaces, which recovers the underlying functional directly by leveraging first-order optimality conditions for the JKO scheme. The resulting method optimizes a simple quadratic loss, trains extremely fast, does not require individual sample trajectories, and only needs a numerical preprocessing step that computes W_K-geodesics. We validate our method through extensive numerical experiments on synthetic data, showing that we can recover the underlying functional for a variety of graph classes.
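The core idea of recovering a functional from optimality conditions via a quadratic loss can be illustrated with a deliberately simplified toy (assumptions: a linear parametrization dp/dt = -θ·Lp of the dynamics and a Euclidean least-squares residual, whereas the paper works with the W_K geometry and JKO optimality conditions). Observed velocities along a trajectory are matched against the model direction -Lp, and the quadratic loss in θ admits a closed-form minimizer.

```python
import numpy as np

# Laplacian of a 4-node path graph (illustrative state space).
n = 4
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Ground-truth dynamics dp/dt = -theta_true * L p (hypothetical
# one-parameter family standing in for a free-energy functional).
theta_true = 2.0
dt = 1e-3
p = np.zeros(n)
p[0] = 1.0
snapshots = [p.copy()]
for _ in range(200):
    p = p - dt * theta_true * (L @ p)
    snapshots.append(p.copy())

# Quadratic loss: sum_k || v_k - theta * g_k ||^2, with v_k the observed
# finite-difference velocity and g_k = -L p_k the model direction.
g_all, v_all = [], []
for k in range(len(snapshots) - 1):
    v_all.append((snapshots[k + 1] - snapshots[k]) / dt)
    g_all.append(-(L @ snapshots[k]))
g = np.concatenate(g_all)
v = np.concatenate(v_all)
theta_hat = (g @ v) / (g @ g)  # closed-form least-squares solution
print(theta_hat)  # ≈ 2.0
```

Because the loss is quadratic in the parameters, no trajectory-level sampling or backpropagation through time is needed, which mirrors (in caricature) why the paper's training is reported to be fast.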