On the Interpolation Effect of Score Smoothing in Diffusion Models

arXiv stat.ML / 4/21/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper investigates the hypothesis that diffusion models’ ability to generate novel data stems from the neural network learning a smoothed version of the empirical score function (written out after this list), which in turn shapes the denoising dynamics.
  • By analyzing a setting where the training data lie uniformly in a one-dimensional subspace, the authors derive analytical solutions for the denoising dynamics and validate them with numerical experiments.
  • The results show that score-function smoothing can make denoised samples interpolate the training data along the subspace.
  • The study further provides theoretical and empirical evidence that learning the score function with a neural network, with or without explicit regularization, naturally produces a similar interpolation effect, even when the data lie on simple nonlinear manifolds.
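
For reference, the “empirical score” in the first key point is the score of the training data convolved with Gaussian noise at level σ_t. Its closed form is standard background rather than a result of the paper: for training points x_1, …, x_n,

$$\nabla_x \log p_t(x) \;=\; \frac{1}{\sigma_t^2}\left(\frac{\sum_{i=1}^{n} x_i\, e^{-\lVert x - x_i \rVert^2 / (2\sigma_t^2)}}{\sum_{i=1}^{n} e^{-\lVert x - x_i \rVert^2 / (2\sigma_t^2)}} - x\right).$$

As σ_t → 0 the softmax weights concentrate on the nearest training point, so denoising with this exact score can only land on training data; smoothing this vector field is what makes interpolation between training points possible.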

Abstract

Diffusion models have achieved remarkable progress in various domains with an intriguing ability to produce new data that do not exist in the training set. In this work, we study the hypothesis that such creativity arises from the neural network backbone learning a smoothed version of the empirical score function, which guides the denoising dynamics to generate data points that interpolate the training data. Focusing mainly on settings where the training set lies uniformly in a one-dimensional subspace, we elucidate the interplay between score smoothing and the denoising dynamics with analytical solutions and numerical experiments, demonstrating how smoothing the score function can cause the denoised data samples to interpolate the training set along the subspace. Moreover, we present theoretical and empirical evidence that learning score functions with neural networks, either with or without explicit regularization, can naturally achieve a similar effect, including when the data belong to simple nonlinear manifolds.
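
To make the abstract’s setting concrete, here is a minimal NumPy sketch (not the authors’ code): training points spaced uniformly along a line in R², the exact empirical score given above, and a smoothed score obtained by locally averaging that score field. The local-averaging step is one simple smoothing choice assumed here for illustration; the paper’s construction may differ. All names and parameters (`empirical_score`, `smoothed_score`, `denoise`, the noise schedule, the bandwidth `h`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training set: n points spaced uniformly along a one-dimensional subspace of R^2.
n = 8
u = np.array([1.0, 1.0]) / np.sqrt(2.0)        # unit vector spanning the subspace
X = np.linspace(-1.0, 1.0, n)[:, None] * u     # training points, shape (n, 2)

def empirical_score(x, sigma):
    """Exact score of the training data convolved with N(0, sigma^2 I):
    (softmax-weighted mean of the training points minus x) / sigma^2."""
    d2 = np.sum((X - x) ** 2, axis=1)                # squared distance to each x_i
    w = np.exp(-(d2 - d2.min()) / (2 * sigma**2))    # numerically stable softmax
    w /= w.sum()
    return (w @ X - x) / sigma**2

def smoothed_score(x, sigma, h=0.3, m=64):
    """One illustrative smoothing: average the empirical score over Gaussian
    perturbations of x with bandwidth h (an assumption, not the paper's recipe)."""
    pts = x + h * rng.standard_normal((m, 2))
    return np.mean([empirical_score(p, sigma) for p in pts], axis=0)

def denoise(score_fn, x0, sigmas):
    """Annealed denoising: follow the score while the noise level decreases.
    With the exact empirical score, each step jumps to the weighted mean."""
    x = x0.copy()
    for sigma in sigmas:
        x = x + sigma**2 * score_fn(x, sigma)
    return x

sigmas = np.geomspace(1.0, 0.01, 40)           # decreasing noise schedule
starts = rng.standard_normal((200, 2))         # initial noise samples

exact = np.array([denoise(empirical_score, x0, sigmas) for x0 in starts])
smooth = np.array([denoise(smoothed_score, x0, sigmas) for x0 in starts])

# Exact score: every sample lands on one of the training points (memorization).
# Smoothed score: samples land between training points, along the subspace.
proj = lambda Y: Y @ u                         # coordinate along the subspace
print("exact score, distinct landing spots:", len(np.unique(np.round(proj(exact), 2))))
print("smoothed score, distinct landing spots:", len(np.unique(np.round(proj(smooth), 2))))
```

Note that off-subspace components vanish in both runs, because every weighted mean of training points lies on the line; the two scores differ only in how the denoised samples distribute along it, which is exactly the interpolation effect the paper studies.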