Variational inference via radial transport

arXiv stat.ML / 4/1/2026


Key Points

  • The paper argues that standard variational inference (VI) using Gaussian surrogates can fail when the target distribution’s radial profile is not well captured, leading to poor coverage.
  • It introduces radVI, an add-on method that optimizes the VI approximation by explicitly working with radial profiles rather than only location/scale parameters.
  • radVI is designed to be computationally cheap and compatible with existing VI approaches such as mean-field Gaussian VI and Laplace approximation.
  • The authors provide theoretical convergence guarantees by leveraging optimization in Wasserstein space and new regularity results for radial transport maps related to Caffarelli-style theory.
  • Overall, the work reframes VI as a radial-profile optimization problem, aiming to improve approximation quality for high-dimensional distributions where Gaussian assumptions are inadequate.
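The core idea of acting on radial profiles rather than only location/scale can be illustrated with a minimal sketch. The summary does not specify how radVI parameterizes its maps, so the function names (`radial_map`, `g`) and the particular choice of `g` below are hypothetical; the sketch only shows the general shape of a radial transport: rescale each sample's norm with a monotone scalar map while leaving its direction untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10

# Samples from a mean-field Gaussian surrogate, as produced by standard VI.
x = rng.standard_normal((5000, d))

def radial_map(x, g):
    """Apply T(x) = g(||x||) * x / ||x||, a map that only moves the radius.

    g is assumed increasing so that T is invertible; this is an illustrative
    stand-in, not the paper's actual parameterization.
    """
    r = np.linalg.norm(x, axis=-1, keepdims=True)
    return g(r) * x / r

# Hypothetical monotone radius adjustment: inflate large radii to mimic a
# heavier-tailed radial profile than the Gaussian's.
g = lambda r: r * (1.0 + 0.1 * r)

y = radial_map(x, g)

# Directions are preserved; only the distribution of norms changes.
dirs_x = x / np.linalg.norm(x, axis=1, keepdims=True)
dirs_y = y / np.linalg.norm(y, axis=1, keepdims=True)
print(np.allclose(dirs_x, dirs_y))                       # True
print(np.mean(np.linalg.norm(y, axis=1)) >
      np.mean(np.linalg.norm(x, axis=1)))                # True
```

Because the map factors through the norm, fitting it reduces to a one-dimensional problem over radial profiles, which is what makes such an add-on cheap relative to fitting a full high-dimensional transport.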

Abstract

In variational inference (VI), the practitioner approximates a high-dimensional distribution π with a simple surrogate, often a (product) Gaussian distribution. However, in many cases of practical interest, Gaussian distributions might not capture the correct radial profile of π, resulting in poor coverage. In this work, we approach the VI problem from the perspective of optimizing over these radial profiles. Our algorithm radVI is a cheap, effective add-on to many existing VI schemes, such as Gaussian (mean-field) VI and the Laplace approximation. We provide theoretical convergence guarantees for our algorithm, owing to recent developments in optimization over the Wasserstein space (the space of probability distributions endowed with the Wasserstein distance) and new regularity properties of radial transport maps in the style of Caffarelli (2000).
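One way to formalize "optimizing over radial profiles" is the following sketch. The notation here (base surrogate q, radial map T_g) is assumed, not fixed by the abstract:

```latex
% Hypothetical formulation: q is a base surrogate (e.g. a mean-field Gaussian
% or Laplace approximation), and T_g(x) = g(\|x\|)\, x / \|x\| is a radial
% transport map with g : [0,\infty) \to [0,\infty) increasing. radVI-style
% optimization would then search over the scalar profile g alone:
\min_{g \text{ increasing}} \ \mathrm{KL}\bigl( (T_g)_{\#} q \,\big\|\, \pi \bigr)
```

Since the search is over a one-dimensional monotone function rather than a full d-dimensional map, this is the sense in which such an add-on can remain computationally cheap while the Wasserstein-space machinery supplies the convergence analysis.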