Multirate Stein Variational Gradient Descent for Efficient Bayesian Sampling

arXiv cs.LG / 4/7/2026


Key Points

  • The paper argues that standard SVGD uses one global step size for both attraction (toward high-posterior regions) and repulsion (particle diversity), which can fail or become inefficient when those dynamics differ across parts of a high-dimensional posterior.
  • It derives a multirate SVGD framework that updates the two components on different time scales, proposing a symmetric split method plus fixed (MR-SVGD) and adaptive (Adapt-MR-SVGD) variants with local error control.
  • The authors evaluate the multirate methods on multiple benchmark families (including anisotropic, multimodal, hierarchical posteriors, Bayesian neural networks, and logistic regression) using posterior-matching, predictive performance, calibration, mixing, and detailed compute-cost accounting.
  • Results show that multirate SVGD variants improve robustness and the quality–cost tradeoff relative to vanilla SVGD, with the largest gains on stiff hierarchical, strongly anisotropic, and multimodal targets.
  • The adaptive multirate method typically performs best, while fixed multirate SVGD is positioned as a simpler, lower-cost robust alternative.
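The attraction/repulsion decomposition that the first key point refers to can be made concrete. As a rough sketch (not the paper's code), the standard SVGD direction with an RBF kernel splits into a kernel-weighted average of score terms (attraction) and a kernel-gradient term (repulsion); the function names and the bandwidth parameter `h` here are illustrative choices, not taken from the paper.

```python
import numpy as np

def svgd_terms(X, grad_logp, h=1.0):
    """Split the SVGD update direction into attraction and repulsion.

    X: (n, d) particle positions; grad_logp: callable returning the
    (n, d) matrix of scores grad log p(x_j); h: RBF kernel bandwidth.
    Illustrative helper, not the paper's implementation.
    """
    n = X.shape[0]
    diffs = X[:, None, :] - X[None, :, :]          # (n, n, d), x_i - x_j
    K = np.exp(-np.sum(diffs**2, axis=-1) / h)     # RBF kernel matrix
    # Attraction: kernel-weighted average of scores, pulls particles
    # toward high-posterior regions.
    attraction = K @ grad_logp(X) / n
    # Repulsion: gradient of the kernel w.r.t. the other particle,
    # pushes particles apart and preserves diversity.
    repulsion = 2.0 / h * np.sum(K[:, :, None] * diffs, axis=1) / n
    return attraction, repulsion

def svgd_step(X, grad_logp, eps=0.1, h=1.0):
    # Vanilla SVGD: one global step size eps applied to both parts,
    # which is exactly the coupling the multirate methods relax.
    a, r = svgd_terms(X, grad_logp, h)
    return X + eps * (a + r)
```

With a standard-normal target (`grad_logp = lambda X: -X`), the repulsion term points each particle away from its neighbors while the attraction term points toward the mode, making visible why a single `eps` must compromise between the two.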

Abstract

Many particle-based Bayesian inference methods use a single global step size for all parts of the update. In Stein variational gradient descent (SVGD), however, each update combines two qualitatively different effects: attraction toward high-posterior regions and repulsion that preserves particle diversity. These effects can evolve at different rates, especially in high-dimensional, anisotropic, or hierarchical posteriors, so one step size can be unstable in some regions and inefficient in others. We derive a multirate version of SVGD that updates these components on different time scales. The framework yields practical algorithms, including a symmetric split method, a fixed multirate method (MR-SVGD), and an adaptive multirate method (Adapt-MR-SVGD) with local error control. We evaluate the methods in a broad and rigorous benchmark suite covering six problem families: a 50D Gaussian target, multiple 2D synthetic targets, UCI Bayesian logistic regression, multimodal Gaussian mixtures, Bayesian neural networks, and large-scale hierarchical logistic regression. Evaluation includes posterior-matching metrics, predictive performance, calibration quality, mixing, and explicit computational cost accounting. Across these six benchmark families, multirate SVGD variants improve robustness and quality-cost tradeoffs relative to vanilla SVGD. The strongest gains appear on stiff hierarchical, strongly anisotropic, and multimodal targets, where adaptive multirate SVGD is usually the strongest variant and fixed multirate SVGD provides a simpler robust alternative at lower cost.