Adaptive Nonparametric Perturbations of Parametric Models with Generalized Bayes

arXiv stat.ML / 4/3/2026


Key Points

  • The paper proposes semiparametric corrections to parametric Bayesian models to make inference more reliable when the parametric specification may be wrong, focusing on functionals of the true data distribution.
  • It starts from a fully Bayesian framework that explicitly models misspecification and shows via asymptotic analysis that the approach can be both robust and data efficient, with fast convergence when the parametric model is close to reality.
  • The authors note that the fully Bayesian approach is of limited practical use, because both conducting inference and computing a Bayes factor for a nonparametric model are challenging.
  • They introduce a generalized Bayes-based correction method that avoids nonparametric Bayes factor computation while aiming to preserve the robustness and efficiency properties of the fully Bayesian approach.
  • The method is demonstrated by estimating causal effects of gene expression from single-cell RNA-seq data.

Abstract

Parametric Bayesian modeling offers a powerful and flexible toolbox for machine learning. Yet the model, however detailed, may still be wrong, and this can make inferences untrustworthy. In this paper we introduce a new class of semiparametric corrections for parametric Bayesian models, when the target of inference is a functional of the true data distribution. Our starting point is a fully Bayesian modeling approach, which explicitly accounts for the possibility that the parametric model is wrong. Asymptotic analysis shows that this approach is both robust to model misspecification and data efficient, achieving fast convergence when the parametric model is close to true. However, the fully Bayesian approach is limited in its practical usefulness by the challenges of conducting inference and computing a Bayes factor for a nonparametric model. We therefore propose a novel model correction based on generalized Bayes, which entirely avoids the need to compute a nonparametric Bayes factor, but preserves the robustness and efficiency of the fully Bayesian approach. We demonstrate our method by estimating causal effects of gene expression from single cell RNA sequencing data. Overall, we offer a new efficient approach to robust Bayesian inference with parametric models.
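
The abstract does not define "generalized Bayes", so the following is only a minimal sketch of the generic idea, not the paper's correction method. In its standard form, a generalized (Gibbs) posterior replaces the log-likelihood with a loss, so that the posterior is proportional to prior(theta) * exp(-eta * sum_i loss(theta, x_i)). The function name `gibbs_posterior_grid`, the learning rate `eta`, the flat prior, the two losses, and the contaminated toy data are all assumptions made for illustration.

```python
import numpy as np

def gibbs_posterior_grid(data, grid, loss, log_prior, eta=1.0):
    """Normalized generalized-Bayes (Gibbs) posterior weights on a 1-D grid.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Cumulative loss of each candidate parameter value over all observations.
    total_loss = np.array([loss(theta, data).sum() for theta in grid])
    log_post = log_prior(grid) - eta * total_loss
    log_post -= log_post.max()  # stabilize before exponentiating
    weights = np.exp(log_post)
    return weights / weights.sum()

# Toy data (assumed): mostly N(1, 1), plus a small cluster of outliers near 20.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(1.0, 1.0, 200), rng.normal(20.0, 1.0, 10)])

grid = np.linspace(-5.0, 25.0, 3001)
flat_log_prior = lambda t: np.zeros_like(t)

# Squared loss with eta = 1 reproduces the ordinary Bayesian posterior for a
# unit-variance Gaussian location model under a flat prior.
squared_loss = lambda theta, x: 0.5 * (x - theta) ** 2
# Absolute loss targets the median, which is less sensitive to the outliers.
absolute_loss = lambda theta, x: np.abs(x - theta)

post_sq = gibbs_posterior_grid(data, grid, squared_loss, flat_log_prior)
post_abs = gibbs_posterior_grid(data, grid, absolute_loss, flat_log_prior)

mean_sq = float((grid * post_sq).sum())
mean_abs = float((grid * post_abs).sum())
print("posterior mean, squared loss: ", mean_sq)
print("posterior mean, absolute loss:", mean_abs)
```

In this toy setup the squared-loss posterior centers on the sample mean, which the outliers pull upward, while the absolute-loss posterior concentrates near the sample median. The point is only that the loss determines which functional of the data distribution the posterior targets; how the paper constructs its correction is not specified in the abstract.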