Distributional Causal Mediation via Conditional Generative Modeling

arXiv stat.ML / 5/5/2026


Key Points

  • The paper introduces Distributional Causal Mediation Analysis (DCMA), a generative learning approach for estimating how treatments affect the full distribution of outcomes through multiple mediators, not just summary contrasts such as mean effects.
  • DCMA learns conditional generative models for both mediators and the outcome from observational data, then uses identification formulas to reconstruct interventional outcome distributions via Monte Carlo forward simulation with noise resampling.
  • The method is designed to capture both classical summary effects and more nuanced distributional differences using metrics such as energy distance and the Wasserstein distance.
  • It provides analytical error bounds that explain how inaccuracies in the learned conditional generative models propagate to errors in the reconstructed interventional outcome distributions.
  • Numerical experiments and real-world data applications demonstrate DCMA's empirical effectiveness.
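The forward-simulation idea in the points above can be illustrated with a minimal sketch. It is not the paper's implementation: the two generator functions are hand-written, hypothetical stand-ins for learned conditional generative models, the single mediator and the specific coefficients are assumptions chosen for illustration, and the distance functions are plain NumPy sample estimators for one-dimensional outcomes.

```python
import numpy as np

# Hypothetical stand-ins for learned conditional generators (assumptions,
# not the paper's models). Each maps conditioning inputs plus noise to a sample.
def gen_mediator(a, x, eps):
    return 0.5 * x + 1.0 * a + eps

def gen_outcome(a, m, x, eps):
    # Nonlinear in m, so the mediated effect changes more than the mean.
    return 0.5 * a + 0.5 * m**2 + 0.3 * x + eps

def simulate(a, a_prime, x, rng):
    """Draw samples of Y(a, M(a')) by resampling fresh noise and
    pushing it through the mediator generator, then the outcome generator."""
    eps_m = rng.normal(size=x.shape)
    eps_y = rng.normal(size=x.shape)
    m = gen_mediator(a_prime, x, eps_m)
    return gen_outcome(a, m, x, eps_y)

def energy_distance(u, v):
    """Sample estimate of 2E|U-V| - E|U-U'| - E|V-V'| for 1-D samples."""
    uv = np.abs(u[:, None] - v[None, :]).mean()
    uu = np.abs(u[:, None] - u[None, :]).mean()
    vv = np.abs(v[:, None] - v[None, :]).mean()
    return 2.0 * uv - uu - vv

def wasserstein1(u, v):
    """Wasserstein-1 distance between two 1-D samples of equal size,
    computed by matching sorted values."""
    return np.abs(np.sort(u) - np.sort(v)).mean()

rng = np.random.default_rng(0)
x = rng.normal(size=1000)        # covariates; simulated here for illustration

y11 = simulate(1, 1, x, rng)     # treated outcome, treated mediator
y10 = simulate(1, 0, x, rng)     # treated outcome, control mediator
y00 = simulate(0, 0, x, rng)     # control outcome, control mediator

# Direct contrast: Y(1, M(0)) vs Y(0, M(0)); indirect: Y(1, M(1)) vs Y(1, M(0)).
direct_mean = y10.mean() - y00.mean()
indirect_energy = energy_distance(y11, y10)
indirect_w1 = wasserstein1(y11, y10)
print(direct_mean, indirect_energy, indirect_w1)
```

In this toy setup the direct contrast is a pure location shift, while the indirect contrast also changes the spread of the outcome, which is the kind of difference a mean-only summary would understate.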

Abstract

Mediation analysis has traditionally focused on outcome-level summary contrasts, such as mean effects, which may obscure substantial distributional changes induced by complex and nonlinear causal mechanisms. We propose Distributional Causal Mediation Analysis (DCMA), a generative learning framework for identifying and estimating treatment effects on entire outcome distributions transmitted through multiple mediators. DCMA learns conditional generative models for the mediators and the outcome, recovering the relevant conditional distributions from observational data. Leveraging the identification formulas, it reconstructs interventional outcome distributions via Monte Carlo forward simulation by noise resampling, enabling the capture of both classical summary effects and rich distributional contrasts such as energy distance and the Wasserstein distance. Analytical error bounds are derived to decompose how estimation errors in the learned conditional models propagate to the reconstructed interventional outcome distributions. The empirical effectiveness of DCMA is demonstrated through numerical experiments and real-world data applications.
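For orientation, the quantities named in the abstract can be written out in a simplified form. The abstract does not state the paper's identification formulas for multiple mediators, so the block below is only the standard single-mediator mediation formula under sequential ignorability, together with common textbook definitions of the two distances. The symbols A (treatment), M (mediator), X (covariates), and Y (outcome) are notation assumed here, not taken from the source.

```latex
% Interventional outcome distribution, single mediator (assumed setting):
\[
  P\bigl(Y(a, M(a')) \le y\bigr)
  = \int\!\!\int F_{Y \mid A=a,\, M=m,\, X=x}(y)\,
    \mathrm{d}F_{M \mid A=a',\, X=x}(m)\,
    \mathrm{d}F_{X}(x).
\]

% Energy distance between distributions P and Q, with U, U' \sim P and
% V, V' \sim Q mutually independent (some authors take the square root):
\[
  \mathcal{E}(P, Q)
  = 2\,\mathbb{E}\lVert U - V \rVert
  - \mathbb{E}\lVert U - U' \rVert
  - \mathbb{E}\lVert V - V' \rVert .
\]

% Wasserstein-p distance, with \Pi(P, Q) the set of couplings of P and Q:
\[
  W_p(P, Q)
  = \Bigl( \inf_{\pi \in \Pi(P, Q)}
    \int \lVert u - v \rVert^{p} \,\mathrm{d}\pi(u, v) \Bigr)^{1/p}.
\]
```

Monte Carlo forward simulation approximates the double integral in the first display by drawing covariates, sampling the mediator under treatment level a', and then sampling the outcome under treatment level a.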