Flow-based Generative Modeling of Potential Outcomes and Counterfactuals

arXiv stat.ML / 4/16/2026


Key Points

  • The paper introduces PO-Flow, a continuous normalizing flow framework for causal inference that targets individualized prediction of potential outcomes and counterfactuals from observational data.
  • PO-Flow jointly models potential outcome distributions and factual-conditioned counterfactual outcomes, using an encode-decode mechanism in which an observed factual outcome is encoded and then decoded under an alternative treatment.
  • The method is trained via flow matching and supports likelihood-based evaluation, enabling uncertainty-aware assessment of predicted outcomes.
  • The authors provide a recovery guarantee under specific assumptions and report strong empirical performance on benchmark datasets across multiple potential-outcome causal inference tasks.
  • Overall, the approach unifies individualized potential outcome prediction, conditional average treatment effect estimation, and counterfactual prediction within a single flow-based model.
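The encode-decode idea in the key points can be illustrated with a toy sketch. This is not the paper's implementation: instead of a trained network it uses a closed-form velocity field for a hypothetical Gaussian outcome model (the function `mu_sigma`, its coefficients, and the RK4 integrator are all assumptions made for illustration). With a standard normal base and a linear interpolation path, the flow for a Gaussian target N(mean, std^2) maps latent z to mean + std * z, so the numerical result can be checked by hand.

```python
import math

def mu_sigma(x, a):
    """Hypothetical outcome model (assumption): mean and std of Y(a) given covariate x."""
    mean = 1.0 + 0.5 * x + (2.0 if a == 1 else 0.0)
    std = 0.5 if a == 1 else 1.0
    return mean, std

def velocity(y, t, x, a):
    """Closed-form marginal velocity for the path y_t = (1 - t) * z + t * y1,
    with z ~ N(0, 1) and y1 ~ N(mean, std^2). Stands in for a trained network."""
    mean, std = mu_sigma(x, a)
    var_t = (1.0 - t) ** 2 + (t * std) ** 2
    slope = (t * std ** 2 - (1.0 - t)) / var_t
    return mean + slope * (y - t * mean)

def integrate(y, x, a, t0, t1, steps=200):
    """Classical RK4 integration of dy/dt = velocity from t0 to t1 (either direction)."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = velocity(y, t, x, a)
        k2 = velocity(y + 0.5 * h * k1, t + 0.5 * h, x, a)
        k3 = velocity(y + 0.5 * h * k2, t + 0.5 * h, x, a)
        k4 = velocity(y + h * k3, t + h, x, a)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return y

def counterfactual(y_factual, x, a_factual, a_alt):
    """Encode the factual outcome to a latent, then decode under the other treatment."""
    z = integrate(y_factual, x, a_factual, 1.0, 0.0)  # encode: data -> latent
    return integrate(z, x, a_alt, 0.0, 1.0)           # decode: latent -> data

if __name__ == "__main__":
    # Unit with x = 1.0 observed under a = 0 with outcome 2.5.
    print(counterfactual(2.5, 1.0, 0, 1))
```

In this toy setting, the factual outcome 2.5 under a = 0 (mean 1.5, std 1.0) encodes to a latent of about 1.0, and decoding under a = 1 (mean 3.5, std 0.5) gives roughly 4.0. Decoding under the same treatment recovers the factual outcome, which is a useful sanity check for any invertible flow.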

Abstract

Predicting potential and counterfactual outcomes from observational data is central to individualized decision-making, particularly in clinical settings where treatment choices must be tailored to each patient rather than guided solely by population averages. We propose PO-Flow, a continuous normalizing flow (CNF) framework for causal inference that jointly models potential outcome distributions and factual-conditioned counterfactual outcomes. Trained via flow matching, PO-Flow provides a unified approach to individualized potential outcome prediction, conditional average treatment effect estimation, and counterfactual prediction. By encoding an observed factual outcome and decoding under an alternative treatment, PO-Flow provides an encode-decode mechanism for factual-conditioned counterfactual prediction. In addition, PO-Flow supports likelihood-based evaluation of potential outcomes, enabling uncertainty-aware assessment of predictions. A supporting recovery guarantee is established under certain assumptions, and empirical results on benchmark datasets demonstrate strong performance across a range of causal inference tasks within the potential outcomes framework.
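Likelihood-based evaluation in a continuous normalizing flow typically relies on the instantaneous change-of-variables formula: log p_1(y) = log p_0(z) minus the time integral of the divergence of the velocity field along the trajectory. The sketch below is a minimal illustration under stated assumptions, not the authors' code: the velocity field is a closed-form stand-in for a trained network targeting a hypothetical Gaussian outcome N(mean, std^2), and the midpoint integrator and step count are arbitrary choices. Because the target is Gaussian, the result can be compared against the exact log-density.

```python
import math

def velocity_and_divergence(y, t, mean, std):
    """Closed-form velocity and its derivative in y (the 1-D divergence) for the
    path y_t = (1 - t) * z + t * y1, z ~ N(0, 1), y1 ~ N(mean, std^2).
    Assumption: stands in for a learned velocity network."""
    var_t = (1.0 - t) ** 2 + (t * std) ** 2
    slope = (t * std ** 2 - (1.0 - t)) / var_t
    return mean + slope * (y - t * mean), slope

def log_likelihood(y, mean, std, steps=2000):
    """Integrate backward from t = 1 to t = 0 with the midpoint rule,
    accumulating the divergence integral along the way."""
    h = 1.0 / steps
    t = 1.0
    div_integral = 0.0
    for _ in range(steps):
        v_start, _ = velocity_and_divergence(y, t, mean, std)
        y_mid = y - 0.5 * h * v_start
        v_mid, d_mid = velocity_and_divergence(y_mid, t - 0.5 * h, mean, std)
        y -= h * v_mid            # step the state backward in time
        div_integral += h * d_mid  # integral of divergence over [0, 1]
        t -= h
    log_base = -0.5 * (y * y + math.log(2.0 * math.pi))  # log N(z; 0, 1)
    return log_base - div_integral

if __name__ == "__main__":
    approx = log_likelihood(4.0, 3.5, 0.5)
    exact = (-0.5 * ((4.0 - 3.5) / 0.5) ** 2
             - math.log(0.5) - 0.5 * math.log(2.0 * math.pi))
    print(approx, exact)
```

With a learned, multivariate velocity field the divergence is usually obtained by automatic differentiation or a stochastic trace estimator rather than in closed form; the scalar case here only shows the bookkeeping.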