Contextual Distributionally Robust Optimization with Causal and Continuous Structure: An Interpretable and Tractable Approach

arXiv stat.ML / 4/3/2026


Key Points

  • The paper proposes a framework for contextual distributionally robust optimization that leverages the underlying distribution’s causal and continuous structure to produce interpretable, tractable decision rules.
  • It introduces the causal Sinkhorn discrepancy (CSD), an entropy-regularized causal Wasserstein distance designed to encourage continuous transport plans while maintaining causal consistency.
  • Building on CSD, the authors define a contextual DRO model called Causal Sinkhorn DRO (Causal-SDRO) and derive a strong dual formulation where the worst-case distribution is represented as a mixture of Gibbs distributions.
  • To address infinite-dimensional policy optimization, they propose the Soft Regression Forest (SRF) decision rule, combining decision-tree interpretability with a fully parametric, differentiable, Lipschitz-smooth model.
  • They develop an efficient stochastic compositional gradient algorithm for the parametric Causal-SDRO and provide convergence guarantees, supported by experiments on synthetic and real-world datasets showing improved performance and interpretability.
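The causal Sinkhorn discrepancy builds on entropy-regularized optimal transport. As a point of reference, here is a minimal sketch of plain (non-causal) entropic OT between two one-dimensional empirical samples via Sinkhorn-Knopp scaling iterations; the causal constraint on the transport plan that the paper imposes is not modeled here, and the function name and defaults are illustrative.

```python
import numpy as np

def sinkhorn_cost(x, y, eps=0.5, n_iter=200):
    """Entropy-regularized OT cost between 1-D empirical samples x and y.

    Standard Sinkhorn-Knopp iteration on uniform marginals; the paper's
    causal variant additionally restricts the transport plan, which this
    sketch does not implement.
    """
    n, m = len(x), len(y)
    C = (x[:, None] - y[None, :]) ** 2          # squared-distance cost matrix
    K = np.exp(-C / eps)                         # Gibbs kernel
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):                      # alternating marginal scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]              # entropic transport plan
    return float((P * C).sum())                  # regularized transport cost
```

Shifting one sample away from the other increases the transport cost, which is the basic behavior the discrepancy inherits from the Wasserstein distance.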

Abstract

In this paper, we introduce a framework for contextual distributionally robust optimization (DRO) that accounts for the causal and continuous structure of the underlying distribution, yielding interpretable and tractable decision rules that prescribe decisions from covariates. We first introduce the causal Sinkhorn discrepancy (CSD), an entropy-regularized causal Wasserstein distance that encourages continuous transport plans while preserving causal consistency. We then formulate a contextual DRO model with a CSD-based ambiguity set, termed Causal Sinkhorn DRO (Causal-SDRO), and derive its strong dual reformulation, in which the worst-case distribution is characterized as a mixture of Gibbs distributions. To solve the resulting infinite-dimensional policy optimization problem, we propose the Soft Regression Forest (SRF) decision rule, which approximates optimal policies within arbitrary measurable function spaces. The SRF preserves the interpretability of classical decision trees while being fully parametric, differentiable, and Lipschitz smooth, enabling intrinsic interpretation from both global and local perspectives. To solve Causal-SDRO with parametric decision rules, we develop an efficient stochastic compositional gradient algorithm that converges to an ε-stationary point at a rate of O(ε⁻⁴), matching the convergence rate of standard stochastic gradient descent. Finally, we validate our method through numerical experiments on synthetic and real-world datasets, demonstrating superior performance and interpretability.
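To illustrate how a tree-shaped rule can be fully parametric and differentiable, here is a sketch of a soft binary regression tree in which each internal node routes an input left or right with a sigmoid gate, so the prediction is a smooth mixture of leaf values. This is a generic soft-tree parameterization chosen for illustration, not the paper's exact SRF construction; all names and shapes are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x, weights, biases, leaf_values):
    """Prediction of a complete soft binary tree at a single input x.

    Internal nodes are heap-ordered (root = 0, children of i are 2i+1,
    2i+2); `weights[i]`, `biases[i]` parameterize node i's sigmoid gate,
    and the output is a convex combination of `leaf_values`. A simplified
    stand-in for a soft regression tree, not the paper's SRF itself.
    """
    n_leaves = len(leaf_values)
    d = n_leaves.bit_length() - 1          # tree depth (n_leaves = 2**d)
    probs = np.ones(n_leaves)
    for leaf in range(n_leaves):
        node = 0                           # start at the root
        for level in range(d):
            go_right = (leaf >> (d - 1 - level)) & 1
            p_right = sigmoid(weights[node] @ x + biases[node])
            probs[leaf] *= p_right if go_right else 1.0 - p_right
            node = 2 * node + 1 + go_right
    return float(probs @ leaf_values)      # smooth mixture of leaf values
```

Because every gate is a sigmoid, the prediction is differentiable in all parameters, and since the routing probabilities sum to one, the output always lies between the smallest and largest leaf value.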
