Multivariate Uncertainty Quantification with Tomographic Quantile Forests

arXiv stat.ML / 4/2/2026

Key Points

The paper introduces Tomographic Quantile Forests (TQF), a nonparametric, uncertainty-aware tree-based model for multivariate predictive distributions.
TQF learns conditional quantiles of directional projections of the target (nᵀy) as functions of inputs x and unit directions n, enabling reconstruction of full multivariate conditional distributions at inference.
At inference, it aggregates the predicted quantiles across many directions and reconstructs the distribution by minimizing the sliced Wasserstein distance, using an efficient alternating optimization with convex subproblems.
The approach avoids two limitations of classical directional-quantile methods, which typically yield only convex quantile regions and need a separate model per direction: TQF covers all directions with a single model and imposes no convexity restriction.
The authors evaluate TQF on both synthetic and real-world datasets and release the source code on GitHub for reproducibility.
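
To make the idea of directional projections concrete, the sketch below computes empirical quantiles of nᵀy for a handful of random unit directions. It is only an illustration under simplifying assumptions: it ignores the input x entirely (so the quantiles are unconditional), it uses plain NumPy rather than a tree-based model, and the function names random_unit_directions and directional_quantiles are hypothetical, not taken from the authors' code.

```python
import numpy as np

def random_unit_directions(num_directions, dim, rng):
    """Draw directions uniformly on the unit sphere by normalizing Gaussian vectors."""
    v = rng.standard_normal((num_directions, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def directional_quantiles(y, directions, levels):
    """Empirical quantiles of the 1-D projections n^T y.

    y:          (num_samples, dim) array of multivariate targets
    directions: (num_directions, dim) array of unit vectors n
    levels:     (num_levels,) array of quantile levels in [0, 1]
    returns:    (num_directions, num_levels) array
    """
    projections = y @ directions.T  # shape (num_samples, num_directions)
    return np.quantile(projections, levels, axis=0).T

# Toy data: a 2-D standard Gaussian target (an assumption for illustration only).
rng = np.random.default_rng(0)
y = rng.standard_normal((2000, 2))
directions = random_unit_directions(8, 2, rng)
levels = np.array([0.1, 0.5, 0.9])
q = directional_quantiles(y, directions, levels)
print(q.shape)  # one row of quantiles per direction
```

A model in the spirit of TQF would replace the empirical quantile step with a learned function of both x and n; the sketch only shows what quantity is being estimated.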

Abstract

Quantifying predictive uncertainty is essential for safe and trustworthy real-world AI deployment. Yet, fully nonparametric estimation of conditional distributions remains challenging for multivariate targets. We propose Tomographic Quantile Forests (TQF), a nonparametric, uncertainty-aware, tree-based regression model for multivariate targets. TQF learns conditional quantiles of directional projections $\mathbf{n}^{\top}\mathbf{y}$ as functions of the input $\mathbf{x}$ and the unit direction $\mathbf{n}$. At inference, it aggregates quantiles across many directions and reconstructs the multivariate conditional distribution by minimizing the sliced Wasserstein distance via an efficient alternating scheme with convex subproblems. Unlike classical directional-quantile approaches that typically produce only convex quantile regions and require training separate models for different directions, TQF covers all directions with a single model without imposing convexity restrictions. We evaluate TQF on synthetic and real-world datasets, and release the source code on GitHub.
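
For readers unfamiliar with the sliced Wasserstein distance, the following sketch shows a common Monte Carlo estimate of its squared 2-Wasserstein form between two point clouds of equal size: project both onto random unit directions, sort each set of projections, and average the squared differences. This is a generic estimator, not the paper's alternating reconstruction scheme, whose details are not given in the abstract. The function name sliced_wasserstein_sq, the default of 64 directions, and the toy data are assumptions made for illustration.

```python
import numpy as np

def sliced_wasserstein_sq(a, b, num_directions=64, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    a, b: (num_samples, dim) arrays with the same shape (equal sample sizes
          are assumed so that sorted projections can be matched one to one).
    """
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equally sized samples")
    rng = np.random.default_rng(seed)
    v = rng.standard_normal((num_directions, a.shape[1]))
    directions = v / np.linalg.norm(v, axis=1, keepdims=True)
    # In 1-D, optimal transport between equal-size samples matches sorted values.
    proj_a = np.sort(a @ directions.T, axis=0)
    proj_b = np.sort(b @ directions.T, axis=0)
    return float(np.mean((proj_a - proj_b) ** 2))

# Toy check: a cloud compared with itself and with a shifted copy.
rng = np.random.default_rng(1)
a = rng.standard_normal((500, 2))
b = a + np.array([1.0, 0.0])
same = sliced_wasserstein_sq(a, a)
shifted = sliced_wasserstein_sq(a, b)
print(same, shifted)
```

Because each one-dimensional slice reduces to sorting, the estimate is cheap to evaluate, which is one reason sliced distances are attractive as reconstruction objectives.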