Conformalized Super Learner

arXiv cs.LG / 4/27/2026

Key Points

  • The paper proposes a “conformalized” version of the Super Learner (SL) that builds prediction intervals by coupling SL’s ensemble weighting with conformal prediction (CP).
  • It constructs interval predictions by using learner-specific conformity scores and combining them via a weighted majority vote, mirroring the original SL framework.
  • The authors analyze theoretical properties of the resulting SL-based intervals for continuous outcomes under assumptions such as exchangeability, including cases with potential violations.
  • In simulations, the conformalized SL achieves valid finite-sample coverage and performs competitively relative to the true data-generating mechanism.
  • An application predicts creatinine levels from socio-demographic, biometric, and laboratory measurements, showing the benefit of an ensemble whose learners are chosen to capture non-linear effects, interactions, sparsity, and heteroscedasticity while remaining robust to outliers.
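
To make the conformal prediction step concrete, here is a minimal sketch of a split-conformal interval for a single learner. It assumes an absolute-residual conformity score and a held-out calibration set. The summary does not say which score or data split the paper uses, so both choices, along with the function names `conformal_quantile` and `split_conformal`, are illustrative assumptions rather than the authors' method.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic to use
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return float(np.sort(scores)[k - 1])

def split_conformal(predict, X_cal, y_cal, X_new, alpha=0.1):
    """Symmetric interval around predict(X_new) for one fitted learner.

    `predict` is any already-fitted prediction function (hypothetical here).
    The conformity score is the absolute residual on the calibration set.
    """
    scores = np.abs(np.asarray(y_cal) - predict(np.asarray(X_cal)))
    q = conformal_quantile(scores, alpha)
    center = predict(np.asarray(X_new))
    return center - q, center + q

# Toy usage with a fixed linear "learner" y = 2x
predict = lambda X: 2.0 * X
lower, upper = split_conformal(
    predict,
    X_cal=np.array([0.0, 1.0, 2.0]),
    y_cal=np.array([0.5, 2.0, 3.0]),
    X_new=np.array([10.0]),
    alpha=0.5,
)
```

Applying this to each learner in the library yields one interval per learner. Because the interval width is constant in the covariates, this particular score would not adapt to heteroscedasticity; a normalized or quantile-based score would be needed for that.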

Abstract

The Super Learner (SL) is a widely used ensemble method that combines predictions from a library of learners based on their predictive performance. Interval predictions are of considerable practical interest because they quantify the uncertainty in predictions produced by an individual learner or an ensemble. Several methods have been proposed for constructing interval predictions based on the SL; however, these approaches are typically justified using asymptotic arguments or rely on computationally intensive procedures such as the bootstrap. Conformal prediction (CP) is a machine learning framework for constructing prediction intervals with finite-sample and asymptotic coverage guarantees under mild conditions. We propose coupling CP with the SL through a natural construction that mirrors the original SL framework, using individual learner weights and combining learner-specific conformity scores via a weighted majority vote. We characterize the properties of the resulting SL-based prediction intervals for continuous outcomes. We cover settings under exchangeability, potential violations of exchangeability, and data-generating mechanisms exhibiting heteroscedasticity, sparsity, and other forms of distributional heterogeneity. A comprehensive simulation study shows that the conformalized SL achieves valid finite-sample coverage with competitive performance relative to the true data-generating mechanism. A central contribution of this work is an application to predicting creatinine levels using socio-demographic, biometric, and laboratory measurements. This example demonstrates the benefits of an ensemble with carefully selected learners designed to capture key aspects of complex regression functions, including non-linear effects, interactions, sparsity, heteroscedasticity, and robustness to outliers.
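
The weighted majority vote mentioned in the abstract can be illustrated with a short sketch. It assumes each learner has already produced its own conformal interval for a new observation, and that the SL weights are non-negative. A candidate value y is kept when the total weight of the learners whose intervals contain it exceeds one half. The abstract does not spell out the paper's exact rule, so the function name `weighted_majority_vote`, the threshold of 0.5, and the toy intervals and weights are all assumptions made for illustration.

```python
import numpy as np

def weighted_majority_vote(intervals, weights, threshold=0.5):
    """Merge learner-specific intervals by a weighted vote.

    Returns the set of y whose weighted vote exceeds `threshold` as a list
    of disjoint (lo, hi) tuples. Isolated single points are ignored.
    """
    intervals = np.asarray(intervals, dtype=float)  # shape (K, 2)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()               # normalize to sum to 1
    cuts = np.unique(intervals)                     # sorted distinct endpoints
    merged = []
    for left, right in zip(cuts[:-1], cuts[1:]):
        # The vote is constant on the open segment, so test its midpoint.
        mid = 0.5 * (left + right)
        inside = (intervals[:, 0] <= mid) & (mid <= intervals[:, 1])
        if weights[inside].sum() > threshold:
            if merged and merged[-1][1] == left:
                # Extend the previous piece when segments are adjacent.
                merged[-1] = (merged[-1][0], float(right))
            else:
                merged.append((float(left), float(right)))
    return merged

# Toy usage: three learners with hypothetical intervals and SL weights
learner_intervals = [(0.0, 2.0), (1.0, 3.0), (2.5, 4.0)]
sl_weights = [0.4, 0.35, 0.25]
voted_set = weighted_majority_vote(learner_intervals, sl_weights)
```

In this toy case the voted set is a union of two disjoint pieces rather than a single interval, which can happen whenever the learners disagree. Reporting the convex hull of the pieces is one simple way to obtain a single interval, at the cost of a wider set.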