A Kernel Nonconformity Score for Multivariate Conformal Prediction

arXiv stat.ML / April 24, 2026

📰 News · Models & Research

Key Points

  • The paper proposes a new Multivariate Kernel Score (MKS) for multivariate conformal prediction that converts residual vectors into scalars while preserving the geometry of the residual distribution.
  • The authors show MKS closely resembles Gaussian process posterior variance, creating a link between Bayesian-style uncertainty quantification and frequentist coverage guarantees.
  • MKS is reformulated as an anisotropic Maximum Mean Discrepancy (MMD) that interpolates between kernel density estimation and covariance-weighted distance.
  • The work proves finite-sample coverage guarantees and derives convergence rates governed by the effective rank of a kernel covariance operator, enabling dimension-free adaptation.
  • Experiments on regression tasks indicate MKS substantially reduces prediction-region volume versus ellipsoidal baselines while maintaining nominal coverage, with larger improvements in higher dimensions and for stricter coverage levels.
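The paper's exact MKS formula is not given in this summary, but the split-conformal pipeline it plugs into is standard. The sketch below uses a generic kernel similarity to the calibration residual cloud as an illustrative stand-in score (not the paper's MKS): low average similarity means high nonconformity. All names and the lengthscale choice here are hypothetical.

```python
import numpy as np

def gaussian_kernel(u, v, lengthscale=1.0):
    # RBF kernel between rows of u and rows of v
    d2 = ((u[:, None, :] - v[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * lengthscale ** 2))

def kernel_score(residual, calib_residuals, lengthscale=1.0):
    # Illustrative kernel nonconformity score (NOT the paper's MKS):
    # negative mean similarity to the calibration residuals, so residuals
    # far from the calibration cloud get large scores.
    k = gaussian_kernel(residual[None, :], calib_residuals, lengthscale)
    return -k.mean()

rng = np.random.default_rng(0)
# toy 3-D residuals from some fitted multivariate regressor (hypothetical)
calib = rng.normal(size=(500, 3))
test = rng.normal(size=(200, 3))

alpha = 0.1
scores = np.array([kernel_score(r, calib) for r in calib])
# split-conformal quantile with the usual (n + 1) finite-sample correction
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

test_scores = np.array([kernel_score(r, calib) for r in test])
coverage = (test_scores <= q).mean()
print(f"empirical coverage: {coverage:.3f}")
```

Because the score is a function of the multivariate residual, the resulting prediction region is a sublevel set of the kernel score rather than an ellipsoid, which is how a kernel score can adapt to non-elliptical residual geometry.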

Abstract

Multivariate conformal prediction requires nonconformity scores that compress residual vectors into scalars while preserving the implicit geometric structure of the residual distribution. We introduce a Multivariate Kernel Score (MKS) that produces prediction regions which explicitly adapt to this geometry. We show that the proposed score resembles the Gaussian process posterior variance, unifying Bayesian uncertainty quantification with frequentist-style coverage guarantees. Moreover, the MKS can be decomposed into an anisotropic Maximum Mean Discrepancy (MMD) that interpolates between kernel density estimation and a covariance-weighted distance. We prove finite-sample coverage guarantees and establish convergence rates that depend on the effective rank of the kernel-based covariance operator rather than the ambient dimension, enabling dimension-free adaptation. On regression tasks, the MKS significantly reduces prediction-region volume compared to ellipsoidal baselines while maintaining nominal coverage, with larger gains at higher dimensions and tighter coverage levels.
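The two endpoints of the interpolation mentioned in the abstract are both easy to compute directly; the snippet below contrasts them on anisotropic toy residuals. The bandwidth and data are hypothetical, and this only illustrates the endpoints, not the paper's interpolating score.

```python
import numpy as np

rng = np.random.default_rng(1)
# anisotropic toy residuals: much larger spread along the first axis
calib = rng.normal(size=(1000, 2)) * np.array([3.0, 0.5])
r = np.array([2.0, 0.4])

# Endpoint 1: covariance-weighted (Mahalanobis) distance, the basis of
# ellipsoidal prediction regions
cov = np.cov(calib, rowvar=False)
maha = float(r @ np.linalg.solve(cov, r))

# Endpoint 2: Gaussian kernel density estimate with a small bandwidth,
# which tracks the local shape of the residual distribution
h = 0.3
d2 = ((calib - r) ** 2).sum(axis=1)
kde = np.exp(-d2 / (2 * h ** 2)).mean() / (2 * np.pi * h ** 2)

print(f"Mahalanobis^2: {maha:.2f}, KDE: {kde:.4f}")
```

The Mahalanobis endpoint captures only second-moment (covariance) geometry, while the KDE endpoint follows arbitrary density shapes at the cost of a bandwidth choice; a score interpolating between them can trade off the two.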
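For intuition on the "effective rank" quantity that governs the stated convergence rates, one common definition for a positive semi-definite operator is trace divided by largest eigenvalue; the paper's exact notion for the kernel covariance operator may differ. A minimal sketch on an empirical RBF Gram matrix, with a median-heuristic bandwidth (all choices here are assumptions):

```python
import numpy as np

def effective_rank(mat):
    # trace / spectral-norm effective rank of a PSD matrix
    eig = np.linalg.eigvalsh(mat)
    return eig.sum() / eig.max()

rng = np.random.default_rng(2)
x = rng.normal(size=(200, 10))
d2 = ((x[:, None] - x[None, :]) ** 2).sum(-1)
# RBF Gram matrix with median-heuristic squared bandwidth
K = np.exp(-d2 / (2 * np.median(d2)))
er = effective_rank(K)
print(f"effective rank: {er:.2f}")
```

Even though the ambient dimension is 10 and the matrix is 200 x 200, the effective rank is typically far smaller, which is the kind of quantity a dimension-free rate can depend on.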