Non-monotonicity in Conformal Risk Control

arXiv cs.LG / 4/3/2026


Key Points

  • The paper analyzes conformal risk control (CRC) when the loss is non-monotonic in the tuning parameter, reflecting realistic settings where coverage and efficiency trade off imperfectly.
  • It shows that CRC validity without monotonicity depends on the calibration sample size relative to the grid resolution used for selecting the tuning parameter.
  • The authors provide finite-sample, distribution-free guarantees for bounded losses on a grid of size m, with excess risk scaling as O(√(log m / n)) relative to the target level α (a selection-rule sketch follows this list).
  • They establish a matching lower bound proving the √(log m / n) rate is minimax optimal, and derive improved bounds under additional structure such as Lipschitz continuity or monotonicity.
  • The study extends results to distribution shift scenarios via importance weighting and includes experiments in multilabel classification and object detection showing more stable risk control when accounting for finite-sample deviations.
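
To make the finite-sample correction in the points above concrete, here is a minimal sketch of parameter selection over a finite grid under a bounded, possibly non-monotone loss. The function name select_lambda, the Hoeffding-plus-union-bound deviation term, and the tie-breaking rule are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def select_lambda(losses, alpha, delta=0.1, B=1.0):
    """Pick a grid index whose empirical risk, inflated by a uniform
    finite-sample deviation of order sqrt(log(m)/n), stays below alpha.

    losses : (n, m) array; losses[i, j] is the loss of calibration point i
             at grid value lambda_j, assumed bounded in [0, B].
    alpha  : target risk level.
    delta  : nominal failure probability for the deviation bound (assumed).
    """
    n, m = losses.shape
    emp_risk = losses.mean(axis=0)
    # Hoeffding deviation, union-bounded over the m grid points; the exact
    # constant in the paper may differ -- this choice is illustrative.
    dev = B * np.sqrt(np.log(2 * m / delta) / (2 * n))
    feasible = np.where(emp_risk + dev <= alpha)[0]
    if feasible.size == 0:
        return None  # no grid value is certified at this sample size
    # Among certified grid values, the tie-break (e.g., smallest expected
    # prediction-set size) is application-specific; the last index is a
    # placeholder for that choice.
    return feasible[-1]
```

Because the loss is not assumed monotone, the deviation term cannot simply be dropped: without it, the empirical-risk minimizer over a fine grid can overfit the calibration set, which is exactly the failure mode the √(log m / n) analysis quantifies.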

Abstract

Conformal risk control (CRC) provides distribution-free guarantees for controlling the expected loss at a user-specified level. Existing theory typically assumes that the loss decreases monotonically with a tuning parameter that governs the size of the prediction set. This assumption is often violated in practice, where losses may behave non-monotonically due to competing objectives such as coverage and efficiency. We study CRC under non-monotone loss functions when the tuning parameter is selected from a finite grid, a common scenario in thresholding or discretized decision rules. Revisiting a known counterexample, we show that the validity of CRC without monotonicity depends on the relationship between the calibration sample size and the grid resolution. In particular, risk control can still be achieved when the calibration sample is sufficiently large relative to the grid. We provide a finite-sample guarantee for bounded losses over a grid of size m, showing that the excess risk above the target level α is of order √(log(m)/n), where n is the calibration sample size. A matching lower bound shows that this rate is minimax optimal. We also derive refined guarantees under additional structural conditions, including Lipschitz continuity and monotonicity, and extend the analysis to settings with distribution shift via importance weighting. Numerical experiments on synthetic multilabel classification and real object detection data illustrate the practical impact of non-monotonicity. Methods that account for finite-sample deviations achieve more stable risk control than approaches based on monotonicity transformations, while maintaining competitive prediction-set sizes.
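
As a companion to the sketch above, the following is a hedged illustration of the distribution-shift extension via importance weighting. The self-normalized weights and the effective-sample-size deviation heuristic are assumptions made for exposition, not the paper's weighted analysis.

```python
import numpy as np

def select_lambda_shifted(losses, weights, alpha, delta=0.1, B=1.0):
    """Importance-weighted variant of select_lambda for covariate shift.

    weights : (n,) array of likelihood ratios dQ/dP at the calibration
              points, assumed known or estimated separately.
    """
    n, m = losses.shape
    w = weights / weights.sum()      # self-normalized importance weights
    emp_risk = losses.T @ w          # weighted empirical risk per grid value
    # Heavier weights reduce the effective sample size; the Kish formula
    # below is a heuristic stand-in for the paper's weighted deviation bound.
    n_eff = 1.0 / np.sum(w ** 2)
    dev = B * np.sqrt(np.log(2 * m / delta) / (2 * n_eff))
    feasible = np.where(emp_risk + dev <= alpha)[0]
    return feasible[-1] if feasible.size else None
```

When the weights are all equal, n_eff reduces to n and the rule coincides with the unweighted selection above; the more the likelihood ratios concentrate on a few calibration points, the larger the deviation term and the more conservative the certified grid values.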