Between Resolution Collapse and Variance Inflation: Weighted Conformal Anomaly Detection in Low-Data Regimes

arXiv stat.ML / 3/25/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper argues that standard conformal anomaly detection can be unreliable under real-world distribution shifts, motivating a weighted conformal framework for low-data, non-stationary regimes.
  • It identifies a critical trade-off introduced by weighting: as weights concentrate on relevant calibration points (reducing effective sample size), p-values can become overly conservative, while smoothing to fix discreteness can inflate variance and hide anomalies.
  • The authors propose a continuous inference relaxation that uses continuous weighted kernel density estimation to decouple local adaptation from tail resolution.
  • By relaxing finite-sample exactness to asymptotic validity, the method removes Monte Carlo variability and recovers statistical power lost to discretization.
  • Empirical results suggest improved anomaly discovery (including where discrete baselines produce no detections) and higher statistical power while preserving valid marginal error control in practice.
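The trade-off in the second point can be made concrete with a small sketch. Below is a minimal, illustrative implementation (not the paper's exact procedure) of a weighted conformal p-value, its smoothed variant, and the Kish effective sample size: when the normalized weights concentrate on a few calibration points, the effective sample size shrinks and the smallest attainable p-value (the test point's own weight share) grows, which is what makes the unsmoothed p-value conservative; the smoothed variant fixes the discreteness but injects randomness through `u`.

```python
import numpy as np

def weighted_conformal_pvalue(cal_scores, cal_weights, test_score,
                              test_weight, smooth=False, rng=None):
    """Weighted conformal p-value for a test nonconformity score.

    Weights are normalized over the calibration points plus the test
    point; the p-value is the normalized weight mass on calibration
    scores >= the test score, plus the test point's own weight share.
    """
    s = np.asarray(cal_scores, float)
    w = np.concatenate([np.asarray(cal_weights, float), [test_weight]])
    w = w / w.sum()
    if not smooth:
        # Conservative version: ties and the test point count fully.
        return w[:-1][s >= test_score].sum() + w[-1]
    # Smoothed version: break ties with a uniform random variable u.
    # This restores exactness on average but adds conditional variance.
    rng = rng or np.random.default_rng()
    u = rng.uniform()
    return (w[:-1][s > test_score].sum()
            + u * (w[:-1][s == test_score].sum() + w[-1]))

def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum w^2.

    Concentrated weights drive this toward 1, inflating the minimum
    attainable (unsmoothed) p-value.
    """
    w = np.asarray(weights, float)
    return w.sum() ** 2 / (w * w).sum()
```

With uniform weights over `n` calibration points this reduces to the standard conformal p-value with floor `1/(n+1)`; with one dominant weight the floor approaches that point's weight share instead.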

Abstract

Standard conformal anomaly detection provides marginal finite-sample guarantees under the assumption of exchangeability. However, real-world data often exhibit distribution shifts, necessitating a weighted conformal approach to adapt to local non-stationarity. We show that this adaptation induces a critical trade-off between the minimum attainable p-value and its stability. As importance weights localize to relevant calibration instances, the effective sample size decreases. This can render standard conformal p-values overly conservative for effective error control, while the smoothing technique used to mitigate this issue introduces conditional variance, potentially masking anomalies. We propose a continuous inference relaxation that resolves this dilemma by decoupling local adaptation from tail resolution via continuous weighted kernel density estimation. While relaxing finite-sample exactness to asymptotic validity, our method eliminates Monte Carlo variability and recovers the statistical power lost to discretization. Empirical evaluations confirm that our approach not only restores detection capability where discrete baselines yield zero discoveries, but also outperforms standard methods in statistical power while maintaining valid marginal error control in practice.
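To illustrate the continuous relaxation, here is a minimal sketch of a p-value derived from a weighted Gaussian kernel density estimate of the calibration scores. This is an assumption-laden toy, not the authors' method: the bandwidth rule (a Silverman-style rule scaled by the Kish effective sample size) is our own choice for the example. The point it demonstrates is that the resulting p-value is deterministic (no tie-breaking randomization, hence no Monte Carlo variability) and can fall below the discrete floor `1/(n+1)`, recovering tail resolution.

```python
import numpy as np
from math import erf, sqrt

def kde_tail_pvalue(cal_scores, cal_weights, test_score, bandwidth=None):
    """Continuous p-value: upper-tail mass of a weighted Gaussian KDE.

    The weighted KDE is a Gaussian mixture centered at the calibration
    scores, so its survival function at t is sum_i w_i * Phi((s_i - t)/h).
    """
    s = np.asarray(cal_scores, float)
    w = np.asarray(cal_weights, float)
    w = w / w.sum()
    if bandwidth is None:
        # Illustrative bandwidth: weighted Silverman rule with the Kish
        # effective sample size standing in for n (an assumption).
        ess = 1.0 / np.sum(w ** 2)
        mu = np.sum(w * s)
        sd = np.sqrt(np.sum(w * (s - mu) ** 2))
        bandwidth = 1.06 * max(sd, 1e-12) * ess ** (-0.2)
    z = (s - test_score) / bandwidth
    # Standard normal CDF via erf; Phi(z) = (1 + erf(z / sqrt(2))) / 2.
    phi = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))
    return float(np.sum(w * phi))
```

For a far-out test score this returns a p-value well below the weight share of the test point, whereas the discrete weighted p-value is hard-floored there; the price, as the abstract notes, is that finite-sample exactness is relaxed to asymptotic validity.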