Bootstrapped Control Limits for Score-Based Concept Drift Control Charts

arXiv stat.ML / 2026/3/24

Key Points

  • The paper addresses concept drift detection by monitoring changes in the mean of a supervised model’s Fisher score vector using a multivariate exponentially weighted moving average (MEWMA).
  • It improves the calibration of MEWMA control limits by introducing a nested bootstrap procedure that uses the entire initial dataset for model fitting, removing the need for a large out-of-sample holdout set.
  • The authors show standard nested bootstrap calibration can underestimate the variability of the monitoring statistic and propose a 0.632-like correction to better account for this bias.
  • The outer bootstrap loop is fully parallelizable, which keeps the approach computationally practical: control-limit setup times are comparable to, or shorter than, those of the existing method.
  • Numerical examples illustrate the advantages of the corrected, bootstrap-calibrated control limits over the existing holdout-based calibration.
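
To make the monitoring statistic in the first point concrete, here is a minimal sketch of a score-based MEWMA. It is not the authors' implementation. It assumes a linear regression model with Gaussian errors, so the score is the gradient of the log-likelihood with respect to the coefficients, and it uses the textbook MEWMA recursion with the asymptotic covariance of the smoothed vector. The names `score_vectors` and `mewma_t2`, the smoothing weight `lam=0.1`, and the simulated drift are all hypothetical choices made for illustration.

```python
import numpy as np

def score_vectors(X, y, beta, sigma2):
    """Per-observation scores: gradient of the Gaussian log-likelihood w.r.t. beta."""
    resid = y - X @ beta
    return X * resid[:, None] / sigma2

def mewma_t2(scores, mean, cov, lam=0.1):
    """Textbook MEWMA: z_t = lam*(s_t - mean) + (1 - lam)*z_{t-1}, T2_t = z_t' Sz^{-1} z_t."""
    # Asymptotic covariance of z_t for i.i.d. in-control scores (an assumption).
    cov_z_inv = np.linalg.inv(lam / (2.0 - lam) * cov)
    z = np.zeros(scores.shape[1])
    t2 = np.empty(len(scores))
    for t, s in enumerate(scores):
        z = lam * (s - mean) + (1.0 - lam) * z
        t2[t] = z @ cov_z_inv @ z
    return t2

rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])

# Initial (in-control) sample and baseline model fit by ordinary least squares.
X_train = rng.normal(size=(2000, 3))
y_train = X_train @ beta_true + rng.normal(size=2000)
beta_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
sigma2_hat = np.mean((y_train - X_train @ beta_hat) ** 2)

# In-sample score mean and covariance (a simplification, see the note below).
scores_train = score_vectors(X_train, y_train, beta_hat, sigma2_hat)
score_mean = scores_train.mean(axis=0)
score_cov = np.cov(scores_train, rowvar=False)

# Monitoring stream: 500 in-control observations, then 500 with a shifted coefficient.
X_new = rng.normal(size=(1000, 3))
beta_stream = np.tile(beta_true, (1000, 1))
beta_stream[500:, 0] += 1.5  # hypothetical concept drift
y_new = np.einsum("ij,ij->i", X_new, beta_stream) + rng.normal(size=1000)

t2 = mewma_t2(score_vectors(X_new, y_new, beta_hat, sigma2_hat), score_mean, score_cov)
print("mean T2 before drift:", t2[:500].mean())
print("mean T2 after drift: ", t2[600:].mean())
```

The sketch estimates the score mean and covariance from the same data used to fit the model and sets no control limit at all. Choosing that limit without a separate holdout sample is precisely the calibration problem the paper addresses.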

Abstract

Monitoring for changes in a predictive relationship represented by a fitted supervised learning model (i.e., concept drift detection) is a widespread problem in modern data-driven applications. A general and powerful Fisher score-based concept drift approach was recently proposed, in which detecting concept drift reduces to detecting changes in the mean of the model's score vector using a multivariate exponentially weighted moving average (MEWMA). To implement the approach, the initial data must be split into two subsets. The first subset serves as the training sample to which the model is fit, and the second subset serves as an out-of-sample test set from which the MEWMA control limit (CL) is determined. In this paper, we retain the same score-based MEWMA monitoring statistic as the existing method and focus instead on improving the computation of the control limit. We develop a novel nested bootstrap procedure for calibrating the CL that allows the entire initial sample to be used for model fitting, thereby yielding a more accurate baseline model while eliminating the need for a large holdout set. The outer bootstrap loop is fully parallelizable, making the method computationally practical, with CL setup times comparable to or faster than the existing method. We show that a standard nested bootstrap substantially underestimates the variability of the monitoring statistic and develop a 0.632-like correction that appropriately accounts for this. We demonstrate the advantages with numerical examples.
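
For background on the term "0.632-like", the classical 0.632 bootstrap estimator of prediction error has the form below. It is shown only to indicate what the name alludes to. The abstract does not state the paper's actual correction, so this should not be read as the authors' formula.

```latex
\widehat{\mathrm{Err}}^{(.632)}
  = 0.368\,\overline{\mathrm{err}} + 0.632\,\widehat{\mathrm{Err}}^{(1)},
\qquad
0.632 \approx 1 - e^{-1} = \lim_{n \to \infty} \left[ 1 - \left( 1 - \tfrac{1}{n} \right)^{n} \right]
```

Here $\overline{\mathrm{err}}$ is the in-sample (resubstitution) error and $\widehat{\mathrm{Err}}^{(1)}$ is the leave-one-out bootstrap error, computed only on observations that do not appear in a given bootstrap sample. The weight $1 - e^{-1}$ is the limiting probability that a particular observation is included in a bootstrap sample of size $n$.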