Unified Precision-Guaranteed Stopping Rules for Contextual Learning

arXiv stat.ML / 4/10/2026


Key Points

  • The paper studies when to stop data collection in contextual learning while still guaranteeing the learned decision policy meets specified precision targets under unknown sampling variances.
  • It introduces unified stopping rules for two accuracy criteria—context-wise precision and aggregate policy-value precision—covering both unstructured and structured linear settings.
  • The method uses generalized likelihood ratio (GLR) statistics for pairwise action comparisons and calibrates sequential decision boundaries with new time-uniform deviation inequalities.
  • Under a Gaussian sampling model, the authors prove finite-sample precision guarantees for both criteria and show via experiments that the rules can reach target accuracy using substantially fewer samples than benchmark approaches.
  • The framework is positioned as broadly applicable to personalized/operations-style decision problems using diverse data sources (historical data, simulations, and real systems) to reduce unnecessary sampling without sacrificing decision quality.
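To make the two accuracy criteria concrete, here is one plausible formalization; the symbols \(\pi\), \(\mu\), \(\epsilon\), and \(\delta\) are illustrative and may differ from the paper's exact notation.

```latex
% Hedged formalization, not the paper's verbatim definitions.
% \pi(x): learned policy; \mu(x,a): mean reward of action a in context x.

% Context-wise precision: in every context, the selected action is
% within \epsilon of the best action, with probability at least 1-\delta:
\Pr\Big[\,\forall x:\ \mu\big(x,\pi(x)\big) \ge \max_a \mu(x,a) - \epsilon \,\Big] \ge 1-\delta.

% Aggregate policy-value precision: the policy's expected value over the
% context distribution is within \epsilon of the optimal value:
\Pr\Big[\, \mathbb{E}_x\big[\mu\big(x,\pi(x)\big)\big] \ge \mathbb{E}_x\big[\max_a \mu(x,a)\big] - \epsilon \,\Big] \ge 1-\delta.
```

The context-wise criterion is the stricter of the two: it demands near-optimality in each context, whereas the aggregate criterion only constrains the average over contexts.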

Abstract

Contextual learning seeks to learn a decision policy that maps an individual's characteristics to an action through data collection. In operations management, such data may come from various sources, and a central question is when data collection can stop while still guaranteeing that the learned policy is sufficiently accurate. We study this question under two precision criteria: a context-wise criterion and an aggregate policy-value criterion. We develop unified stopping rules for contextual learning with unknown sampling variances in both unstructured and structured linear settings. Our approach is based on generalized likelihood ratio (GLR) statistics for pairwise action comparisons. To calibrate the corresponding sequential boundaries, we derive new time-uniform deviation inequalities that directly control the self-normalized GLR evidence and thus avoid the conservativeness caused by decoupling mean and variance uncertainty. Under the Gaussian sampling model, we establish finite-sample precision guarantees for both criteria. Numerical experiments on synthetic instances and two case studies demonstrate that the proposed stopping rules achieve the target precision with substantially fewer samples than benchmark methods. The proposed framework provides a practical way to determine when enough information has been collected in personalized decision problems. It applies across multiple data-collection environments, including historical datasets, simulation models, and real systems, enabling practitioners to reduce unnecessary sampling while maintaining a desired level of decision quality.
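The abstract describes stopping rules built from generalized likelihood ratio (GLR) statistics for pairwise action comparisons under a Gaussian model with unknown variance. As a rough illustration only, the sketch below computes a closed-form log-GLR for comparing two actions under a *shared* unknown variance (a simplification; the paper's statistic, arm-specific variances, and calibrated time-uniform boundaries differ), and stops once the evidence clears a heuristic `log((1 + log n)/δ)`-style threshold, which is not the paper's boundary.

```python
import numpy as np


def glr_two_arms(x_a, x_b):
    """Log generalized likelihood ratio for H0: mu_a == mu_b under a
    Gaussian model with a shared unknown variance (a simplification)."""
    x_a, x_b = np.asarray(x_a, float), np.asarray(x_b, float)
    n = len(x_a) + len(x_b)
    # Unconstrained MLE: separate means, variance pooled around them.
    var1 = (np.sum((x_a - x_a.mean()) ** 2)
            + np.sum((x_b - x_b.mean()) ** 2)) / n
    # Constrained MLE under H0: a single common mean for both arms.
    m0 = (np.sum(x_a) + np.sum(x_b)) / n
    var0 = (np.sum((x_a - m0) ** 2) + np.sum((x_b - m0) ** 2)) / n
    # var0 >= var1, so the statistic is nonnegative; large values are
    # evidence that the two means differ.
    return 0.5 * n * np.log(var0 / var1)


def should_stop(x_a, x_b, delta=0.05):
    """Illustrative stopping rule: stop (and recommend the empirically
    better arm) once the GLR evidence exceeds a heuristic threshold.
    This threshold is NOT the paper's calibrated sequential boundary."""
    n = len(x_a) + len(x_b)
    return glr_two_arms(x_a, x_b) > np.log((1 + np.log(n)) / delta)
```

Because `var0 >= var1` always holds, the statistic is nonnegative; well-separated arms drive it up roughly linearly in the sample size, so the rule stops quickly, while near-equal arms keep it small. In the contextual setting, a rule of this flavor would be applied per context (or per direction in the structured linear case) with boundaries calibrated by the paper's time-uniform deviation inequalities.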