Understanding Uncertainty Sampling via Equivalent Loss

arXiv stat.ML / 4/8/2026


Key Points

  • The paper revisits uncertainty sampling in active learning, arguing that its common practice is largely heuristic because there is no agreed-upon, loss-consistent definition of “uncertainty.”
  • It introduces an “equivalent loss” framework that ties a chosen uncertainty measure to the original task loss, showing that uncertainty sampling effectively optimizes this derived objective.
  • The authors evaluate existing uncertainty measures against two properties, the surrogate property and loss convexity, to clarify when a measure is theoretically well-aligned with the underlying learning goal.
  • Under preserved convexity, the paper provides sample complexity results for the equivalent loss and converts these into binary-loss guarantees via a surrogate link.
  • It further proves asymptotic superiority of uncertainty sampling over passive learning under mild conditions, and outlines potential extensions to pool-based, multi-class, and regression settings.
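To make the procedure in the key points concrete, here is a minimal sketch of pool-based uncertainty sampling for binary classification. The logistic model, the least-confidence uncertainty measure, and all function names are illustrative choices, not the paper's specific construction; the paper's point is that whichever measure you plug in here implicitly defines an equivalent loss that the loop optimizes.

```python
import numpy as np

def train_logistic(X, y, lr=0.1, epochs=200):
    # Fit sigmoid(w.x + b) by gradient descent on the logistic loss.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y                     # gradient of log loss w.r.t. logits
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def least_confidence(p):
    # One common uncertainty measure: 1 - max(p, 1 - p), maximal at p = 0.5.
    return 1.0 - np.maximum(p, 1.0 - p)

def uncertainty_sampling(X_pool, y_oracle, n_init=4, n_queries=10, seed=0):
    # Pool-based loop: start from a few random labels, then repeatedly
    # retrain and query the pool point the current model is least sure about.
    rng = np.random.default_rng(seed)
    labeled = [int(i) for i in rng.choice(len(X_pool), n_init, replace=False)]
    for _ in range(n_queries):
        w, b = train_logistic(X_pool[labeled], y_oracle[labeled])
        p = 1.0 / (1.0 + np.exp(-(X_pool @ w + b)))
        scores = least_confidence(p)
        scores[labeled] = -np.inf     # never re-query an already-labeled point
        labeled.append(int(np.argmax(scores)))
    # Final retrain on everything queried so far.
    w, b = train_logistic(X_pool[labeled], y_oracle[labeled])
    return labeled, (w, b)
```

Swapping `least_confidence` for another measure (e.g. margin or entropy) changes the equivalent loss being optimized, which is exactly the degree of freedom the paper's framework analyzes.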

Abstract

Uncertainty sampling is a prevalent active learning algorithm that sequentially queries annotations for the data samples the current prediction model is most uncertain about. However, its use has been largely heuristic: there is no consensus on the proper definition of "uncertainty" for a specific task under a specific loss, nor a theoretical guarantee that prescribes a standard protocol for implementing the algorithm. In this work, we systematically examine uncertainty sampling algorithms for the binary classification problem via a notion of equivalent loss, which depends on the uncertainty measure used and the original loss function, and we establish that an uncertainty sampling algorithm is in effect optimizing this equivalent loss. This perspective verifies the properness of existing uncertainty measures from two aspects: the surrogate property and loss convexity. When convexity is preserved, we give a sample complexity result for the equivalent loss and then translate it into a binary-loss guarantee via the surrogate link function. Through this approach, we prove the asymptotic superiority of uncertainty sampling over passive learning under mild conditions. We also discuss potential extensions, including the pool-based setting and generalizations to multi-class classification and regression problems.