Robust Learning with Optimal Error

arXiv cs.LG / 4/6/2026

Key Points

  • The paper shows that, in learning under adversarial noise, using randomized hypotheses can substantially improve the optimal error rate over what is achievable with deterministic hypotheses.
  • For η-rate malicious noise, it brings the optimal error down to (1/2)·η/(1−η), a factor-1/2 improvement over the optimal error of deterministic hypotheses, resolving an open question of Cesa-Bianchi et al. (1999).
  • For η-rate nasty noise, it achieves an optimal error of (3/2)·η for distribution-independent learning and η in the fixed-distribution setting, improving on the 2η of deterministic hypotheses and closing a gap pointed out by Bshouty et al. (2002).
  • For η-rate agnostic noise and the closely related nasty classification noise model, it likewise shows an optimal error of η, improving on the 2η of deterministic hypotheses. Sample complexity is linear in the VC dimension and polynomial in the inverse excess error, and the learners (except for fixed-distribution nasty noise) are also time efficient given access to an empirical risk minimization (ERM) oracle.

Abstract

We construct algorithms with optimal error for learning with adversarial noise. The overarching theme of this work is that the use of randomized hypotheses can substantially improve upon the best error rates achievable with deterministic hypotheses.

  • For η-rate malicious noise, we show the optimal error is (1/2)·η/(1−η), improving on the optimal error of deterministic hypotheses by a factor of 1/2. This answers an open question of Cesa-Bianchi et al. (JACM 1999), who showed randomness can improve error by a factor of 6/7.
  • For η-rate nasty noise, we show the optimal error is (3/2)·η for distribution-independent learners and η for fixed-distribution learners, both improving upon the optimal 2η error of deterministic hypotheses. This closes a gap first noted by Bshouty et al. (Theoretical Computer Science 2002) when they introduced nasty noise, and reiterated in the recent works of Klivans et al. (NeurIPS 2025) and Blanc et al. (SODA 2026).
  • For η-rate agnostic noise and the closely related nasty classification noise model, we show the optimal error is η, improving upon the optimal 2η error of deterministic hypotheses.

All of our learners have sample complexity linear in the VC dimension of the concept class and polynomial in the inverse excess error. All except the fixed-distribution nasty noise learner are time efficient given access to an oracle for empirical risk minimization.
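To make the stated gaps concrete, here is a minimal sketch (ours, not from the paper) that tabulates the randomized vs. deterministic optimal error rates quoted above for a sample noise rate. The function names and the choice η = 0.1 are illustrative assumptions; the formulas themselves are taken directly from the abstract.

```python
# Illustrative only: numeric comparison of the optimal error rates quoted in the
# abstract. Function names and the example noise rate are our own; the formulas
# are the ones stated in the paper's abstract.

def malicious_randomized(eta: float) -> float:
    """Optimal error with randomized hypotheses under eta-rate malicious noise."""
    return 0.5 * eta / (1.0 - eta)

def malicious_deterministic(eta: float) -> float:
    """Deterministic optimal error under malicious noise (factor-2 worse, per the abstract)."""
    return eta / (1.0 - eta)

def nasty_randomized(eta: float, fixed_distribution: bool) -> float:
    """Optimal error under nasty noise: (3/2)*eta distribution-independent, eta fixed-distribution."""
    return eta if fixed_distribution else 1.5 * eta

def nasty_deterministic(eta: float) -> float:
    """Deterministic optimal error under nasty noise."""
    return 2.0 * eta

def agnostic_randomized(eta: float) -> float:
    """Optimal error under eta-rate agnostic / nasty classification noise."""
    return eta

def agnostic_deterministic(eta: float) -> float:
    """Deterministic optimal error under agnostic noise."""
    return 2.0 * eta

if __name__ == "__main__":
    eta = 0.1  # example noise rate (an assumption, not from the paper)
    rows = [
        ("malicious", malicious_randomized(eta), malicious_deterministic(eta)),
        ("nasty (dist.-indep.)", nasty_randomized(eta, False), nasty_deterministic(eta)),
        ("nasty (fixed dist.)", nasty_randomized(eta, True), nasty_deterministic(eta)),
        ("agnostic", agnostic_randomized(eta), agnostic_deterministic(eta)),
    ]
    print(f"eta = {eta}")
    for model, rand_err, det_err in rows:
        print(f"{model:>22}: randomized {rand_err:.4f} vs deterministic {det_err:.4f}")
```

At η = 0.1 this prints, for example, 0.0556 vs 0.1111 for malicious noise, so the randomized learner halves the error exactly as the factor-1/2 claim promises.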