Neyman-Pearson multiclass classification under label noise via empirical likelihood
arXiv stat.ML / 2026/3/24
Key points
- The paper studies Neyman-Pearson multiclass classification (NPMC) when training labels are corrupted and only noisy labels are observed, a scenario largely unexplored in prior NPMC work.
- It proposes an empirical-likelihood (EL) approach that links noisy and true label distributions using an exponential tilting density-ratio model, enabling recovery of clean class proportions and posterior probabilities for error control.
- The authors prove statistical guarantees for the maximum EL estimators, including consistency, asymptotic normality, and optimal convergence rates.
- They show that, under mild conditions, the resulting classifier achieves asymptotic Neyman-Pearson-style oracle inequalities with respect to the unknown true labels.
- An EM algorithm is presented for computation, and experiments indicate performance close to an oracle trained on clean labels and notably better than methods that ignore label noise.
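The paper's maximum empirical-likelihood estimator is more involved, but the core idea of recovering clean class proportions from noisy labels can be illustrated with a simplified EM sketch. The sketch below is an assumption-laden toy, not the authors' method: it assumes the label-flip matrix `T` (with `T[i, j] = P(noisy = j | true = i)`) is known, whereas the paper estimates the noisy-to-clean link via an exponential tilting density-ratio model. The function name `em_clean_proportions` and all parameters are illustrative.

```python
# Toy EM for recovering clean class proportions from noisy-label counts,
# ASSUMING the label-flip matrix T is known (unlike the paper, which
# estimates the noisy/true link via empirical likelihood).
import numpy as np

def em_clean_proportions(noisy_counts, T, n_iter=200):
    """Estimate true class proportions pi from noisy-label counts and T."""
    noisy_counts = np.asarray(noisy_counts, dtype=float)
    K = T.shape[0]
    pi = np.full(K, 1.0 / K)  # uniform initialisation
    for _ in range(n_iter):
        # E-step: responsibility of true class i for observed noisy label j
        joint = pi[:, None] * T                      # shape (K_true, K_noisy)
        resp = joint / joint.sum(axis=0, keepdims=True)
        # M-step: reweight noisy counts by responsibilities, renormalise
        pi = resp @ noisy_counts
        pi /= pi.sum()
    return pi

# Simulated check with hypothetical proportions and a mild flip matrix
true_pi = np.array([0.5, 0.3, 0.2])
T = np.array([[0.80, 0.10, 0.10],
              [0.10, 0.85, 0.05],
              [0.05, 0.05, 0.90]])
noisy_freq = true_pi @ T              # population noisy-label frequencies
est = em_clean_proportions(noisy_freq * 10_000, T)
print(np.round(est, 3))
```

At the population level the EM fixed point solves `pi @ T = noisy_freq`, so with an invertible `T` the estimate recovers the clean proportions; the paper's EL approach additionally handles the case where the link itself must be learned.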

