Neyman-Pearson multiclass classification under label noise via empirical likelihood
arXiv stat.ML / 3/24/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper studies Neyman-Pearson multiclass classification (NPMC) when training labels are corrupted and only noisy labels are observed, a scenario largely unexplored in prior NPMC work.
- It proposes an empirical-likelihood (EL) approach that links noisy and true label distributions using an exponential tilting density-ratio model, enabling recovery of clean class proportions and posterior probabilities for error control.
- The authors prove statistical guarantees for the maximum EL estimators, including consistency, asymptotic normality, and optimal convergence rates.
- They show that, under mild conditions, the resulting classifier achieves asymptotic Neyman-Pearson-style oracle inequalities with respect to the unknown true labels.
- An EM algorithm is presented for computation, and experiments indicate performance close to an oracle trained on clean labels and notably better than methods that ignore label noise.
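The paper's empirical-likelihood estimator is not reproduced here, but the core idea of recovering clean class proportions from noisy labels can be illustrated with a simpler, standard EM sketch. The setup below is hypothetical: it assumes a *known* label-noise (confusion) matrix `T` linking true and noisy labels, whereas the paper instead models that link with an exponential-tilting density-ratio model estimated by empirical likelihood.

```python
import numpy as np

def em_clean_proportions(noisy_counts, T, n_iter=500):
    """Toy EM: recover clean class proportions pi from noisy-label counts.

    noisy_counts[j] : number of samples observed with noisy label j.
    T[k, j]         : assumed-known probability that true class k is
                      observed as noisy label j (rows sum to 1).
    Illustrative sketch only, not the paper's EL-based estimator.
    """
    K = T.shape[0]
    pi = np.full(K, 1.0 / K)            # uniform initialization
    n = np.asarray(noisy_counts, dtype=float)
    N = n.sum()
    for _ in range(n_iter):
        # E-step: posterior of the true class given each noisy label,
        # resp[k, j] = pi_k T[k, j] / sum_k' pi_k' T[k', j]
        joint = pi[:, None] * T
        resp = joint / joint.sum(axis=0, keepdims=True)
        # M-step: re-estimate proportions from expected class counts
        pi = resp @ n / N
    return pi

# Hypothetical example: 3 classes, diagonally dominant noise matrix
pi_true = np.array([0.6, 0.3, 0.1])
T = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
noisy_counts = 10_000 * (pi_true @ T)   # expected noisy-label counts
pi_hat = em_clean_proportions(noisy_counts, T)
```

With exact expected counts and an invertible noise matrix, the clean proportions are identifiable and the EM iterates converge to them; the paper's contribution is handling the harder case where the noise mechanism itself must be estimated, with provable rates and NP-style error control.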