AI Navigate

Variational Rectification Inference for Learning with Noisy Labels

arXiv cs.LG / 3/19/2026


Key Points

  • We propose Variational Rectification Inference (VRI) to adapt loss rectification for learning with noisy labels within a meta-learning framework.
  • VRI treats the rectifying vector as a latent variable in a hierarchical Bayesian model, enabling robust loss correction for noisy samples through extra randomness regularization.
  • An amortized meta-network approximates the conditional posterior of the rectifying vector, preventing collapse to a Dirac delta and improving generalization.
  • The framework uses a smooth prior and bi-level optimization to efficiently meta-learn rectification with a set of clean meta-data.
  • Empirical results show improved robustness to label noise, including open-set noise, validating the effectiveness of VRI.
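The key points above describe rectifying each sample's loss with a latent vector inferred variationally. The following is a minimal NumPy sketch of that idea under stated assumptions: `mu` and `logvar` stand in for the output of the paper's amortized meta-network, the rectifying vector is sampled with the reparameterization trick, and a KL term to a standard-normal prior supplies the "extra randomness regularization" that keeps the posterior from collapsing to a Dirac delta. The function names and the sigmoid squashing are illustrative choices, not the authors' exact parameterization.

```python
import numpy as np

def rectified_objective(losses, mu, logvar, eps, beta=0.1):
    """ELBO-style objective: rectified training loss plus a KL regularizer.

    In the paper's setup, `mu` and `logvar` would be produced by an amortized
    meta-network conditioned on the sample; here they are passed in directly,
    and `eps` is the reparameterization noise (hypothetical simplification).
    """
    v = mu + np.exp(0.5 * logvar) * eps        # latent rectifying vector
    weights = 1.0 / (1.0 + np.exp(-v))         # squash to (0, 1)
    rectified = np.mean(weights * losses)      # down-weight suspect samples
    # KL(q(v|x) || N(0, I)): the variational term that prevents the posterior
    # from collapsing to a Dirac delta function.
    kl = 0.5 * np.mean(np.exp(logvar) + mu**2 - 1.0 - logvar)
    return rectified + beta * kl
```

With `mu = logvar = 0` the KL term vanishes and each loss is simply halved; driving `mu` strongly negative pushes the rectifying weights toward zero, which is how a noisy sample's loss would be suppressed.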

Abstract

Label noise is widely observed in real-world datasets. To mitigate the negative impact of overfitting to label noise on deep models, effective strategies (e.g., re-weighting or loss rectification) have been broadly applied in prevailing approaches, typically learned under a meta-learning scenario. Despite the robustness to noise achieved by probabilistic meta-learning models, they usually suffer from model collapse, which degrades generalization performance. In this paper, we propose variational rectification inference (VRI), which formulates adaptive loss rectification as an amortized variational inference problem and derives the evidence lower bound under the meta-learning framework. Specifically, VRI is constructed as a hierarchical Bayesian model that treats the rectifying vector as a latent variable; this latent vector rectifies the loss of a noisy sample with extra randomness regularization and is therefore more robust to label noise. To infer the rectifying vector, we approximate its conditional posterior with an amortized meta-network. The variational term in VRI allows the conditional posterior to be estimated accurately and prevents it from collapsing to a Dirac delta function, which significantly improves generalization performance. The meta-network and prior network adhere to a smoothness assumption, enabling the generation of reliable rectification vectors. Given a set of clean meta-data, VRI can be efficiently meta-learned via bi-level optimization. Furthermore, theoretical analysis guarantees that the meta-network can be learned efficiently with our algorithm. Comprehensive comparison experiments and analyses validate its effectiveness for robust learning with noisy labels, particularly in the presence of open-set noise.
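The abstract's bi-level optimization can be illustrated on a toy problem: an inner loop that updates the model on a rectified noisy-training loss, and an outer loop that tunes a rectification parameter on a small clean meta-set. The sketch below is a deliberately simplified stand-in, not the authors' algorithm: the meta-network is replaced by a single hypothetical scalar meta-parameter `phi`, the rectifying weight is the loss-conditioned sigmoid `σ(phi − loss)`, and the outer gradient is taken by finite differences rather than backpropagating through the inner step.

```python
import numpy as np

# Toy bi-level loop: 1-D linear regression with a quarter of the labels
# corrupted, plus a small clean meta-set (all values are illustrative).
rng = np.random.default_rng(1)
x_train = rng.standard_normal(32)
y_train = 2.0 * x_train                 # ground-truth slope is 2
y_train[:8] += 5.0                      # inject label noise
x_meta = rng.standard_normal(8)
y_meta = 2.0 * x_meta                   # clean meta-data

lr, meta_lr = 0.05, 0.5
w, phi = 0.0, 5.0                       # model parameter / meta-parameter

def inner_step(w, phi):
    # Inner loop: one gradient step on the rectified training loss, holding
    # the rectifying weights fixed (a common stop-gradient simplification).
    per = (w * x_train - y_train) ** 2
    z = np.clip(phi - per, -60.0, 60.0)            # avoid exp overflow
    weights = 1.0 / (1.0 + np.exp(-z))             # down-weight high-loss samples
    grad = np.mean(weights * 2.0 * x_train * (w * x_train - y_train))
    return w - lr * grad

def meta_loss(phi, w):
    # Outer objective: clean meta-data loss after one rectified inner step.
    w1 = inner_step(w, phi)
    return np.mean((w1 * x_meta - y_meta) ** 2)

for _ in range(300):
    # Outer loop: finite-difference gradient of the meta-loss w.r.t. phi.
    g_phi = (meta_loss(phi + 1e-3, w) - meta_loss(phi - 1e-3, w)) / 2e-3
    phi -= meta_lr * g_phi
    w = inner_step(w, phi)
```

Because the corrupted samples incur large losses once the model fits the clean majority, their rectifying weights collapse toward zero and `w` recovers the true slope despite the noise; this mirrors, in miniature, the role the meta-learned rectification plays in the paper.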