Efficient Unlearning through Maximizing Relearning Convergence Delay

arXiv cs.LG / 4/13/2026


Key Points

  • The paper argues that existing machine unlearning evaluation focuses only on prediction changes, and proposes a new metric—relearning convergence delay—to better measure how well a model’s internal understanding of a forgotten dataset has truly been removed.
  • Relearning convergence delay is designed to capture discrepancies in both weight space and prediction space, enabling more comprehensive risk assessment of whether forgotten data can be recovered after unlearning.
  • The authors introduce the Influence Eliminating Unlearning framework, which removes the forgetting set’s influence by degrading performance on that set while using weight decay and noise injection to preserve accuracy on the retaining set.
  • Experiments across both classification and generative unlearning tasks show improved performance and stronger resistance to relearning than prior approaches, under both established metrics and the new one.
  • The work also includes theoretical guarantees such as exponential convergence and upper bounds, supporting the method’s effectiveness beyond empirical results.
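The intuition behind the metric can be shown on a toy example: if unlearning only masks predictions while the weights stay near the forgotten solution, fine-tuning on the forgotten data reconverges almost immediately, whereas genuinely removed influence shows up as a longer delay. A minimal sketch on a 1-D least-squares "model" (the scalar model, function name, and thresholds are our illustration, not the paper's definition):

```python
# Hypothetical sketch of relearning convergence delay: count gradient
# steps needed to re-fit the forgotten target after unlearning.

def relearn_steps(w, forget_target, lr=0.1, tol=1e-3, max_steps=10_000):
    """Fine-tune scalar weight w toward forget_target; return the number
    of steps until the squared error on the forgotten data drops below tol."""
    for step in range(max_steps):
        if (w - forget_target) ** 2 < tol:
            return step
        w -= lr * 2.0 * (w - forget_target)  # gradient of (w - target)^2
    return max_steps

# A model whose weights barely moved from the forgotten solution
# relearns almost instantly; one pushed far away in weight space
# shows a much larger delay -- the signal the metric is after.
shallow_unlearned = 1.1   # weights still near forget_target = 1.0
deep_unlearned = -5.0     # weights pushed far from forget_target
print(relearn_steps(shallow_unlearned, 1.0))  # small delay
print(relearn_steps(deep_unlearned, 1.0))     # much larger delay
```

A prediction-space metric alone would treat these two models the same if their outputs on the forgotten data matched; the delay separates them by probing weight space as well.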

Abstract

Machine unlearning poses the challenge of removing mislabeled, contaminated, or otherwise problematic data from a pretrained model. Current unlearning approaches and evaluation metrics focus solely on model predictions, which limits insight into what the model has truly retained about the data. To address this issue, we introduce a new metric, relearning convergence delay, which captures changes in both weight space and prediction space, providing a more comprehensive assessment of the model's understanding of the forgotten dataset. The metric can be used to assess the risk of forgotten data being recovered from the unlearned model. Building on it, we propose the Influence Eliminating Unlearning framework, which removes the influence of the forgetting set by degrading the model's performance on it, applying weight decay, and injecting noise into the model's weights, while maintaining accuracy on the retaining set. Extensive experiments show that our method outperforms existing approaches under both established metrics and the proposed relearning convergence delay, approaching ideal unlearning performance. We provide theoretical guarantees, including exponential convergence and upper bounds, as well as empirical evidence of strong retention and resistance to relearning in both classification and generative unlearning tasks.
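The framework's ingredients (degrading performance on the forgetting set, preserving the retaining set, weight decay, noise injection) can be sketched on a toy scalar model. The update rule, constants, and the loss floor on the forget set below are our simplifications for illustration, not the paper's exact algorithm:

```python
import random

random.seed(0)

def sq_grad(w, target):
    """Gradient of the squared loss (w - target)^2."""
    return 2.0 * (w - target)

def ieu_step(w, forget_t, retain_t, lr=0.05, wd=0.005, noise_std=0.01):
    """One hypothetical Influence-Eliminating-style update (our sketch)."""
    if (w - forget_t) ** 2 < 1.0:        # degrade forget loss up to a floor
        w += lr * sq_grad(w, forget_t)   # gradient *ascent* on forget set
    w -= lr * sq_grad(w, retain_t)       # gradient descent on retain set
    w *= (1.0 - wd)                      # weight decay
    w += random.gauss(0.0, noise_std)    # noise injection into the weights
    return w

w = 0.9  # "pretrained" close to the forget target 1.0
for _ in range(200):
    w = ieu_step(w, forget_t=1.0, retain_t=-1.0)
# The weights end up near the retain target and far from the forget
# target, so relearning the forgotten data takes many steps.
```

The weight decay and noise terms are what push the weights away from the forgotten solution in weight space, rather than merely masking predictions on the forgetting set.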