Robust Continual Unlearning against Knowledge Erosion and Forgetting Reversal

arXiv cs.LG / 4/22/2026

Key Points

  • The paper studies machine unlearning under a realistic setting where unlearning is performed repeatedly rather than just once, revealing new failure modes.
  • It identifies two key phenomena during continual unlearning: knowledge erosion on retain data and forgetting reversal where previously forgotten samples reappear as recognizable.
  • To address these issues, the authors propose SAFER (StAbility-preserving Forgetting with Effective Regularization), designed to keep representations stable for retain data while enforcing negative logit margins for forget data.
  • Experiments indicate that SAFER reduces both knowledge erosion and forgetting reversal, maintaining stable performance across multiple unlearning phases.
  • The work advances practical privacy-focused unlearning by making it robust to the cumulative effects of repeated unlearning rounds, supporting “right to be forgotten” goals in AI.

Abstract

As a means to balance the growth of the AI industry with the need for privacy protection, machine unlearning plays a crucial role in realizing the "right to be forgotten" in artificial intelligence. This technique enables AI systems to remove the influence of specific data while preserving the rest of the learned knowledge. Although it has been actively studied, most existing unlearning methods assume that unlearning is performed only once. In this work, we evaluate existing unlearning algorithms in a more realistic scenario where unlearning is conducted repeatedly, and in this setting, we identify two critical phenomena: (1) Knowledge Erosion, where the accuracy on retain data progressively degrades over unlearning phases, and (2) Forgetting Reversal, where previously forgotten samples become recognizable again in later phases. To address these challenges, we propose SAFER (StAbility-preserving Forgetting with Effective Regularization), a continual unlearning framework that maintains representation stability for retain data while enforcing negative logit margins for forget data. Extensive experiments show that SAFER mitigates not only knowledge erosion but also forgetting reversal, achieving stable performance across multiple unlearning phases.
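The abstract describes SAFER's two ingredients only at a high level: a stability term that keeps retain-data representations close to their pre-unlearning state, and a margin term that drives the forget-class logit below the other classes. The paper's exact objective is not given here, so the following is a minimal sketch of what such a loss could look like; the function name `safer_style_loss`, the hinge formulation, and the weighting parameter `lam` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def safer_style_loss(logits_f, labels_f, feats_r, feats_r_ref,
                     margin=1.0, lam=1.0):
    """Illustrative SAFER-style objective (assumed form, not the paper's).

    logits_f:    (N, C) classifier logits on forget samples
    labels_f:    (N,)   forgotten-class labels for those samples
    feats_r:     (M, D) current representations of retain samples
    feats_r_ref: (M, D) frozen pre-unlearning representations of the same samples
    """
    n = logits_f.shape[0]
    # Logit of the class to be forgotten, per sample.
    true = logits_f[np.arange(n), labels_f]
    # Best competing logit: mask out the forget class, then take the max.
    masked = logits_f.copy()
    masked[np.arange(n), labels_f] = -np.inf
    other = masked.max(axis=1)
    # Hinge on the logit margin: zero once the forget class trails the best
    # other class by at least `margin` (i.e., a negative logit margin holds).
    forget_loss = np.maximum(true - other + margin, 0.0).mean()
    # Stability term: L2 penalty keeping retain representations near their
    # pre-unlearning reference, guarding against knowledge erosion.
    stability_loss = np.mean((feats_r - feats_r_ref) ** 2)
    return forget_loss + lam * stability_loss
```

Under this reading, the margin term is what prevents forgetting reversal (the forget-class logit is pushed strictly below the competitors, not just perturbed), while the stability term anchors retain-data features across repeated unlearning phases so that accuracy on retained knowledge does not erode.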