Does Machine Unlearning Preserve Clinical Safety? A Risk Analysis for Medical Image Classification

arXiv cs.AI / 4/28/2026


Key Points

  • The paper analyzes how machine unlearning, the selective removal of training data from deployed models, can impact clinical safety in medical image classification, rather than evaluating only privacy or efficiency metrics.
  • It finds that common unlearning approaches (Fine-Tuning, Random Labeling, and SalUn) can degrade test performance and increase false-negative rates, potentially raising clinical risk.
  • To address this, the authors introduce SalUn-CRA (Clinical Risk-Aware), a modification of SalUn that uses entropy-based forgetting for malignant samples in the “forget” set, so the model does not learn harmful benign associations.
  • Experiments on DermaMNIST and PathMNIST with 20% and 50% training-data removal, evaluated with Global Risk metrics that assign asymmetric error costs, show that SalUn-CRA can achieve clinical risk lower than or comparable to that of full retraining while maintaining unlearning effectiveness.
  • The work argues that clinically asymmetric error costs must be incorporated into unlearning validation for medical AI systems to ensure patient safety and regulatory compliance.
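The asymmetric-cost evaluation described in the points above can be illustrated with a minimal sketch. The `global_risk` function name, the cost weights, and the error counts below are illustrative assumptions, not the paper's exact Global Risk formulation:

```python
def global_risk(fn, fp, n, c_fn=5.0, c_fp=1.0):
    """Weighted clinical risk over n test samples: false negatives
    (missed malignancies) are penalized more heavily than false
    positives. Cost values are illustrative, not the paper's settings."""
    return (c_fn * fn + c_fp * fp) / n

# Under asymmetric costs, a model with fewer false negatives scores
# lower risk even though it makes more false positives.
risk_a = global_risk(fn=10, fp=20, n=1000)  # (5*10 + 20)/1000 = 0.07
risk_b = global_risk(fn=20, fp=10, n=1000)  # (5*20 + 10)/1000 = 0.11
assert risk_a < risk_b
```

Under a symmetric metric (c_fn = c_fp) the two models above would tie, which is exactly the failure mode the paper argues standard unlearning validation inherits.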

Abstract

The application of Deep Learning in medical diagnosis must balance patient safety with compliance with data protection regulations. Machine Unlearning enables the selective removal of training data from deployed models. However, most methods are validated primarily through efficiency and privacy-oriented metrics, with limited attention to clinically asymmetric error costs. In this work, we investigate how unlearning affects clinical risk in binary medical image classification. We show that standard unlearning strategies (Fine-Tuning, Random Labeling, and SalUn) may reduce test utility while increasing false-negative rates, thereby amplifying clinical risk. To mitigate this, we propose SalUn-CRA (Clinical Risk-Aware), a variant of SalUn that replaces random relabeling with entropy-based forgetting for malignant samples in the forget set, preventing the model from learning harmful benign associations. We evaluate on the DermaMNIST and PathMNIST medical image datasets under 20% and 50% data removal. Using Global Risk metrics with asymmetric costs, SalUn-CRA achieves clinical risk lower than or comparable to that of full retraining while preserving unlearning effectiveness. These results suggest that clinical risk should be an integral component of unlearning validation in medical systems.
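The entropy-based forgetting idea from the abstract can be sketched as follows. This is a minimal stdlib illustration under the assumption that "forgetting" a malignant sample means driving its predicted class distribution toward maximum entropy; the function names are ours, not the paper's:

```python
import math

def prediction_entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_forgetting_loss(batch_probs, is_malignant):
    """Sketch of the SalUn-CRA idea: instead of relabeling malignant
    forget-set samples with random (possibly benign) labels, penalize
    confident (low-entropy) predictions on them. Minimizing the
    negative mean entropy pushes those predictions toward uniform
    uncertainty, so no benign association is learned."""
    ents = [prediction_entropy(p)
            for p, m in zip(batch_probs, is_malignant) if m]
    return -sum(ents) / len(ents)

# A confident prediction on a malignant forget sample incurs a higher
# loss than a maximally uncertain one.
confident = entropy_forgetting_loss([[0.99, 0.01]], [True])
uniform = entropy_forgetting_loss([[0.5, 0.5]], [True])
assert uniform < confident
```

In a gradient-based setting the same quantity would be computed on softmax outputs and minimized alongside the retain-set loss; the benign samples in the forget set are untouched here because only malignant samples carry the asymmetric clinical cost.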