Source Models Leak What They Shouldn't: Unlearning Zero-Shot Transfer in Domain Adaptation Through Adversarial Optimization

arXiv cs.CV / 4/10/2026


Key Points

  • The paper highlights a privacy risk in source-free domain adaptation, where a source-trained vision model can unintentionally leak source-domain–exclusive class knowledge into a target domain without seeing source data.
  • Experiments show that existing SFDA approaches can achieve strong zero-shot performance in the target domain on source-exclusive classes (classes absent from the target data), indicating inadvertent information transfer.
  • The authors formalize this risk as a new machine unlearning (MU) setting, SCADA-UL (Unlearning Source-exclusive ClAsses in Domain Adaptation), arguing that prior MU methods do not handle the distribution shifts involved.
  • Their proposed method unlearns the forget classes during adaptation by combining adversarially generated "forget class" samples with a rescaled labeling strategy and adversarial optimization.
  • The study also covers two variants, a continual version and one where the forget classes are unknown, and reports retraining-level unlearning performance that outperforms baselines, with code released on GitHub.
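The key points above can be illustrated with a minimal, self-contained sketch on a toy linear classifier: an input is first pushed toward the forget class by an adversarial step, then the model is fine-tuned on that sample against a rescaled label that places zero mass on the forget class and uniform mass on the retained classes. Everything here (the linear model, the FGSM-style sign step, the hyperparameters) is an illustrative assumption, not the paper's actual algorithm:

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, x):
    # Linear scores: one row of weights per class.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]

def adversarial_forget_sample(W, x, forget, eps=0.5):
    # The gradient of the forget-class logit w.r.t. x is just W[forget];
    # an FGSM-style sign step pushes x toward the forget class.
    return [xi + eps * (1.0 if g >= 0 else -1.0) for xi, g in zip(x, W[forget])]

def rescaled_label(n_cls, forget):
    # Zero probability mass on the forget class, uniform over the rest.
    return [0.0 if c == forget else 1.0 / (n_cls - 1) for c in range(n_cls)]

def unlearn_step(W, x, forget, lr=0.3):
    # One cross-entropy gradient step toward the rescaled label;
    # the CE gradient w.r.t. row c of W is (p_c - y_c) * x.
    p = softmax(logits(W, x))
    y = rescaled_label(len(W), forget)
    for c in range(len(W)):
        g = p[c] - y[c]
        W[c] = [wc - lr * g * xi for wc, xi in zip(W[c], x)]

random.seed(0)
n_cls, n_feat, forget = 3, 4, 2
W = [[random.uniform(-1.0, 1.0) for _ in range(n_feat)] for _ in range(n_cls)]
x = [random.uniform(-1.0, 1.0) for _ in range(n_feat)]

x_adv = adversarial_forget_sample(W, x, forget)
before = softmax(logits(W, x_adv))[forget]
for _ in range(200):
    unlearn_step(W, x_adv, forget)
after = softmax(logits(W, x_adv))[forget]
```

The rescaled label is what makes this "unlearning" rather than misclassification toward one arbitrary class: redistributing the forget class's probability mass uniformly over the retained classes drives the model's confidence on that class toward (or below) chance without dictating which retained class should absorb it.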

Abstract

The increasing adaptation of vision models across domains, such as satellite imagery and medical scans, has raised an emerging privacy risk: models may inadvertently retain and leak sensitive source-domain-specific information in the target domain. This creates a compelling use case for machine unlearning (MU) to protect the privacy of sensitive source-domain data. Among adaptation techniques, source-free domain adaptation (SFDA) creates an especially urgent need for MU: the source data itself is protected, yet the source model exposed during adaptation encodes its influence. Our experiments reveal that existing SFDA methods exhibit strong zero-shot performance on source-exclusive classes in the target domain, indicating that they inadvertently leak knowledge of these classes into the target domain even when those classes are not represented in the target data. We identify and address this risk by proposing an MU setting called SCADA-UL: Unlearning Source-exclusive ClAsses in Domain Adaptation. Existing MU methods do not address this setting, as they are not designed to handle data distribution shifts. We propose a new unlearning method in which an adversarially generated forget-class sample is unlearned by the model during the domain adaptation process using a novel rescaled labeling strategy and adversarial optimization. We also extend our study to two variants: a continual version of this problem setting and one where the specific source classes to be forgotten may be unknown. Alongside theoretical interpretations, our comprehensive empirical results show that our method consistently outperforms baselines in the proposed setting while achieving retraining-level unlearning performance on benchmark datasets. Our code is available at https://github.com/D-Arnav/SCADA
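The leakage diagnostic the abstract describes can be sketched as a simple accuracy measurement restricted to source-exclusive classes, assuming held-out target-domain evaluation samples of those classes exist. The function name and the toy predictions below are hypothetical, not the paper's evaluation protocol:

```python
def source_exclusive_accuracy(preds, labels, exclusive_classes):
    # Hypothetical diagnostic: accuracy over evaluation samples whose
    # ground-truth label is a source-exclusive class. A high value after
    # source-free adaptation (which never saw these classes in target
    # data) would indicate leaked source-class knowledge.
    pairs = [(p, y) for p, y in zip(preds, labels) if y in exclusive_classes]
    if not pairs:
        return 0.0
    return sum(p == y for p, y in pairs) / len(pairs)

# Toy usage with made-up predictions and labels; class 2 is
# assumed to be source-exclusive.
leak = source_exclusive_accuracy(
    preds=[2, 1, 2, 0, 2],
    labels=[2, 2, 1, 0, 2],
    exclusive_classes={2},
)
```

In this framing, an ideal unlearned (or retrained-from-scratch) model would score near chance on this metric, while a leaky adapted model would score well above it.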