Attack by Unlearning: Unlearning-Induced Adversarial Attacks on Graph Neural Networks

arXiv cs.LG / 3/20/2026

Key Points

  • The paper introduces the concept of unlearning corruption attacks, showing how privacy-preserving unlearning can create a new attack surface for graph neural networks.
  • It shows that an attacker can inject carefully chosen nodes during training and later trigger their deletion, causing accuracy degradation after unlearning while the model appears normal during training.
  • The attack is formulated as a bi-level optimization using gradient-based updates and a surrogate model to generate pseudo-labels, enabling stealthy exploitation of the unlearning process.
  • Extensive experiments across benchmarks and unlearning methods demonstrate that small, well-designed unlearning requests can cause significant accuracy drops, raising urgent concerns about robustness and regulatory compliance in real-world GNN systems; the source code is to be released after acceptance.

Abstract

Graph neural networks (GNNs) are widely used for learning from graph-structured data in domains such as social networks, recommender systems, and financial platforms. To comply with privacy regulations like the GDPR, CCPA, and PIPEDA, approximate graph unlearning, which aims to remove the influence of specific data points from trained models without full retraining, has become an increasingly important component of trustworthy graph learning. However, approximate unlearning often incurs subtle performance degradation, which can have negative and unintended side effects. In this work, we show that such degradation can be amplified into adversarial attacks. We introduce the notion of unlearning corruption attacks, where an adversary injects carefully chosen nodes into the training graph and later requests their deletion. Because deletion requests are legally mandated and cannot be denied, this attack surface is both unavoidable and stealthy: the model performs normally during training, but accuracy collapses only after unlearning is applied. Technically, we formulate the attack as a bi-level optimization problem: to overcome the challenges of black-box unlearning and label scarcity, we approximate the unlearning process via gradient-based updates and employ a surrogate model to generate pseudo-labels for the optimization. Extensive experiments across benchmarks and unlearning algorithms demonstrate that small, carefully designed unlearning requests can induce significant accuracy degradation, raising urgent concerns about the robustness of GNN unlearning under real-world regulatory demands. The source code will be released upon paper acceptance.
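The core dynamic, inject benign-looking points, then weaponize their legally mandated deletion, can be illustrated with a deliberately simplified stand-in. The sketch below uses a logistic-regression "model" in place of a GNN (no graph structure) and a crude gradient-ascent rule in place of the paper's unlearning algorithms; every function name, hyperparameter, and data choice here is an illustrative assumption, not the paper's implementation. The injected points are correctly labeled and harmless during training, yet unlearning them drags the decision boundary with them.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    # mean logistic (cross-entropy) loss
    p = np.clip(sigmoid(X @ w), 1e-9, 1 - 1e-9)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def grad(w, X, y):
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def train(X, y, steps=300, lr=0.5, lam=0.05):
    # L2-regularized gradient descent
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * (grad(w, X, y) + lam * w)
    return w

def approx_unlearn(w, X_f, y_f, eta=1.0, max_steps=20000):
    # crude stand-in for gradient-based approximate unlearning:
    # ascend the loss on the forget set until it reaches chance level (ln 2)
    w = w.copy()
    for _ in range(max_steps):
        if loss(w, X_f, y_f) >= np.log(2):
            break
        w += eta * grad(w, X_f, y_f)
    return w

def accuracy(w, X, y):
    return np.mean((X @ w > 0) == (y == 1))

# clean two-class data (plain feature vectors, standing in for node features)
X_clean = np.vstack([rng.normal(-1.0, 0.7, (100, 2)),
                     rng.normal(+1.0, 0.7, (100, 2))])
y_clean = np.concatenate([np.zeros(100), np.ones(100)])

# attacker injects correctly labeled but strategically placed points;
# they are benign during training, and their later deletion is the trigger
X_adv = np.full((10, 2), 1.5) + rng.normal(0, 0.05, (10, 2))
y_adv = np.ones(10)

w = train(np.vstack([X_clean, X_adv]), np.concatenate([y_clean, y_adv]))
acc_before = accuracy(w, X_clean, y_clean)   # model looks normal
w_u = approx_unlearn(w, X_adv, y_adv)        # mandated deletion request
acc_after = accuracy(w_u, X_clean, y_clean)  # accuracy collapses
print(f"clean accuracy before unlearning: {acc_before:.2f}, after: {acc_after:.2f}")
```

In this toy version the outer (attacker) level is solved by construction rather than optimized; the paper's bi-level formulation would instead search over the injected set, using the gradient-based unlearning approximation as the inner problem and surrogate-model pseudo-labels in place of the clean labels assumed here.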