Attack by Unlearning: Unlearning-Induced Adversarial Attacks on Graph Neural Networks
arXiv cs.LG / 3/20/2026
Key Points
- The paper introduces the concept of unlearning corruption attacks, showing how privacy-preserving unlearning can create a new attack surface for graph neural networks.
- It shows that an attacker can inject carefully crafted nodes during training and later trigger their deletion: the model looks normal while the poisoned nodes are present, but loses accuracy once they are unlearned (see the sketch below the list).
- The attack is formulated as a bi-level optimization problem: the outer level crafts the injected nodes with gradient-based updates, while a surrogate model supplies pseudo-labels for them, letting the attacker exploit the unlearning process stealthily (a hedged formulation is given below).
- Extensive experiments across benchmarks and unlearning methods show that small, well-designed unlearning requests can cause significant accuracy drops, raising concerns about both robustness and regulatory compliance in deployed GNN systems. The authors state the source code will be released after acceptance.
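One plausible way to write the bi-level problem the third point describes; the notation here is our own choosing (\(\mathcal{G}\) for the clean graph, \(\mathcal{G}_{\mathrm{inj}}\) for the injected nodes and their edges, \(\mathrm{Unlearn}\) for whatever deletion mechanism the victim runs, \(\mathcal{L}_{\mathrm{atk}}\) for the attacker's loss on clean nodes), and the paper's exact objective may differ:

```latex
\max_{X_{\mathrm{inj}}}\;
  \mathcal{L}_{\mathrm{atk}}\bigl(f_{\theta_u};\,\mathcal{G}\bigr)
\quad\text{s.t.}\quad
  \theta^{*} \in \arg\min_{\theta}\,
  \mathcal{L}_{\mathrm{train}}\bigl(f_{\theta};\,\mathcal{G}\cup\mathcal{G}_{\mathrm{inj}}\bigr),
\qquad
  \theta_u = \mathrm{Unlearn}\bigl(\theta^{*},\,\mathcal{G}_{\mathrm{inj}}\bigr)
```

The inner problem is the victim's ordinary training on the poisoned graph; the attacker optimizes the injected features so that the model only degrades after the unlearning operator is applied.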
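A minimal, self-contained sketch of that pipeline, assuming an SGC-style linear surrogate, a single gradient-ascent step as a stand-in for approximate unlearning, and a finite-difference estimate of the attacker's outer gradient. Every function and parameter name here is illustrative, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)

def propagate(A, X, k=2):
    # SGC-style propagation: S = D^{-1/2}(A + I)D^{-1/2}, applied k times.
    A_hat = A + np.eye(len(A))
    d = A_hat.sum(axis=1)
    S = A_hat / np.sqrt(np.outer(d, d))
    for _ in range(k):
        X = S @ X
    return X

def train(H, y, steps=300, lr=0.5, reg=1e-3):
    # Inner problem: fit logistic regression on the (possibly poisoned) graph.
    w = np.zeros(H.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-H @ w))
        w -= lr * (H.T @ (p - y) / len(y) + reg * w)
    return w

def unlearn(w, H_f, y_f, eta=3.0):
    # Approximate unlearning of the forget set: one gradient-ascent step on
    # its loss, a crude stand-in for influence-function-style removal.
    p = 1.0 / (1.0 + np.exp(-H_f @ w))
    return w + eta * H_f.T @ (p - y_f) / len(y_f)

# Tiny two-cluster synthetic graph (the clean data).
n, dim = 60, 8
y = np.repeat([0.0, 1.0], n // 2)
X = rng.normal(size=(n, dim)) + np.outer(2.0 * y - 1.0, np.ones(dim))
A = np.zeros((n, n))
for i in range(n):
    for j in rng.choice(n, size=3, replace=False):
        if y[i] == y[j] and i != j:
            A[i, j] = A[j, i] = 1.0

# Surrogate trained on the clean graph; it pseudo-labels injected nodes.
w_sur = train(propagate(A, X), y)

m = 6                                       # number of injected nodes
targets = rng.integers(0, n, size=(m, 2))   # 2 victim edges per injected node

def attack_objective(X_inj):
    # One full bi-level evaluation: poison -> train -> unlearn -> clean loss.
    N = n + m
    A_full = np.zeros((N, N))
    A_full[:n, :n] = A
    for i in range(m):
        for v in targets[i]:
            A_full[n + i, v] = A_full[v, n + i] = 1.0
    H = propagate(A_full, np.vstack([X, X_inj]))
    y_inj = (H[n:] @ w_sur > 0).astype(float)        # surrogate pseudo-labels
    w_poison = train(H, np.concatenate([y, y_inj]))  # model looks normal here
    w_unl = unlearn(w_poison, H[n:], y_inj)          # deletion is the trigger
    p = 1.0 / (1.0 + np.exp(-H[:n] @ w_unl))
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    acc_before = np.mean((H[:n] @ w_poison > 0) == y)
    acc_after = np.mean((H[:n] @ w_unl > 0) == y)
    return loss, acc_before, acc_after

# Outer loop: ascend the post-unlearning clean loss in the injected features,
# using finite differences in place of the paper's gradient-based updates.
X_inj = rng.normal(size=(m, dim))
for step in range(10):
    base, acc_b, acc_a = attack_objective(X_inj)
    grad = np.zeros_like(X_inj)
    eps = 1e-2
    for idx in np.ndindex(*X_inj.shape):
        X_pert = X_inj.copy()
        X_pert[idx] += eps
        grad[idx] = (attack_objective(X_pert)[0] - base) / eps
    X_inj += 0.5 * grad / (np.linalg.norm(grad) + 1e-9)
    print(f"step {step:2d}: clean acc before unlearning {acc_b:.2f}, after {acc_a:.2f}")
```

The printout contrasts clean-node accuracy before and after the deletion request: in the paper's threat model the first number stays high (the poisoned model appears normal), while the attacker's outer loop drives the second one down.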