SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation

arXiv cs.AI / 4/1/2026

Key Points

  • The paper introduces “Sneakdoor,” a new stealth-focused backdoor attack targeting distribution matching-based dataset condensation methods.
  • Sneakdoor achieves imperceptibility by exploiting vulnerabilities near class decision boundaries and using a generative module to create input-aware triggers aligned with local feature geometry.
  • The approach is designed to balance attack success rate, clean test accuracy, and stealthiness, reducing detectability in both the synthetic condensed data and the triggered samples seen at inference.
  • Experiments across multiple datasets show Sneakdoor substantially improves “invisibility” while maintaining high attack efficacy.
  • The authors provide an implementation repository for reproducibility and further study of the attack method.
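
The two efficacy metrics named above can be made concrete with a small sketch. It assumes the conventions common in the backdoor literature: clean test accuracy is ordinary accuracy on untriggered inputs, and attack success rate is the fraction of triggered inputs from non-target classes that the model assigns to the attacker's target class. The function names and the toy predictions are hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence


def clean_test_accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of clean (untriggered) test samples classified correctly."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def attack_success_rate(
    triggered_predictions: Sequence[int],
    true_labels: Sequence[int],
    target_label: int,
) -> float:
    """Fraction of triggered samples from non-target classes predicted as the target.

    Samples whose true class already equals the target are skipped (assumed
    convention), since predicting the target for them is not evidence of an attack.
    """
    relevant = [
        p for p, y in zip(triggered_predictions, true_labels) if y != target_label
    ]
    return sum(p == target_label for p in relevant) / len(relevant)


# Toy data (hypothetical): four test samples, target class 0.
labels = [0, 1, 2, 1]
clean_preds = [0, 1, 2, 0]      # model output on clean inputs
triggered_preds = [0, 0, 0, 1]  # model output on the same inputs with a trigger

cta = clean_test_accuracy(clean_preds, labels)
asr = attack_success_rate(triggered_preds, labels, target_label=0)
```

Stealthiness, the third axis the summary mentions, has no single standard formula and is left out of this sketch.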

Abstract

Dataset condensation aims to synthesize compact yet informative datasets that retain the training efficacy of full-scale data, offering substantial gains in efficiency. Recent studies reveal that the condensation process can be vulnerable to backdoor attacks, where malicious triggers are injected into the condensation dataset, manipulating model behavior during inference. While prior approaches have made progress in balancing attack success rate and clean test accuracy, they often fall short in preserving stealthiness, especially in concealing the visual artifacts of condensed data or the perturbations introduced during inference. To address this challenge, we introduce Sneakdoor, which enhances stealthiness without compromising attack effectiveness. Sneakdoor exploits the inherent vulnerability of class decision boundaries and incorporates a generative module that constructs input-aware triggers aligned with local feature geometry, thereby minimizing detectability. This joint design enables the attack to remain imperceptible to both human inspection and statistical detection. Extensive experiments across multiple datasets demonstrate that Sneakdoor achieves a compelling balance among attack success rate, clean test accuracy, and stealthiness, substantially improving the invisibility of both the synthetic data and triggered samples while maintaining high attack efficacy. The code is available at https://github.com/XJTU-AI-Lab/SneakDoor.
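
For readers unfamiliar with the setting, the sketch below illustrates the general idea behind distribution matching-based condensation as it is usually described in the literature: the synthetic set is optimized so that, class by class, its mean feature embedding is close to that of the real data. This is a simplified illustration under stated assumptions, not Sneakdoor's method or the code from the authors' repository. The embedding function `toy_embed` and the two-dimensional toy data are hypothetical; practical methods typically use randomly initialized neural networks as embedders and update the synthetic samples by gradient descent.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def mean_embedding(samples: Sequence[Vector], embed: Callable[[Vector], Vector]) -> Vector:
    """Average the embeddings of a set of samples, coordinate by coordinate."""
    embedded = [embed(x) for x in samples]
    dim = len(embedded[0])
    return [sum(e[i] for e in embedded) / len(embedded) for i in range(dim)]


def distribution_matching_loss(
    real_by_class: Dict[int, Sequence[Vector]],
    synthetic_by_class: Dict[int, Sequence[Vector]],
    embed: Callable[[Vector], Vector],
) -> float:
    """Sum over classes of the squared distance between real and synthetic mean embeddings."""
    loss = 0.0
    for label, real_samples in real_by_class.items():
        mu_real = mean_embedding(real_samples, embed)
        mu_syn = mean_embedding(synthetic_by_class[label], embed)
        loss += sum((r - s) ** 2 for r, s in zip(mu_real, mu_syn))
    return loss


def toy_embed(x: Vector) -> Vector:
    """Hypothetical stand-in for a feature extractor (a fixed linear map)."""
    return [x[0] + x[1], x[0] - x[1]]


# Toy data (hypothetical): two classes, two real samples and one synthetic sample each.
real = {0: [[0.0, 0.0], [2.0, 2.0]], 1: [[4.0, 0.0], [6.0, 2.0]]}
synthetic = {0: [[1.0, 1.0]], 1: [[5.0, 0.0]]}

loss = distribution_matching_loss(real, synthetic, toy_embed)
```

In this toy case the class 0 synthetic sample already matches the real mean embedding, so the whole loss comes from class 1. Because the objective only constrains aggregate feature statistics, poisoned source data can shift what the synthetic set encodes, which is the attack surface the abstract refers to.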