Benign Overfitting in Adversarial Training for Vision Transformers
arXiv cs.LG / 4/22/2026
Key Points
- The paper provides the first theoretical analysis of adversarial training for Vision Transformers (ViTs), working with simplified ViT architectures to address a gap in existing theory.
- It shows that, under a signal-to-noise ratio condition and a moderately small perturbation budget, adversarial training drives the robust training loss to nearly zero while keeping the robust generalization error low (see the PGD-style sketch after this list).
- The work identifies a "benign overfitting" effect, in which the model interpolates the training data yet still generalizes well, a phenomenon previously established mainly for CNNs under adversarial training.
- Experiments on synthetic and real-world datasets validate the theoretical predictions and the proposed conditions (a sketch of a typical synthetic data model appears below).
- Overall, the study links adversarial robustness in ViTs to training dynamics that resemble those known from CNN theory, offering new guidance for understanding and designing robust ViT training.
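The paper itself is theoretical, but the training procedure it analyzes is standard adversarial training: an inner maximization that finds a worst-case perturbation within a budget, and an outer minimization of the loss on the perturbed input. As a minimal illustration, here is a PGD-style training step in PyTorch; the model, the L-infinity budget `eps`, and the step hyperparameters are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn

def pgd_attack(model, x, y, loss_fn, eps=8 / 255, alpha=2 / 255, steps=5):
    # Inner maximization: random start inside the L_inf ball of radius eps,
    # then iterated signed-gradient ascent on the loss, projecting back
    # into the ball after each step. (Hyperparameters are illustrative.)
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()          # ascend the robust loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)     # project into the eps-ball
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y, eps=8 / 255):
    # Outer minimization: one gradient step on the robust training loss,
    # i.e. the loss evaluated on the worst-case perturbed inputs.
    loss_fn = nn.CrossEntropyLoss()
    x_adv = pgd_attack(model, x, y, loss_fn, eps=eps)
    optimizer.zero_grad()
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

With any classifier taking image batches, one would call `adversarial_training_step(model, optimizer, images, labels)` once per batch; the "nearly zero robust training loss" in the key points refers to this loss converging toward zero over training.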
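The summary does not reproduce the paper's exact data model, so the following sketch assumes the two-patch signal-plus-noise setup standard in benign-overfitting analyses: each example has one patch carrying a label-aligned signal and one patch of pure Gaussian noise, with SNR defined as the signal norm over the expected noise norm. The dimensions, `mu_norm`, and `sigma` below are illustrative.

```python
import torch

def make_two_patch_data(n=200, d=512, mu_norm=5.0, sigma=1.0, seed=0):
    """Two-patch signal-plus-noise data, a common model in benign-overfitting
    theory (assumed here, not taken verbatim from the paper): patch 1 is the
    label-aligned signal y * mu, patch 2 is Gaussian noise. The SNR is
    typically ||mu|| / (sigma * sqrt(d))."""
    g = torch.Generator().manual_seed(seed)
    y = torch.randint(0, 2, (n,), generator=g) * 2 - 1    # labels in {-1, +1}
    mu = torch.zeros(d)
    mu[0] = mu_norm                                       # fixed signal direction
    signal = y.float().unsqueeze(1) * mu                  # (n, d) signal patch
    noise = sigma * torch.randn(n, d, generator=g)        # (n, d) noise patch
    x = torch.stack([signal, noise], dim=1)               # (n, 2, d): two "tokens"
    return x, y

x, y = make_two_patch_data()
print(x.shape, 5.0 / (1.0 * 512 ** 0.5))  # SNR ~ 0.22 for these settings
```

In this kind of model, the SNR condition in the key points controls whether the trained network fits the noise patches benignly (memorizing them without hurting test-time robustness) or harmfully.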