FastAT Benchmark: A Comprehensive Framework for Fair Evaluation of Fast Adversarial Training Methods

arXiv cs.CV / 28 Apr 2026

Key Points

  • The paper introduces the FastAT Benchmark to enable fair, controlled comparison of Fast Adversarial Training (FastAT) methods that aim to reduce the computational cost of standard multi-step approaches like PGD-AT.
  • It enforces three key principles—unified architecture requirements, standardized training settings, and a strict ban on external or synthetic data—to ensure improvements reflect algorithmic advances rather than different experimental conditions.
  • The benchmark includes implementations of 20+ representative FastAT methods in a single codebase, making results directly reproducible and easier to validate.
  • Evaluation uses dual metrics covering both adversarial robustness (e.g., accuracy under PGD, AutoAttack, and CR Attack) and efficiency (GPU training time and peak memory), and experiments on CIFAR-10/100 and Tiny-ImageNet establish reliable baselines.
  • Results indicate that properly designed single-step methods can achieve robustness comparable to or better than PGD-AT at much lower cost, but no single method is best across all dimensions (a sketch of the single-step idea follows this list).
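
To make the single-step idea concrete, below is a minimal PyTorch sketch of one FGSM-with-random-start training step, the template many FastAT methods build on: a single gradient pass crafts the perturbation instead of the K passes PGD-AT spends per batch. The function name and hyperparameter defaults are illustrative, not taken from the paper's codebase.

```python
import torch
import torch.nn.functional as F

def fgsm_rs_train_step(model, x, y, optimizer, epsilon=8/255, alpha=10/255):
    """One single-step adversarial training update (FGSM + random init).

    Hypothetical helper; defaults mirror common CIFAR-10 settings
    (eps = 8/255), not values from the FastAT Benchmark itself.
    """
    # Random initialization inside the epsilon ball around the clean input.
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon)
    delta.requires_grad_(True)

    # One forward/backward pass to get the gradient w.r.t. the perturbation.
    loss = F.cross_entropy(model(x + delta), y)
    loss.backward()

    # Single FGSM step, then project back into the epsilon ball and the
    # valid pixel range.
    with torch.no_grad():
        delta = delta + alpha * delta.grad.sign()
        delta = delta.clamp(-epsilon, epsilon)
        x_adv = (x + delta).clamp(0.0, 1.0)

    # Train on the adversarial example: one extra forward/backward pass per
    # batch, versus K attack passes for PGD-AT.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```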

Abstract

Fast Adversarial Training (FastAT) seeks to achieve adversarial robustness at a fraction of the computational cost incurred by standard multi-step methods such as PGD-AT. Although numerous FastAT techniques have been proposed in recent years, fair comparison among them remains elusive. Existing benchmarks and public leaderboards typically permit diverse model architectures, varying training configurations, and external data sources, making it unclear whether reported improvements reflect genuine algorithmic advances or merely more favorable experimental conditions. To address this problem, we introduce the FastAT Benchmark, a controlled evaluation framework built on three core design principles: unified architecture requirements, standardized training settings, and strict prohibition of external or synthetic data. The benchmark implements over twenty representative FastAT methods within a single codebase, enabling direct and reproducible comparison. Each method is assessed through a dual-metric evaluation framework that measures both adversarial robustness (accuracy under PGD, AutoAttack, and CR Attack) and computational cost (GPU training time and peak memory footprint). Comprehensive experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet provide reliable baseline measurements and reveal that well-designed single-step methods can match or surpass PGD-AT robustness at substantially lower cost, while no single method dominates across all evaluation dimensions. The complete benchmark, including source code, configuration files, and experimental results, is publicly available to support transparent and fair evaluation of future FastAT research.
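
As a rough illustration of the dual-metric idea, the sketch below measures robust accuracy under a standard K-step PGD attack and wraps a training run to report wall-clock time and peak GPU memory. It assumes a CUDA-capable PyTorch setup; the helper names are hypothetical, and the benchmark's full evaluation suite (AutoAttack, CR Attack) is not reproduced here.

```python
import time
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8/255, alpha=2/255, steps=20):
    """Standard K-step PGD attack with random start (robustness metric)."""
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            delta = (delta + alpha * grad.sign()).clamp(-epsilon, epsilon)
            # Keep x + delta inside the valid pixel range [0, 1].
            delta = (x + delta).clamp(0.0, 1.0) - x
    return (x + delta).detach()

def robust_accuracy(model, loader, device):
    """Fraction of test examples still classified correctly under PGD."""
    model.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y)
        with torch.no_grad():
            correct += (model(x_adv).argmax(1) == y).sum().item()
        total += y.size(0)
    return correct / total

def measure_training_cost(train_fn, device):
    """Efficiency metrics: GPU wall-clock seconds and peak memory in MiB.

    `train_fn` is a zero-argument callable that runs the full training loop.
    """
    torch.cuda.reset_peak_memory_stats(device)
    torch.cuda.synchronize(device)
    start = time.time()
    train_fn()
    torch.cuda.synchronize(device)
    seconds = time.time() - start
    peak_mib = torch.cuda.max_memory_allocated(device) / 2**20
    return seconds, peak_mib
```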