Beyond Attack Success Rate: A Multi-Metric Evaluation of Adversarial Transferability in Medical Imaging Models

arXiv cs.CV / 4/21/2026

Key Points

  • Researchers argue that relying on Attack Success Rate (ASR) alone is insufficient for evaluating adversarial vulnerability in medical imaging models because it ignores perturbation strength, image quality, and transferability across architectures.
  • They conduct a systematic evaluation across four medical imaging datasets (PathMNIST, DermaMNIST, RetinaMNIST, CheXpert) using seven models (including CNNs and Vision Transformers) and seven attack methods across five perturbation budgets.
  • The study finds that perceptual/distortion metrics (e.g., PSNR and SSIM) strongly correlate with each other but show minimal correlation with ASR for both CNNs and ViTs.
  • The paper concludes that adversarial robustness and transferability should be assessed with multi-metric evaluation frameworks that capture not just whether attacks succeed, but also how they succeed and the overheads they incur.
  • The findings suggest that a single binary metric cannot reliably represent adversarial behavior, especially as medical AI increasingly shifts from CNNs to transformer-based architectures.
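
To make the contrast between ASR and the distortion metrics concrete, here is a minimal sketch of how three of the four quantities could be computed for a set of adversarial examples. It is not the authors' code. The function names, the flattened-pixel image representation, and the choice to count ASR only over samples the model classified correctly before the attack are all assumptions made for illustration; the paper may define ASR differently. SSIM is left out because it needs a windowed implementation from an image-processing library.

```python
import math


def attack_success_rate(labels, clean_preds, adv_preds):
    """Fraction of originally correct samples that are misclassified after the attack.

    Assumption: ASR is counted only over samples the model got right on clean input.
    """
    flipped = [
        adv != label
        for label, clean, adv in zip(labels, clean_preds, adv_preds)
        if clean == label
    ]
    return sum(flipped) / len(flipped) if flipped else 0.0


def l2_perturbation(clean, adv):
    """L2 norm of the perturbation for one image given as a flat list of pixels."""
    return math.sqrt(sum((a - c) ** 2 for c, a in zip(clean, adv)))


def psnr(clean, adv, data_range=1.0):
    """Peak Signal-to-Noise Ratio in dB; infinite when the images are identical."""
    mse = sum((a - c) ** 2 for c, a in zip(clean, adv)) / len(clean)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / mse)


# Hypothetical toy inputs: a 4-pixel image in [0, 1] and a uniform shift of 0.1.
clean_image = [0.0, 0.0, 0.0, 0.0]
adv_image = [0.1, 0.1, 0.1, 0.1]

print("L2:", l2_perturbation(clean_image, adv_image))
print("PSNR (dB):", psnr(clean_image, adv_image))
print("ASR:", attack_success_rate([0, 1, 1], [0, 1, 0], [1, 1, 1]))
```

ASR collapses each sample to a yes/no outcome, while the L2 norm and PSNR are continuous per-image quantities, so two attacks with the same ASR can differ widely on the other two.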

Abstract

While deep learning systems are becoming increasingly prevalent in medical image analysis, their vulnerabilities to adversarial perturbations raise serious concerns for clinical deployment. These vulnerability evaluations largely rely on Attack Success Rate (ASR), a binary metric that indicates solely whether an attack is successful. However, the ASR metric does not account for other factors, such as perturbation strength, perceptual image quality, and cross-architecture attack transferability, and therefore, the interpretation is incomplete. This gap requires consideration, as complex, large-scale deep learning systems, including Vision Transformers (ViTs), are increasingly challenging the dominance of Convolutional Neural Networks (CNNs). These architectures learn differently, and it is unclear whether a single metric, e.g., ASR, can effectively capture adversarial behavior. To address this, we perform a systematic empirical study on four medical image datasets: PathMNIST, DermaMNIST, RetinaMNIST, and CheXpert. We evaluate seven models (VGG-16, ResNet-50, DenseNet-121, Inception-v3, DeiT, Swin Transformer, and ViT-B/16) against seven attack methods at five perturbation budgets, measuring ASR, Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and L_2 perturbation magnitude. Our findings show a consistent pattern: perceptual and distortion metrics are strongly associated with one another and exhibit minimal correlation with ASR. This applies to both CNNs and ViTs. The results demonstrate that ASR alone is an inadequate indicator of adversarial robustness and transferability. Consequently, we argue that a thorough assessment of adversarial risk in medical AI necessitates multi-metric frameworks that encompass not only the attack efficacy but also its methodology and associated overheads.
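
The correlation analysis described in the abstract can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: the abstract does not say which correlation coefficient was used, so Pearson is an assumption here, and the helper names `pearson` and `correlation_matrix` are hypothetical. The numbers in `toy` are synthetic placeholders chosen to show the mechanics and are not results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(metrics):
    """Pairwise Pearson correlations for a dict mapping metric name -> values."""
    names = list(metrics)
    return {a: {b: pearson(metrics[a], metrics[b]) for b in names} for a in names}


# Synthetic placeholder values (NOT from the paper): one entry per
# hypothetical (model, attack, budget) configuration.
toy = {
    "ASR": [0.5, 0.1, 0.1, 0.5],
    "PSNR": [40.0, 35.0, 30.0, 25.0],
    "L2": [1.0, 2.0, 3.0, 4.0],
}

matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In this toy data, PSNR and L2 move together perfectly (in opposite directions) while ASR is unrelated to either, which mirrors the shape of the pattern the abstract reports without reproducing its numbers.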