Degradation-Consistent Paired Training for Robust AI-Generated Image Detection

arXiv cs.CV / 4/14/2026


Key Points

  • AI-generated image detectors often fail when test images undergo real-world corruptions like JPEG compression, Gaussian blur, or resolution downsampling, and existing state-of-the-art approaches mainly rely on augmentation rather than an explicit robustness objective.
  • The paper proposes Degradation-Consistent Paired Training (DCPT), which builds paired clean/degraded views and enforces robustness via feature consistency (cosine-distance minimization) and prediction consistency (symmetric KL divergence alignment).
  • DCPT requires no additional parameters and introduces zero inference overhead, making it a lightweight training-strategy improvement.
  • On the Synthbuster benchmark (9 generators across 8 degradation conditions), DCPT increases degraded-condition average accuracy by 9.1 percentage points versus a baseline without paired training, with the largest gains under JPEG compression.
  • Ablations suggest that simply adding architectural components can overfit on limited data, while explicitly improving the training objective is more effective for degradation robustness.

Abstract

AI-generated image detectors suffer significant performance degradation under real-world image corruptions such as JPEG compression, Gaussian blur, and resolution downsampling. We observe that state-of-the-art methods, including B-Free, treat degradation robustness as a byproduct of data augmentation rather than an explicit training objective. In this work, we propose Degradation-Consistent Paired Training (DCPT), a simple yet effective training strategy that explicitly enforces robustness through paired consistency constraints. For each training image, we construct a clean view and a degraded view, then impose two constraints: a feature consistency loss that minimizes the cosine distance between clean and degraded representations, and a prediction consistency loss based on symmetric KL divergence that aligns output distributions across views. DCPT adds zero additional parameters and zero inference overhead. Experiments on the Synthbuster benchmark (9 generators, 8 degradation conditions) demonstrate that DCPT improves the degraded-condition average accuracy by 9.1 percentage points compared to an identical baseline without paired training, while sacrificing only 0.9% clean accuracy. The improvement is most pronounced under JPEG compression (+15.7% to +17.9%). Ablation further reveals that adding architectural components leads to overfitting on limited training data, confirming that training objective improvement is more effective than architectural augmentation for degradation robustness.
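The DCPT objective described above combines a standard supervised term with two consistency terms over the clean/degraded pair. A minimal sketch of such a loss is below; the function name, the `lambda_*` weights, and the assumption that the model exposes both features and logits are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn.functional as F


def dcpt_loss(logits_clean, logits_deg, feat_clean, feat_deg,
              labels, lambda_feat=1.0, lambda_pred=1.0):
    """Sketch of a DCPT-style objective (hypothetical loss weights).

    Combines: supervised cross-entropy on both views, a feature
    consistency term (cosine distance between clean and degraded
    representations), and a prediction consistency term (symmetric
    KL divergence between the two output distributions).
    """
    # supervised classification loss on both views
    ce = F.cross_entropy(logits_clean, labels) + F.cross_entropy(logits_deg, labels)

    # feature consistency: cosine distance = 1 - cosine similarity
    feat_loss = (1.0 - F.cosine_similarity(feat_clean, feat_deg, dim=-1)).mean()

    # prediction consistency: symmetric KL between softmax distributions
    log_p = F.log_softmax(logits_clean, dim=-1)
    log_q = F.log_softmax(logits_deg, dim=-1)
    sym_kl = 0.5 * (
        F.kl_div(log_q, log_p, reduction="batchmean", log_target=True)   # KL(p || q)
        + F.kl_div(log_p, log_q, reduction="batchmean", log_target=True)  # KL(q || p)
    )

    return ce + lambda_feat * feat_loss + lambda_pred * sym_kl
```

In training, the degraded view would be produced by applying a random corruption (e.g. JPEG compression, Gaussian blur, or downsampling) to the clean image before the forward pass; since both consistency terms act only on training-time pairs, nothing changes at inference, consistent with the zero-overhead claim.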