HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection

arXiv cs.CV / 5/1/2026

Key Points

  • The paper introduces HiMix, a unified framework for making Synthetic Image Detection (SID) generalize better as generative models produce increasingly realistic and diverse images.
  • HiMix expands the training distribution using Mixup-driven Distributional Augmentation (MDA), creating continuous transitional samples between real and fake images to better cover low-confidence regions.
  • It uses pixel-wise mixup to smoothly perturb semantics, aiming to increase the model’s sensitivity to low-level artifacts that differ across generators.
  • A Hierarchical Artifact-aware Representation (HAR) module aggregates artifact cues at both global and local scales via cross-layer integration and coarse-to-fine feature fusion.
  • Experiments across multiple benchmarks report state-of-the-art performance, with well-separated logits that the authors associate with better generalization to unseen forgeries.
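
To make the pixel-wise mixup idea concrete, here is a minimal NumPy sketch of blending a real and a fake image into a transitional sample. It is an illustration under stated assumptions, not the paper's implementation: the function name `pixel_mixup`, the label convention (0 = real, 1 = fake), the soft label, and sampling the mixing weight from Beta(1, 1) as in standard mixup are all assumptions that the summary does not specify.

```python
import numpy as np


def pixel_mixup(real, fake, lam):
    """Blend a real and a fake image pixel by pixel.

    Assumption of this sketch: labels are 0 for real and 1 for fake,
    so the soft label of the blend is the weight given to the fake image.
    """
    if real.shape != fake.shape:
        raise ValueError("images must have the same shape")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    mixed = lam * real + (1.0 - lam) * fake  # convex combination per pixel
    soft_label = 1.0 - lam                   # fraction contributed by the fake
    return mixed, soft_label


# Toy inputs standing in for images with values in [0, 1).
rng = np.random.default_rng(0)
real = rng.random((4, 4, 3))
fake = rng.random((4, 4, 3))

# Hypothetical choice: draw the mixing weight from Beta(1, 1).
lam = float(rng.beta(1.0, 1.0))
mixed, soft_label = pixel_mixup(real, fake, lam)
```

Sweeping `lam` from 1 to 0 moves the sample continuously from the real image to the fake one, which is one way to read the "continuous transitional samples" described in the key points.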

Abstract

The rapid evolution of generative models has enabled the creation of highly realistic and diverse synthetic images, posing significant challenges to reliable and generalizable Synthetic Image Detection (SID). However, existing detectors are typically trained on limited and biased datasets, resulting in poor generalization to unseen generators. To address this issue, we propose HiMix, a unified framework that enhances generalization by expanding the training distribution and promoting artifact-aware representations. Specifically, the Mixup-driven Distributional Augmentation (MDA) module constructs continuous transitional samples between real and fake images, improving coverage of low-confidence regions and exposing the model to more challenging samples, while the pixel-wise mixup operation smoothly perturbs semantics to enhance sensitivity to low-level artifacts. Moreover, the Hierarchical Artifact-aware Representation (HAR) module aggregates artifact information from both global and local levels through cross-layer integration and coarse-to-fine feature fusion, enabling the extraction of discriminative forgery representations under diverse distributions. Extensive experiments across multiple benchmarks demonstrate that HiMix achieves state-of-the-art performance, establishing well-separated logits for improved generalization to unseen forgeries.
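
The abstract does not describe HAR's architecture, so the following is only a generic sketch of what coarse-to-fine fusion across layers can look like, in the spirit of top-down feature pyramids. Everything here is hypothetical: the function names `upsample2x` and `coarse_to_fine_fuse`, the additive fusion, the assumption that each level is exactly twice the resolution of the previous one, and the use of mean and max pooling as stand-ins for global and local cues.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbor 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def coarse_to_fine_fuse(features):
    """Fuse (C, H, W) maps ordered from coarsest to finest.

    Assumption of this sketch: every map has the same channel count and
    each level has twice the spatial size of the one before it.
    """
    fused = features[0]
    for finer in features[1:]:
        up = upsample2x(fused)
        if up.shape != finer.shape:
            raise ValueError("each level must be 2x the previous resolution")
        fused = up + finer  # additive cross-layer integration (assumed)
    global_desc = fused.mean(axis=(1, 2))  # average pooling as a global cue
    local_desc = fused.max(axis=(1, 2))    # max pooling as a local cue
    return np.concatenate([global_desc, local_desc])


# Toy feature maps from three hypothetical backbone layers.
feats = [np.ones((8, 2, 2)), np.ones((8, 4, 4)), np.ones((8, 8, 8))]
descriptor = coarse_to_fine_fuse(feats)
```

In this toy setup the resulting descriptor has twice as many entries as there are channels, one global and one local value per channel. A real detector would feed such a vector to a classification head.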