Magnification-Invariant Image Classification via Domain Generalization and Stable Sparse Embedding Signatures

arXiv cs.CV / 4/29/2026


Key Points

  • The paper tackles magnification shifts as a key failure mode in histopathology image classification by using a strict patient-disjoint leave-one-magnification-out evaluation on the BreaKHis dataset.
  • A gradient-reversal domain-generalization model outperformed a supervised baseline across held-out magnifications, with the clearest benefit when 200X data were excluded.
  • GAN-based data augmentation (DCGAN-generated patches) produced inconsistent results, helping some splits while hurting others, especially at 400X.
  • The domain-general model improved calibration (lower Brier score: 0.063 vs. 0.089) and shrank learned sparse embedding signatures more than threefold (306 vs. 1,074 dimensions on average) while leaving predictive performance essentially unchanged (AUC and F1 nearly equal).
  • Sparse embedding analysis also showed much higher cross-fold reproducibility under domain-general training (Jaccard overlap rising from near-zero to 0.99 between the 100X and 200X folds), indicating more stable and transferable representations for deployment across imaging settings.
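The two metrics quoted above are simple to state: the Brier score is the mean squared difference between predicted probabilities and 0/1 outcomes, and signature reproducibility is the Jaccard overlap between the sets of embedding dimensions active in two folds. A minimal sketch of both (the inputs below are made-up toy values, not the paper's data):

```python
import numpy as np

def brier_score(probs, labels):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.
    Lower is better; a confident, always-correct model scores 0."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return float(np.mean((probs - labels) ** 2))

def jaccard_overlap(sig_a, sig_b):
    """Jaccard index between two sparse signatures, each given as the set of
    embedding dimensions active (non-zero) in that fold."""
    a, b = set(sig_a), set(sig_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Toy inputs (illustrative only):
print(brier_score([0.9, 0.2, 0.7], [1, 0, 1]))   # small value -> well calibrated
print(jaccard_overlap({1, 2, 3, 4}, {3, 4, 5}))  # 2 shared / 5 total = 0.4
```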

Abstract

Magnification shift is a major obstacle to robust histopathology classification, because models trained on one imaging scale often generalize poorly to another. Here, we evaluated this problem on the BreaKHis dataset using a strict patient-disjoint leave-one-magnification-out protocol, comparing a supervised baseline, the same baseline augmented with DCGAN-generated patches, and a gradient-reversal domain-general model designed to preserve discriminative information while suppressing magnification-specific variation. Across held-out magnifications, the domain-general model achieved the strongest overall discrimination, with its clearest gain observed when 200X was held out. By contrast, GAN augmentation produced inconsistent effects, improving some folds but degrading others, particularly at 400X. The domain-general model also yielded the lowest Brier score (0.063 vs. 0.089 for the baseline). Sparse embedding analysis further revealed that domain-general training reduced average signature size more than threefold (306 versus 1,074 dimensions) while preserving equivalent predictive performance (AUC: 0.967 vs. 0.965; F1: 0.930 vs. 0.931). It also increased cross-fold signature reproducibility from near-zero Jaccard overlap in the baseline to 0.99 between the 100X and 200X folds. These findings show that calibrated, compact, and transferable representations can be learned without added architectural complexity, with clear implications for the reliable deployment of computational pathology models across heterogeneous acquisition settings.
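The gradient-reversal mechanism at the heart of the domain-general model can be illustrated with a deliberately tiny NumPy sketch: a shared linear encoder feeds both a class head and a domain head; each head is trained on its own loss, but the domain gradient is sign-flipped (scaled by λ) before it reaches the encoder, so the shared embedding is pushed to carry less domain (here, "magnification") information. Everything below (the data, shapes, learning rates, and the linear architecture) is a toy assumption for illustration, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: class label depends on feature 0, "magnification" domain on feature 1.
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(float)  # class label
d = (X[:, 1] > 0).astype(float)  # domain label

w_enc = rng.normal(size=(2, 2)) * 0.1  # shared linear encoder
w_cls = rng.normal(size=2) * 0.1       # class head
w_dom = rng.normal(size=2) * 0.1       # domain head
lam, lr, n = 1.0, 0.1, len(X)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

for _ in range(500):
    z = X @ w_enc                        # shared embedding
    g_y = (sigmoid(z @ w_cls) - y) / n   # logistic-loss gradient, class head
    g_d = (sigmoid(z @ w_dom) - d) / n   # logistic-loss gradient, domain head
    # Encoder gradient: the class term flows normally, the domain term is
    # REVERSED, pushing the embedding to be uninformative about the domain.
    g_enc = X.T @ np.outer(g_y, w_cls) - lam * X.T @ np.outer(g_d, w_dom)
    w_cls -= lr * (z.T @ g_y)            # heads train normally on their losses
    w_dom -= lr * (z.T @ g_d)
    w_enc -= lr * g_enc

z = X @ w_enc
cls_acc = np.mean((sigmoid(z @ w_cls) > 0.5) == (y > 0.5))
dom_acc = np.mean((sigmoid(z @ w_dom) > 0.5) == (d > 0.5))
print(f"class accuracy: {cls_acc:.2f}, domain accuracy: {dom_acc:.2f}")
```

In a framework with autodiff, the same effect is usually packaged as a "gradient reversal layer": identity in the forward pass, multiplication by -λ in the backward pass, inserted between the shared encoder and the domain head.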