Shortcut Learning in Glomerular AI: Adversarial Penalties Hurt, Entropy Helps

arXiv cs.CV / 4/10/2026


Key Points

  • The study investigates whether lupus nephritis glomerular lesion classifiers in renal pathology AI rely on staining identity as a “shortcut” to handle distribution shift.
  • Using a multi-center, multi-stain dataset of 9,674 glomerular patches (three centers, four stains: PAS, H&E, Jones, Trichrome), the authors find that stain-only classification is trivially learnable, indicating strong potential shortcut availability.
  • A Bayesian dual-head setup that jointly predicts lesion and stain shows that varying the strength and sign of stain supervision strongly modulates stain performance while leaving lesion metrics essentially unchanged; overly adversarial stain penalties inflate predictive uncertainty without improving lesion robustness, suggesting such penalties may be harmful.
  • By contrast, label-free regularization via entropy maximization on the stain head keeps stain predictions near chance while preserving lesion accuracy and calibration, reducing the risk of stain-related bias.
  • The authors conclude that careful multi-stain dataset curation and a deployment-friendly Bayesian dual-head architecture with entropy-based stain regularization can safeguard glomerular AI against stain drift.
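
The entropy-based stain regularization described above can be sketched as a simple loss: supervised cross-entropy on the lesion head plus a label-free term that rewards high entropy (near-chance predictions) on the stain head. This is a minimal NumPy sketch of the idea; the function names and the weight `lam` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p, eps=1e-12):
    # Shannon entropy (nats) of each row of probabilities
    return -(p * np.log(p + eps)).sum(axis=-1)

def dual_head_loss(lesion_logits, lesion_labels, stain_logits, lam=0.1):
    # supervised cross-entropy on the lesion head
    p_lesion = softmax(lesion_logits)
    ce = -np.log(
        p_lesion[np.arange(len(lesion_labels)), lesion_labels] + 1e-12
    ).mean()
    # label-free regularizer: maximizing stain-head entropy means
    # subtracting the mean stain entropy from the total loss
    p_stain = softmax(stain_logits)
    reg = -entropy(p_stain).mean()
    return ce + lam * reg
```

Because the regularizer needs no stain labels, it fits the paper's deployment setting where stain or site annotations may be unavailable; the loss is minimized when the stain head outputs are near-uniform over the four stains while lesion accuracy is preserved.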

Abstract

Stain variability is a pervasive source of distribution shift and potential shortcut learning in renal pathology AI. We ask whether lupus nephritis glomerular lesion classifiers exploit stain as a shortcut, and how to mitigate such bias without stain or site labels. We curate a multi-center, multi-stain dataset of 9,674 glomerular patches (224×224) from 365 WSIs across three centers and four stains (PAS, H&E, Jones, Trichrome), labeled as proliferative vs. non-proliferative. We evaluate Bayesian CNN and ViT backbones with Monte Carlo dropout in three settings: (1) stain-only classification; (2) a dual-head model jointly predicting lesion and stain with supervised stain loss; and (3) a dual-head model with label-free stain regularization via entropy maximization on the stain head. In (1), stain identity is trivially learnable, confirming a strong candidate shortcut. In (2), varying the strength and sign of stain supervision strongly modulates stain performance but leaves lesion metrics essentially unchanged, indicating no measurable stain-driven shortcut learning on this multi-stain, multi-center dataset, while overly adversarial stain penalties inflate predictive uncertainty. In (3), entropy-based regularization holds stain predictions near chance without degrading lesion accuracy or calibration. Overall, a carefully curated multi-stain dataset can be inherently robust to stain shortcuts, and a Bayesian dual-head architecture with label-free entropy regularization offers a simple, deployment-friendly safeguard against potential stain-related drift in glomerular AI.
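
The Monte Carlo dropout evaluation used across all three settings amounts to averaging class probabilities over repeated stochastic forward passes and scoring uncertainty by the entropy of that averaged prediction. The sketch below illustrates this under the assumption that the dropout-perturbed network can be modeled as a noisy logits function; the helper names are hypothetical, not the authors' API.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mc_dropout_predict(stochastic_logits_fn, x, T=20):
    """Average class probabilities over T stochastic forward passes
    (dropout kept active at test time) and score uncertainty by the
    predictive entropy of the mean distribution."""
    probs = np.stack([softmax(stochastic_logits_fn(x)) for _ in range(T)])
    mean_p = probs.mean(axis=0)
    pred_entropy = -(mean_p * np.log(mean_p + 1e-12)).sum(axis=-1)
    return mean_p, pred_entropy
```

In this framing, the paper's observation that overly adversarial stain penalties "inflate predictive uncertainty" corresponds to the predictive entropy rising across MC samples even when the mean-prediction accuracy is unchanged.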