MedFormer-UR: Uncertainty-Routed Transformer for Medical Image Classification

arXiv cs.AI / 4/13/2026


Key Points

  • The paper proposes MedFormer-UR, a prototype-based Medical Vision Transformer that improves clinical safety by providing calibrated, uncertainty-aware predictions rather than relying only on high accuracy.
  • It uses a Dirichlet distribution to estimate per-token evidential uncertainty and routes information through the transformer to localize ambiguity in real time.
  • Uncertainty is integrated into training as an active mechanism that filters out unreliable feature updates, aiming to reduce overconfident behavior common in noisy, imbalanced clinical data.
  • Class-specific prototypes are employed to keep the embedding space structured so decisions can be made based on visual similarity.
  • Experiments across mammography, ultrasound, MRI, and histopathology show up to a 35% reduction in expected calibration error (ECE) and improved selective prediction, even when accuracy improvements are modest.
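The Dirichlet-based evidential uncertainty described above can be sketched in a few lines. This is a minimal illustration of the general Subjective Logic formulation (evidence plus one gives Dirichlet parameters; uncertainty mass is the class count divided by the Dirichlet strength), not the paper's actual implementation; the function name, array shapes, and toy evidence values are hypothetical.

```python
import numpy as np

def evidential_uncertainty(evidence):
    """Per-token Dirichlet uncertainty from non-negative evidence.

    evidence: hypothetical (num_tokens, num_classes) array.
    Returns per-class belief masses, expected class probabilities,
    and an uncertainty mass u = K / S per token, where
    alpha = evidence + 1 and S = sum(alpha).
    """
    evidence = np.asarray(evidence, dtype=float)
    K = evidence.shape[-1]                 # number of classes
    alpha = evidence + 1.0                 # Dirichlet parameters
    S = alpha.sum(axis=-1, keepdims=True)  # Dirichlet strength
    belief = evidence / S                  # per-class belief mass
    prob = alpha / S                       # expected class probabilities
    u = K / S.squeeze(-1)                  # uncertainty mass per token
    return belief, prob, u

# Two toy tokens: strong class-0 evidence vs. near-zero evidence.
ev = np.array([[9.0, 1.0, 0.0],
               [0.1, 0.1, 0.1]])
_, probs, u = evidential_uncertainty(ev)
# The ambiguous second token receives a much higher uncertainty mass,
# which a routing mechanism could use to down-weight its feature updates.
```

A routing scheme like the one the paper describes could then gate or filter token contributions whose `u` exceeds a threshold during training.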

Abstract

To ensure safe clinical integration, deep learning models must provide more than just high accuracy; they require dependable uncertainty quantification. While current Medical Vision Transformers perform well, they frequently struggle with overconfident predictions and a lack of transparency, issues that are magnified by the noisy and imbalanced nature of clinical data. To address this, we enhance the Medical Transformer (MedFormer) with prototype-based learning and uncertainty-guided routing. By using a Dirichlet distribution to estimate per-token evidential uncertainty, our framework can quantify and localize ambiguity in real time. This uncertainty is not just an output but an active participant in the training process, filtering out unreliable feature updates. Furthermore, class-specific prototypes keep the embedding space structured, allowing decisions to be made on the basis of visual similarity. Testing across four modalities (mammography, ultrasound, MRI, and histopathology) confirms that our approach significantly enhances model calibration, reducing expected calibration error (ECE) by up to 35%, and improves selective prediction, even when accuracy gains are modest.
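The expected calibration error (ECE) used to evaluate calibration above has a standard binned definition: partition predictions by confidence, then take the weighted average of the gap between accuracy and mean confidence within each bin. A minimal sketch of that standard metric (the function name and bin count are illustrative, not taken from the paper):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin weight) * |accuracy - confidence|.

    confidences: predicted confidence per sample, in (0, 1].
    correct: 1.0 if the prediction was right, else 0.0.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)  # samples in bin
        if mask.any():
            acc = correct[mask].mean()        # empirical accuracy in bin
            conf = confidences[mask].mean()   # mean confidence in bin
            ece += mask.mean() * abs(acc - conf)
    return ece

# Toy example: five predictions at 0.9 confidence, four of them correct.
# The single occupied bin contributes |0.8 - 0.9| = 0.1.
```

A "35% reduction in ECE" then simply means this number dropped by roughly a third relative to the baseline model on the same test set.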