Risk-Calibrated Learning: Minimizing Fatal Errors in Medical AI

arXiv cs.CV / 15 Apr 2026


Key Points

  • Deep learning for medical imaging can make “high-confidence but semantically incoherent” mistakes (e.g., malignant vs. benign) that are more damaging than errors caused by normal visual ambiguity.
  • The paper introduces Risk-Calibrated Learning, which uses a confusion-aware clinical severity matrix integrated into the training objective to explicitly separate visual ambiguity errors from catastrophic structural errors.
  • The proposed approach reduces critical error rates (false negatives) across four imaging modalities (brain tumor MRI, dermoscopy, breast histopathology, and prostate histopathology) without requiring complex architecture changes.
  • Experiments show relative safety improvements over state-of-the-art baselines (e.g., Focal Loss) ranging from 20.0% to 92.4%, and the method generalizes across both CNN and Transformer architectures.

Abstract

Deep learning models often achieve expert-level accuracy in medical image classification but suffer from a critical flaw: semantic incoherence. High-confidence mistakes that are semantically incoherent (e.g., classifying a malignant tumor as benign) differ fundamentally from acceptable errors that stem from visual ambiguity. Unlike safe, fine-grained disagreements, these fatal failures erode clinical trust. To address this, we propose Risk-Calibrated Learning, a technique that explicitly distinguishes between visual ambiguity (fine-grained errors) and catastrophic structural errors. By embedding a confusion-aware clinical severity matrix M into the optimization landscape, our method suppresses critical errors (false negatives) without requiring complex architectural changes. We validate our approach on four imaging modalities: Brain Tumor MRI, ISIC 2018 (dermoscopy), BreaKHis (breast histopathology), and SICAPv2 (prostate histopathology). Extensive experiments demonstrate that our Risk-Calibrated Loss consistently reduces the Critical Error Rate (CER) on all four datasets, achieving relative safety improvements ranging from 20.0% (breast histopathology) to 92.4% (prostate histopathology) over state-of-the-art baselines such as Focal Loss. These results confirm that our method offers a superior safety-accuracy trade-off across both CNN and Transformer architectures.
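The abstract does not spell out the exact form of the loss, but the core idea of embedding a clinical severity matrix M into the training objective can be sketched as a severity-weighted cross-entropy. The sketch below is a hypothetical minimal implementation, not the paper's actual formulation: `risk_calibrated_loss` and the example matrix `M` are illustrative assumptions, where `M[i][j]` encodes the clinical cost of predicting class `j` when the true class is `i` (so a malignant-as-benign false negative is penalized far more heavily than the reverse).

```python
import numpy as np

def risk_calibrated_loss(probs, labels, M):
    """Hypothetical severity-weighted cross-entropy (illustrative sketch).

    probs  : (n, C) array of softmax probabilities
    labels : (n,)   array of integer ground-truth classes
    M      : (C, C) clinical severity matrix; M[i][j] is the cost of
             predicting class j when the true class is i (M[i][i] = 0)
    """
    n = len(labels)
    # Per-sample expected clinical severity of the predictive distribution:
    # sum over predicted classes j of M[y_i, j] * p_ij
    expected_risk = np.einsum("ij,ij->i", M[labels], probs)
    # Standard cross-entropy on the true class
    ce = -np.log(probs[np.arange(n), labels] + 1e-12)
    # Scale cross-entropy up when the model's probability mass sits on
    # clinically dangerous confusions, leaving benign confusions lightly penalized
    return float(np.mean(ce * (1.0 + expected_risk)))

# Toy binary setup: class 0 = benign, class 1 = malignant.
# Missing a malignancy (row 1, column 0) costs 5x a benign false alarm.
M = np.array([[0.0, 1.0],
              [5.0, 0.0]])
```

With this `M`, a model that confidently calls a malignant case benign (`probs=[0.9, 0.1]`, `label=1`) incurs a much larger loss than one making the mirror-image mistake on a benign case, which is the asymmetry the paper's method is built around.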