When AI and Experts Agree on Error: Intrinsic Ambiguity in Dermatoscopic Images
arXiv cs.CV / 4/2/2026
Key Points
- The study proposes that AI misclassifications in dermatoscopic diagnosis may reflect intrinsic visual ambiguity in the images rather than solely model bias.
- Across multiple CNN architectures, the authors isolate a subset of images that are systematically misclassified by all models, showing this error pattern occurs significantly more often than would be expected by chance.
- Expert dermatologists exhibit a major performance collapse on these AI-misclassified “difficult” images, with agreement to ground truth dropping sharply (Cohen’s kappa 0.08 vs. 0.61 for controls) and inter-rater reliability weakening (Fleiss kappa 0.275 vs. 0.456).
- The research identifies image quality as a key factor driving both model and human failure modes, suggesting that data-quality limitations can undermine automated and expert diagnosis alike.
- To support transparency and reproducibility, the authors publicly release the data, code, and trained models alongside the arXiv submission.
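The Cohen's kappa figures cited above measure chance-corrected agreement between two raters (here, a dermatologist and the ground-truth label). As a rough intuition for why 0.08 signals near-chance agreement while 0.61 signals substantial agreement, the statistic can be computed from scratch; a minimal sketch (the label values are illustrative, not taken from the paper's dataset):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two equal-length label sequences."""
    assert len(rater_a) == len(rater_b) and rater_a, "need paired, non-empty ratings"
    n = len(rater_a)
    # Observed agreement: fraction of items both raters label identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[lab] / n) * (freq_b[lab] / n)
              for lab in set(freq_a) | set(freq_b))
    if p_e == 1.0:  # degenerate case: both raters used a single identical label
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Perfect agreement yields kappa = 1.0; agreement no better than the
# raters' marginal frequencies predict yields kappa near 0.
print(cohens_kappa(["mel", "nev", "mel", "nev"],
                   ["mel", "nev", "mel", "nev"]))  # 1.0
```

A kappa of 0.08, as reported for the "difficult" images, means the experts' labels matched ground truth barely more often than their marginal label frequencies alone would predict.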