Anatomy-Aware Unsupervised Detection and Localization of Retinal Abnormalities in Optical Coherence Tomography

arXiv cs.LG / 4/27/2026


Key Points

  • The paper addresses a key bottleneck in OCT (Optical Coherence Tomography) analysis—expert lesion annotations are costly and labor-intensive—by proposing an unsupervised anomaly detection method that does not require abnormality labels.
  • It trains a discrete latent model on normal B-scans to learn the distribution of healthy OCT anatomy, then detects and localizes abnormalities at inference time via reconstruction discrepancies (a minimal scoring sketch follows this list).
  • To improve robustness in clinical settings, the method adds retinal layer-aware supervision and structured triplet learning to better separate healthy versus pathological representations across different imaging devices and patient groups.
  • Experiments show strong results on the Kermany dataset (AUROC 0.799), improved cross-dataset generalization on Srinivasan (AUROC 0.884), and competitive segmentation performance on the RETOUCH external benchmark (Dice 0.200, mIoU 0.117).
  • Overall, the approach demonstrates reproducibility across institutions and outperforms several prior unsupervised anomaly detection baselines, including VAE-, VQVAE-, and VQGAN-style models.
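
To make the detection step concrete, here is a minimal sketch of reconstruction-discrepancy scoring, assuming a PyTorch VQ-VAE-style autoencoder already trained only on healthy B-scans. The `model` interface, the error metric, and the threshold are illustrative assumptions, not the paper's exact implementation.

```python
import torch

def anomaly_scores(model, bscan, threshold=0.5):
    """Score a B-scan by how poorly a healthy-only autoencoder reconstructs it.

    `model` is assumed to be a VQ-VAE-style network trained exclusively on
    normal OCT B-scans and to return a reconstruction of its input;
    `bscan` is a (1, 1, H, W) tensor scaled to [0, 1].
    """
    model.eval()
    with torch.no_grad():
        recon = model(bscan)                      # reconstruction from the discrete latent code

    # Pixel-level anomaly map: absolute reconstruction error
    pixel_map = (recon - bscan).abs().squeeze()   # (H, W)

    # Image-level score: mean reconstruction error over the scan
    image_score = pixel_map.mean().item()

    # Coarse localization mask by thresholding the normalized error map
    norm_map = pixel_map / (pixel_map.max() + 1e-8)
    mask = (norm_map > threshold).float()
    return image_score, pixel_map, mask
```

In this scheme, regions the healthy-only model cannot reconstruct well (e.g., fluid pockets or drusen) produce large errors, which is what enables both image-level detection and pixel-level localization without lesion labels.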

Abstract

Reliable automated analysis of Optical Coherence Tomography (OCT) imaging is crucial for diagnosing retinal disorders but faces a critical barrier: the need for expensive, labor-intensive expert annotations. Supervised deep learning models struggle to generalize across diverse pathologies, imaging devices, and patient populations due to their restricted vocabulary of annotated abnormalities. We propose an unsupervised anomaly detection framework that learns the normative distribution of healthy retinal anatomy without lesion annotations, directly addressing annotation efficiency challenges in clinical deployment. Our approach leverages a discrete latent model trained on normal B-scans to capture OCT-specific structural patterns. To enhance clinical robustness, we incorporate retinal layer-aware supervision and structured triplet learning to separate healthy from pathological representations, improving model reliability across varied imaging conditions. During inference, anomalies are detected and localized via reconstruction discrepancies, enabling both image- and pixel-level identification without requiring disease-specific labels. On the Kermany dataset (AUROC: 0.799), our method substantially outperforms VAE, VQVAE, VQGAN, and f-AnoGAN baselines. Critically, cross-dataset evaluation on the Srinivasan dataset achieves AUROC 0.884 with superior generalization, demonstrating robust domain adaptation. On the external RETOUCH benchmark, unsupervised anomaly segmentation achieves competitive Dice (0.200) and mIoU (0.117) scores, validating reproducibility across institutions.
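
The structured triplet learning mentioned in the abstract can be sketched as below, assuming a PyTorch encoder that maps B-scans to embedding vectors. How the paper actually constructs negative examples is not specified here; synthetically corrupted healthy scans are used purely as an illustrative stand-in for pathological representations.

```python
import torch
import torch.nn as nn

# Standard triplet margin loss: pulls anchor/positive embeddings together and
# pushes negative embeddings at least `margin` away (in L2 distance).
triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)

def triplet_step(encoder, healthy_a, healthy_b, corrupted):
    """One training step separating healthy from pathological-like embeddings.

    healthy_a, healthy_b : two batches of healthy B-scans (anchor / positive)
    corrupted            : perturbed healthy scans acting as pseudo-anomalies (negative);
                           a hypothetical stand-in for the paper's negatives
    """
    anchor = encoder(healthy_a)
    positive = encoder(healthy_b)
    negative = encoder(corrupted)
    return triplet_loss(anchor, positive, negative)
```

A term like this, combined with the reconstruction objective, encourages the latent space itself to separate healthy anatomy from deviations, which is consistent with the reported robustness across imaging devices and patient groups.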