LUMOS: Universal Semi-Supervised OCT Retinal Layer Segmentation with Hierarchical Reliable Mutual Learning

arXiv cs.CV / 4/8/2026

Key Points

  • LUMOS is a universal semi-supervised framework for OCT retinal layer segmentation designed to address scarce annotations and the presence of heterogeneous label granularities across datasets.
  • The method uses a dual-decoder network with a hierarchical prompting strategy (DDN-HPS) to mitigate pseudo-label noise during semi-supervised training.
  • It also introduces reliable progressive multi-granularity learning (RPML), which applies region-level reliability weighting and trains progressively from easier to harder tasks to enable stable cross-granularity alignment.
  • Experiments on six OCT datasets show that LUMOS substantially improves performance over existing approaches and achieves strong cross-domain and cross-granularity generalization.
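
To make "heterogeneous label granularities" concrete, the sketch below shows one common way to compare a fine-grained prediction against a coarser annotation: collapse fine classes into coarse groups through a lookup table. This is an illustration only, not the paper's implementation. The five-class to three-class hierarchy in `FINE_TO_COARSE` and the function names are hypothetical, and the summary does not say how LUMOS actually maps between granularities.

```python
import numpy as np

# Hypothetical hierarchy (an assumption for illustration, not from the paper):
# fine class index -> coarse class index.
FINE_TO_COARSE = np.array([0, 1, 1, 2, 2])


def coarsen_labels(fine_labels):
    """Map an integer label map of fine classes to coarse classes."""
    return FINE_TO_COARSE[fine_labels]


def coarsen_probs(fine_probs):
    """Sum fine-class probabilities (C_fine, H, W) into coarse classes."""
    n_coarse = int(FINE_TO_COARSE.max()) + 1
    coarse = np.zeros((n_coarse,) + fine_probs.shape[1:], dtype=fine_probs.dtype)
    # Unbuffered add: every fine channel is accumulated into its coarse channel.
    np.add.at(coarse, FINE_TO_COARSE, fine_probs)
    return coarse


if __name__ == "__main__":
    labels = np.array([[0, 1, 2], [3, 4, 0]])
    print(coarsen_labels(labels))
    probs = np.full((5, 2, 3), 0.2)  # uniform fine-class probabilities
    print(coarsen_probs(probs)[:, 0, 0])
```

Summing probabilities rather than re-running an argmax keeps the coarse output a valid distribution, so it can be compared directly with a coarse label or a coarse prediction from another branch.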

Abstract

Optical Coherence Tomography (OCT) layer segmentation faces challenges due to annotation scarcity and heterogeneous label granularities across datasets. While semi-supervised learning helps alleviate label scarcity, existing methods typically assume a fixed granularity and therefore fail to fully exploit cross-granularity supervision. This paper presents LUMOS, a semi-supervised universal OCT retinal layer segmentation framework based on a Dual-Decoder Network with a Hierarchical Prompting Strategy (DDN-HPS) and Reliable Progressive Multi-granularity Learning (RPML). DDN-HPS combines a dual-branch architecture with a multi-granularity prompting strategy to suppress pseudo-label noise propagation. RPML introduces region-level reliability weighting and a progressive training approach that guides the model from easier to more difficult tasks. This ensures reliable selection of cross-granularity consistency targets and thereby achieves stable cross-granularity alignment. Experiments on six OCT datasets demonstrate that LUMOS substantially outperforms existing methods and exhibits strong cross-domain and cross-granularity generalization capability.
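
The abstract names region-level reliability weighting and easy-to-hard progressive training without giving formulas, so the following is only a minimal sketch of how such ideas are often realized. It is not the authors' method. The assumptions here are: reliability is measured as the agreement rate between two decoders inside fixed square patches, the pseudo-label loss is a weighted cross-entropy, and "easier" means coarser granularity levels that are unlocked first. All function names and the `patch` parameter are hypothetical.

```python
import numpy as np


def region_reliability(probs_a, probs_b, patch=4):
    """Per-pixel weight map from patch-wise agreement of two decoders.

    probs_a, probs_b: (C, H, W) class probabilities; H and W must be
    divisible by `patch`. Each pixel gets the agreement rate of its patch.
    """
    agree = (probs_a.argmax(axis=0) == probs_b.argmax(axis=0)).astype(np.float64)
    h, w = agree.shape
    if h % patch or w % patch:
        raise ValueError("H and W must be divisible by patch")
    blocks = agree.reshape(h // patch, patch, w // patch, patch).mean(axis=(1, 3))
    # Broadcast each patch score back to pixel resolution.
    return np.repeat(np.repeat(blocks, patch, axis=0), patch, axis=1)


def weighted_pseudo_label_loss(probs, pseudo_labels, weights, eps=1e-8):
    """Reliability-weighted cross-entropy against integer pseudo-labels."""
    picked = np.take_along_axis(probs, pseudo_labels[None], axis=0)[0]
    ce = -np.log(picked + eps)
    return float((weights * ce).sum() / (weights.sum() + eps))


def active_levels(epoch, total_epochs, n_levels):
    """Number of granularity levels trained at `epoch` (0-based), coarse first."""
    # Integer ceiling of (epoch + 1) * n_levels / total_epochs.
    levels = ((epoch + 1) * n_levels + total_epochs - 1) // total_epochs
    return max(1, min(n_levels, levels))
```

Normalizing the loss by the sum of weights keeps its scale comparable across batches with different amounts of reliable area, and the integer schedule avoids floating-point rounding at epoch boundaries.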