Fundus Image-based Glaucoma Screening via Retinal Knowledge-Oriented Dynamic Multi-Level Feature Integration

arXiv cs.CV / April 15, 2026


Key Points

  • The paper proposes a knowledge-guided glaucoma screening framework for color fundus photography, addressing the limited robustness of purely data-driven deep learning models on heterogeneous clinical datasets.
  • It uses a tri-branch architecture to combine global retinal context, optic disc/cup structural features, and dynamically localized pathological cues rather than relying on fixed anatomical regions (a minimal sketch of this fusion follows the list).
  • A Dynamic Window Mechanism is introduced to adaptively locate diagnostically informative image regions, improving reliability when pathological signs fall outside predefined areas.
  • Retinal anatomical knowledge is injected through a Knowledge-Enhanced Convolutional Attention module, which uses priors extracted from a pre-trained foundation model to guide attention learning.
  • Experiments on the large-scale AIROGS dataset report an AUC of 98.5% and an accuracy of 94.6%, with additional evaluations on multiple datasets from the SMDG-19 benchmark showing strong cross-domain generalization.
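
For intuition, here is a minimal PyTorch sketch of one way a tri-branch design like this could fuse its three feature streams. The class name `TriBranchScreeningNet`, the stand-in encoders, and the concatenation-based fusion are illustrative assumptions; the summary does not specify the paper's actual backbones or fusion rule.

```python
import torch
import torch.nn as nn

class TriBranchScreeningNet(nn.Module):
    """Illustrative tri-branch classifier (hypothetical, not the paper's
    exact design): one encoder per view -- whole fundus image, optic
    disc/cup crop, dynamically localized region -- fused by concatenation."""

    def __init__(self, feat_dim: int = 256, num_classes: int = 2):
        super().__init__()
        self.global_branch = self._make_encoder(feat_dim)
        self.disc_cup_branch = self._make_encoder(feat_dim)
        self.lesion_branch = self._make_encoder(feat_dim)
        self.classifier = nn.Linear(3 * feat_dim, num_classes)

    @staticmethod
    def _make_encoder(feat_dim: int) -> nn.Sequential:
        # Stand-in CNN encoder; the real backbones are not given in the summary.
        return nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, full_image, disc_cup_crop, dynamic_crop):
        g = self.global_branch(full_image)       # global retinal context
        d = self.disc_cup_branch(disc_cup_crop)  # optic disc/cup structure
        p = self.lesion_branch(dynamic_crop)     # dynamically localized cues
        return self.classifier(torch.cat([g, d, p], dim=1))
```

In use, the three inputs would be the full fundus image, an optic disc/cup crop, and the crop returned by the Dynamic Window Mechanism (sketched after the abstract below).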

Abstract

Automated diagnosis based on color fundus photography is essential for large-scale glaucoma screening. However, existing deep learning models are typically data-driven and lack explicit integration of retinal anatomical knowledge, which limits their robustness across heterogeneous clinical datasets. Moreover, pathological cues in fundus images may appear beyond predefined anatomical regions, making fixed-region feature extraction insufficient for reliable diagnosis. To address these challenges, we propose a retinal knowledge-oriented glaucoma screening framework that integrates dynamic multi-scale feature learning with domain-specific retinal priors. The framework adopts a tri-branch structure to capture complementary retinal representations, including global retinal context, structural features of the optic disc/cup, and dynamically localized pathological regions. A Dynamic Window Mechanism is devised to adaptively identify diagnostically informative regions, while a Knowledge-Enhanced Convolutional Attention Module incorporates retinal priors extracted from a pre-trained foundation model to guide attention learning. Extensive experiments on the large-scale AIROGS dataset demonstrate that the proposed method outperforms diverse baselines, achieving an AUC of 98.5% and an accuracy of 94.6%. Additional evaluations on multiple datasets from the SMDG-19 benchmark further confirm its strong cross-domain generalization capability, indicating that knowledge-guided attention combined with adaptive lesion localization can significantly improve the robustness of automated glaucoma screening systems.
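
To make the two named mechanisms concrete, the following hedged PyTorch sketch pairs a saliency-peak crop, standing in for the Dynamic Window Mechanism, with a channel-attention gate driven by a prior embedding, standing in for the Knowledge-Enhanced Convolutional Attention Module. The function and class names, the argmax scoring rule, and the sigmoid channel gating are assumptions made for illustration; the abstract does not give the actual formulations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dynamic_window_crop(images, saliency, window: int = 224):
    """Crop each image around the peak of a per-image saliency map rather
    than at a fixed anatomical location. `saliency` has shape (B, h, w),
    e.g. a coarse attention map; assumes H and W are at least `window`.
    The paper's actual region-scoring rule may differ from this argmax."""
    B, _, H, W = images.shape
    sal = F.interpolate(saliency.unsqueeze(1), size=(H, W),
                        mode="bilinear", align_corners=False).squeeze(1)
    half = window // 2
    crops = []
    for b in range(B):
        flat_idx = torch.argmax(sal[b]).item()   # peak over the (H, W) map
        cy, cx = divmod(flat_idx, W)
        # Clamp the window so it stays fully inside the image bounds.
        top = max(0, min(cy - half, H - window))
        left = max(0, min(cx - half, W - window))
        crops.append(images[b:b + 1, :, top:top + window, left:left + window])
    return torch.cat(crops, dim=0)

class KnowledgePriorAttention(nn.Module):
    """Channel attention gated by a prior embedding, e.g. features from a
    frozen foundation model; a rough analogue of knowledge-enhanced
    convolutional attention, not the paper's exact module."""

    def __init__(self, channels: int, prior_dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(prior_dim, channels), nn.Sigmoid())

    def forward(self, feats, prior):
        # feats: (B, C, H, W); prior: (B, prior_dim)
        weights = self.gate(prior).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        return feats * weights  # reweight channels using the retinal prior
```

Here `prior` would come from a frozen pre-trained foundation model's embedding of the fundus image; gating convolutional channels with such an embedding is one simple way domain priors can steer attention learning, in the spirit of, though not identical to, the module the paper describes.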