Guideline-grounded retrieval-augmented generation for ophthalmic clinical decision support
arXiv cs.AI / 2026/3/24
Key points
- The paper introduces Oph-Guid-RAG, a multimodal visual retrieval-augmented generation system tailored for ophthalmology clinical question answering and decision support using ophthalmic guidelines as evidence sources.
- It treats each guideline page as an independent evidence unit and retrieves the page images directly to preserve critical visual structure such as tables, flowcharts, and layout information.
- The method uses a controllable retrieval framework (routing and filtering) plus query decomposition/rewriting, reranking, and multimodal reasoning to selectively incorporate external evidence and reduce irrelevant noise.
- Evaluated on HealthBench with doctor-based scoring, the approach reports substantial gains over GPT-5.x on the hard subset, in both overall score and accuracy.
- Ablation results indicate that reranking, routing, and retrieval design are key drivers of stable performance, and the authors note that further work is needed for completeness and robustness in real clinical settings.
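
The controllable retrieval flow described in the key points (routing, query decomposition, page-level retrieval, reranking, filtering) can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`Page`, `needs_retrieval`, `decompose`, `retrieve`, `trigger_terms`, `min_score`) is hypothetical, and plain token overlap on page text stands in for the multimodal retrieval over page images that the paper describes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One guideline page treated as an independent evidence unit."""
    doc: str
    number: int
    text: str  # assumption: text stands in for the page image used in the paper


def tokens(s: str) -> set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,?;:()").lower() for w in s.split()} - {""}


def needs_retrieval(query: str, trigger_terms: set[str]) -> bool:
    """Routing: consult the guidelines only if the query mentions a trigger term."""
    return bool(tokens(query) & trigger_terms)


def decompose(query: str) -> list[str]:
    """Query decomposition: naive split on ' and ' (placeholder for a model-based rewrite)."""
    return [part.strip() for part in query.split(" and ") if part.strip()]


def overlap(query: str, page: Page) -> float:
    """Fraction of query tokens that also appear on the page."""
    q = tokens(query)
    return len(q & tokens(page.text)) / len(q) if q else 0.0


def retrieve(query: str, pages: list[Page], trigger_terms: set[str],
             top_k: int = 2, min_score: float = 0.3) -> list[Page]:
    """Route, decompose, score each page per sub-query, rerank, then filter."""
    if not needs_retrieval(query, trigger_terms):
        return []  # router decided external evidence is not needed
    best: dict[Page, float] = {}
    for sub_query in decompose(query):
        for page in pages:
            score = overlap(sub_query, page)
            best[page] = max(best.get(page, 0.0), score)
    # Rerank by best sub-query score; ties keep the original page order.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    # Filter out weak matches to limit irrelevant noise, then cap at top_k.
    return [page for page, score in ranked if score >= min_score][:top_k]
```

In this sketch, a query that matches no trigger term skips retrieval entirely, and pages scoring below `min_score` are dropped before they reach the reasoning step, which mirrors the summary's point about selectively incorporating evidence. The threshold and `top_k` values are arbitrary illustrative defaults.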
