Guideline-grounded retrieval-augmented generation for ophthalmic clinical decision support

arXiv cs.AI · 2026-03-24


Key points

  • The paper introduces Oph-Guid-RAG, a multimodal visual retrieval-augmented generation system tailored for ophthalmology clinical question answering and decision support using ophthalmic guidelines as evidence sources.
  • It treats each guideline page as an independent evidence unit and retrieves the page images directly to preserve critical visual structure such as tables, flowcharts, and layout information.
  • The method uses a controllable retrieval framework (routing and filtering) plus query decomposition/rewriting, reranking, and multimodal reasoning to selectively incorporate external evidence and reduce irrelevant noise.
  • Evaluated on HealthBench with doctor-based scoring, the approach shows substantial gains on the hard subset over GPT-5-series baselines (GPT-5.2 and GPT-5.4), with improvements reported in both overall score and accuracy.
  • Ablation results indicate that reranking, routing, and retrieval design are key drivers of stable performance, and the authors note that further work is needed for completeness and robustness in real clinical settings.
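The pipeline the bullets describe (routing to decide whether retrieval is needed, query decomposition, page-level retrieval, reranking, and traceable page references) can be sketched in miniature. This is an illustrative toy, not the authors' implementation: the class and function names, the lexical overlap scoring, and the plain-text stand-in for page images are all our assumptions.

```python
# Hedged sketch of a controllable retrieval pipeline in the spirit of
# Oph-Guid-RAG: route -> decompose -> retrieve pages -> rerank -> answer
# with page references. Everything here (names, toy scoring) is illustrative.
from dataclasses import dataclass

@dataclass
class GuidelinePage:
    doc_id: str
    page_no: int
    text: str  # stand-in for the retrieved page image's content

def route(query: str) -> bool:
    """Routing: decide whether external guideline evidence is needed."""
    clinical_terms = ("glaucoma", "retinopathy", "cataract", "treatment")
    return any(t in query.lower() for t in clinical_terms)

def decompose(query: str) -> list[str]:
    """Query decomposition: split a compound question into sub-queries."""
    return [q.strip() for q in query.split(" and ") if q.strip()]

def retrieve(sub_query: str, pages: list[GuidelinePage], k: int = 2):
    """Toy lexical retrieval over page-level evidence units."""
    scored = [(sum(w in p.text.lower() for w in sub_query.lower().split()), p)
              for p in pages]
    scored.sort(key=lambda sp: -sp[0])
    return [p for s, p in scored[:k] if s > 0]  # filtering: drop zero-overlap pages

def rerank(query: str, candidates: list[GuidelinePage]):
    """Reranking: reorder candidates against the full query before generation."""
    return sorted(candidates,
                  key=lambda p: -sum(w in p.text.lower()
                                     for w in query.lower().split()))

def answer(query: str, pages: list[GuidelinePage]) -> dict:
    if not route(query):  # no retrieval needed -> avoid injecting noise
        return {"answer": "(answer without retrieval)", "refs": []}
    cands: list[GuidelinePage] = []
    for sq in decompose(query):
        cands.extend(retrieve(sq, pages))
    unique = list({id(p): p for p in cands}.values())  # dedupe shared hits
    top = rerank(query, unique)[:2]
    refs = [f"{p.doc_id} p.{p.page_no}" for p in top]  # traceable output
    return {"answer": "(grounded answer)", "refs": refs}
```

Usage: with a small corpus such as `[GuidelinePage("AAO-Glaucoma", 12, "first-line treatment for open-angle glaucoma"), ...]`, `answer("treatment for glaucoma and screening for retinopathy", pages)` returns page references, while a non-clinical query skips retrieval entirely.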

Abstract

In this work, we propose Oph-Guid-RAG, a multimodal visual RAG system for ophthalmology clinical question answering and decision support. We treat each guideline page as an independent evidence unit and directly retrieve page images, preserving tables, flowcharts, and layout information. We further design a controllable retrieval framework with routing and filtering, which selectively introduces external evidence and reduces noise. The system integrates query decomposition, query rewriting, retrieval, reranking, and multimodal reasoning, and provides traceable outputs with guideline page references. We evaluate our method on HealthBench using a doctor-based scoring protocol. On the hard subset, our approach improves the overall score from 0.2969 to 0.3861 (+0.0892, +30.0%) compared to GPT-5.2, and achieves higher accuracy, improving from 0.5956 to 0.6576 (+0.0620, +10.4%). Compared to GPT-5.4, our method achieves a larger accuracy gain of +0.1289 (+24.4%). These results show that our method is more effective on challenging cases that require precise, evidence-based reasoning. Ablation studies further show that reranking, routing, and retrieval design are critical for stable performance, especially under difficult settings. Overall, we show how combining vision-based retrieval with controllable reasoning can improve evidence grounding and robustness in clinical AI applications, while noting that further work is needed toward completeness in real clinical settings.
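The percentage improvements quoted in the abstract follow directly from the absolute scores. A quick reader-side check (variable names are ours, not the paper's):

```python
# Sanity check of the relative gains reported for the HealthBench hard subset
# (vs. GPT-5.2). This is a verification of the quoted arithmetic, nothing more.
hard_base, hard_ours = 0.2969, 0.3861   # overall score: GPT-5.2 vs. Oph-Guid-RAG
acc_base, acc_ours = 0.5956, 0.6576     # accuracy: GPT-5.2 vs. Oph-Guid-RAG

def gains(base: float, new: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain in percent)."""
    return new - base, (new - base) / base * 100

abs_score, pct_score = gains(hard_base, hard_ours)
abs_acc, pct_acc = gains(acc_base, acc_ours)
print(f"overall:  +{abs_score:.4f} ({pct_score:.1f}%)")  # +0.0892 (30.0%)
print(f"accuracy: +{abs_acc:.4f} ({pct_acc:.1f}%)")      # +0.0620 (10.4%)
```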