Guideline2Graph: Profile-Aware Multimodal Parsing for Executable Clinical Decision Graphs

arXiv cs.LG / 4/6/2026

Key Points

Guideline2Graph proposes a decomposition-first pipeline that converts multimodal, branching clinical practice guidelines into executable decision graphs for clinical decision support (CDS) while preserving cross-page continuity.
The method uses topology-aware chunking, interface-constrained chunk graph generation with explicit entry/terminal interfaces, and provenance-preserving global aggregation to keep induced control flow auditable and consistent.
Unlike one-shot or mostly local/text-centric LLM/VLM extraction approaches, it applies semantic deduplication and global consolidation to better represent end-to-end guideline control flow.
Evaluated on an adjudicated prostate-guideline benchmark using matched inputs and the same underlying VLM backbone, the approach substantially improves graph quality metrics (e.g., edge/triplet precision/recall and node recall) versus existing methods.
The authors conclude the approach is promising but note evidence is currently limited to a single adjudicated prostate guideline, motivating broader multi-guideline validation.
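
The aggregation idea in the key points above (chunk graphs with explicit entry/terminal interfaces, merged with deduplication and provenance) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the `ChunkGraph` structure, the `normalize`, `aggregate`, and `unmatched_terminals` helpers, and the node labels are all hypothetical, and simple case and whitespace folding stands in for whatever semantic deduplication the paper actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkGraph:
    """Hypothetical per-chunk decision graph with explicit interfaces."""
    chunk_id: str
    pages: tuple                     # source pages covered by this chunk
    edges: list                      # (source_label, condition, target_label)
    entry: set = field(default_factory=set)      # labels where flow enters
    terminal: set = field(default_factory=set)   # labels where flow exits


def normalize(label: str) -> str:
    """Toy stand-in for semantic deduplication: fold case and whitespace."""
    return " ".join(label.lower().split())


def aggregate(chunks):
    """Merge chunk graphs into one edge map, keeping provenance per edge."""
    merged = {}  # (src, condition, dst) -> set of (chunk_id, pages)
    for chunk in chunks:
        for src, cond, dst in chunk.edges:
            key = (normalize(src), normalize(cond), normalize(dst))
            merged.setdefault(key, set()).add((chunk.chunk_id, chunk.pages))
    return merged


def unmatched_terminals(chunks):
    """Terminal labels that no other chunk declares as an entry point."""
    result = {}
    for chunk in chunks:
        other_entries = {
            normalize(e) for c in chunks if c is not chunk for e in c.entry
        }
        missing = {normalize(t) for t in chunk.terminal} - other_entries
        if missing:
            result[chunk.chunk_id] = missing
    return result


# Invented labels, not taken from any guideline.
chunk_a = ChunkGraph(
    chunk_id="A", pages=(1, 2),
    edges=[("Initial assessment", "criterion met", "Further workup")],
    entry={"Initial assessment"},
    terminal={"Further workup"},
)
chunk_b = ChunkGraph(
    chunk_id="B", pages=(3,),
    edges=[
        ("further  workup", "result positive", "Management plan"),
        ("Initial assessment", "Criterion met", "Further Workup"),  # duplicate
    ],
    entry={"Further Workup"},
    terminal={"Management plan"},
)

graph = aggregate([chunk_a, chunk_b])
dangling = unmatched_terminals([chunk_a, chunk_b])
```

In this toy run the duplicated edge collapses into a single entry whose provenance lists both chunks, and the only unmatched terminal is the end of chunk B, which is the end of the flow. A check of this kind is one way a merged graph could stay auditable, since every edge can be traced back to the pages it came from.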

Abstract

Clinical practice guidelines are long, multimodal documents whose branching recommendations are difficult to convert into executable clinical decision support (CDS), and one-shot parsing often breaks cross-page continuity. Recent LLM/VLM extractors are mostly local or text-centric, under-specifying section interfaces and failing to consolidate cross-page control flow across full documents into one coherent decision graph. We present a decomposition-first pipeline that converts full-guideline evidence into an executable clinical decision graph through topology-aware chunking, interface-constrained chunk graph generation, and provenance-preserving global aggregation. Rather than relying on single-pass generation, the pipeline uses explicit entry/terminal interfaces and semantic deduplication to preserve cross-page continuity while keeping the induced control flow auditable and structurally consistent. We evaluate on an adjudicated prostate-guideline benchmark with matched inputs and the same underlying VLM backbone across compared methods. On the complete merged graph, our approach improves edge and triplet precision/recall from 19.6%/16.1% in existing models to 69.0%/87.5%, while node recall rises from 78.1% to 93.8%. These results support decomposition-first, auditable guideline-to-CDS conversion on this benchmark, while current evidence remains limited to one adjudicated prostate guideline and motivates broader multi-guideline validation.
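
To make the reported graph metrics concrete, the following Python sketch shows one common way edge, triplet, and node scores can be computed from sets of (source, condition, target) triplets. This summary does not give the paper's matching rules, so exact string matching is an assumption here, as are the function names `precision_recall` and `graph_scores` and the toy triplets; the paper's evaluation may match nodes and edges differently.

```python
def precision_recall(predicted: set, reference: set):
    """Exact-match precision and recall between two sets of items."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def graph_scores(pred_triplets: set, ref_triplets: set):
    """Score edges (src, dst), triplets (src, cond, dst), and node recall."""
    pred_edges = {(s, d) for s, _, d in pred_triplets}
    ref_edges = {(s, d) for s, _, d in ref_triplets}
    pred_nodes = {n for s, _, d in pred_triplets for n in (s, d)}
    ref_nodes = {n for s, _, d in ref_triplets for n in (s, d)}
    return {
        "edge": precision_recall(pred_edges, ref_edges),
        "triplet": precision_recall(pred_triplets, ref_triplets),
        "node_recall": precision_recall(pred_nodes, ref_nodes)[1],
    }


# Toy graphs with invented labels, not data from the benchmark.
reference = {("a", "yes", "b"), ("a", "no", "c"), ("b", "done", "d")}
predicted = {("a", "yes", "b"), ("a", "maybe", "c"), ("b", "done", "e")}

scores = graph_scores(predicted, reference)
```

In this toy case two of three predicted edges connect the right nodes, but only one of three triplets also carries the right condition, which shows why triplet scores are the stricter measure of recovered control flow.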