VLM-in-the-Loop: A Plug-In Quality Assurance Module for ECG Digitization Pipelines

arXiv cs.CV / 4/2/2026

Key Points

  • The paper introduces “VLM-in-the-Loop,” a plug-in quality assurance module that can wrap any ECG digitization backend with closed-loop VLM feedback without modifying the underlying digitization system.
  • Its core technique, tool grounding, anchors VLM judgments in quantitative evidence produced by domain-specific signal analysis tools to make assessments more reliable on real-world images.
  • In an ablation study using 200 records with paired ground truth, tool grounding increased verdict consistency from 71% to 89% and improved fidelity separation (ΔPCC from 0.03 to 0.08), with gains replicated across three different VLMs.
  • When deployed across four digitization backends, the module improved every one: 29.4% of borderline leads improved on the authors' own pipeline, 41.2% of failed limb leads were recovered on ECG-Digitiser, and valid leads per image more than doubled on Open-ECG-Digitizer (2.5 to 5.8).
  • On 428 real clinical HCM images, the integrated system achieved 98.0% “Excellent quality,” and the approach is positioned as domain-parametric for other tasks with objectively measurable quality criteria.
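
To make the plug-in idea in the points above concrete, here is a minimal sketch of a closed-loop wrapper with tool grounding. It is an illustration under stated assumptions, not the paper's implementation: the names `measure`, `qa_loop`, the `digitize` and `judge` callables, the three per-lead statistics, and the hint format are all hypothetical. The `judge` callable stands in for a VLM that sees the image together with the tool measurements.

```python
from typing import Callable, Dict, List, Tuple

Leads = Dict[str, List[float]]          # lead name -> digitized samples
Evidence = Dict[str, Dict[str, float]]  # lead name -> tool measurements
Verdicts = Dict[str, str]               # lead name -> "accept" or a retry hint


def measure(leads: Leads) -> Evidence:
    """Hypothetical signal-analysis tools: simple per-lead statistics."""
    evidence: Evidence = {}
    for name, samples in leads.items():
        n = len(samples)
        # Count consecutive sample pairs that are exactly equal (flat trace).
        flat = sum(1 for a, b in zip(samples, samples[1:]) if a == b)
        evidence[name] = {
            "n_samples": float(n),
            "amplitude_range": (max(samples) - min(samples)) if n else 0.0,
            "flat_fraction": flat / (n - 1) if n > 1 else 1.0,
        }
    return evidence


def qa_loop(
    digitize: Callable[[bytes, Dict[str, str]], Leads],
    judge: Callable[[bytes, Evidence], Verdicts],
    image: bytes,
    max_rounds: int = 3,
) -> Tuple[Leads, Verdicts, int]:
    """Wrap any backend: digitize, measure, judge, and retry flagged leads.

    The backend is only called through `digitize`, so it needs no changes.
    """
    hints: Dict[str, str] = {}
    leads: Leads = {}
    verdicts: Verdicts = {}
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        leads = digitize(image, hints)
        # Tool grounding (assumed form): the judge receives numbers, not
        # just pixels, and returns one verdict per lead.
        verdicts = judge(image, measure(leads))
        hints = {lead: v for lead, v in verdicts.items() if v != "accept"}
        if not hints:
            break
    return leads, verdicts, rounds
```

In this sketch the only contract between the module and a backend is the `digitize(image, hints)` callable, which is one way to read "wraps any digitization backend without modifying it"; the actual standardized interface is not described in this summary.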

Abstract

ECG digitization could unlock billions of archived clinical records, yet existing methods collapse on real-world images despite strong benchmark numbers. We introduce VLM-in-the-Loop, a plug-in quality assurance module that wraps any digitization backend with closed-loop VLM feedback via a standardized interface, requiring no modification to the underlying digitizer. The core mechanism is tool grounding: anchoring VLM assessment in quantitative evidence from domain-specific signal analysis tools. In a controlled ablation on 200 records with paired ground truth, tool grounding raises verdict consistency from 71% to 89% and doubles fidelity separation (ΔPCC 0.03 → 0.08), with the effect replicating across three VLMs (Claude Opus 4, GPT-4o, Gemini 2.5 Pro), confirming a pattern-level rather than model-specific gain. Deployed across four backends, the module improves every one: 29.4% of borderline leads improved on our pipeline; 41.2% of failed limb leads recovered on ECG-Digitiser; valid leads per image doubled on Open-ECG-Digitizer (2.5 → 5.8). On 428 real clinical HCM images, the integrated system reaches 98.0% Excellent quality. Both the plug-in architecture and tool-grounding mechanism are domain-parametric, suggesting broader applicability wherever quality criteria are objectively measurable.
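
The abstract reports two ablation metrics, verdict consistency and fidelity separation (ΔPCC), without defining them here. The sketch below shows one plausible way to compute them, assuming that PCC is the Pearson correlation between a digitized lead and its ground-truth signal, that ΔPCC is the mean PCC of accepted leads minus the mean PCC of rejected leads, and that verdict consistency is the fraction of records on which repeated VLM runs agree. These definitions and the function names are assumptions for illustration and may differ from the paper's.

```python
from math import sqrt
from typing import List, Sequence


def pcc(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("signals must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant trace counts as uncorrelated
    return cov / sqrt(vx * vy)


def delta_pcc(pccs: Sequence[float], verdicts: Sequence[str]) -> float:
    """Assumed fidelity separation: mean PCC(accepted) - mean PCC(rejected)."""
    accepted = [p for p, v in zip(pccs, verdicts) if v == "accept"]
    rejected = [p for p, v in zip(pccs, verdicts) if v != "accept"]
    if not accepted or not rejected:
        raise ValueError("need at least one accepted and one rejected lead")
    return sum(accepted) / len(accepted) - sum(rejected) / len(rejected)


def verdict_consistency(runs: Sequence[List[str]]) -> float:
    """Assumed consistency: share of records whose repeated verdicts all agree."""
    if not runs:
        raise ValueError("no records given")
    return sum(1 for r in runs if len(set(r)) == 1) / len(runs)
```

Under this reading, a larger ΔPCC means the verdicts separate faithful from unfaithful digitizations more sharply, which matches how the abstract uses the term.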