EviSearch: A Human in the Loop System for Extracting and Auditing Clinical Evidence for Systematic Reviews

arXiv cs.CL / 4/17/2026


Key Points

  • EviSearch is a multi-agent, human-in-the-loop system that extracts ontology-aligned clinical evidence tables from trial PDFs while preserving layout and figures.
  • It guarantees per-cell provenance for auditability by combining a PDF-query agent, a retrieval-guided search agent, and a reconciliation module that enforces page-level verification when outputs disagree.
  • Evaluated on a clinician-curated oncology benchmark, EviSearch improves extraction accuracy over strong parsed-text baselines while achieving broad provenance attribution coverage.
  • The system logs reconciliation decisions and reviewer edits to generate preference/supervision signals that can bootstrap iterative improvements of extraction models.
  • EviSearch targets living systematic review workflows by accelerating evidence synthesis, reducing manual curation burden, and offering a safer, auditable path for integrating LLM-based extraction.
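
The reconciliation step described in the key points can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `CellCandidate`, `ReconciledCell`, and `reconcile`, the whitespace/case normalization, and the rule of provisionally preferring the PDF-query agent's value are all assumptions made for illustration. The sketch only shows the general idea that each table cell carries a page reference, and that disagreement between the two agents flags the cell for page-level verification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CellCandidate:
    """One agent's proposal for a single evidence-table cell (hypothetical type)."""
    value: Optional[str]  # extracted value, or None if the agent found nothing
    page: Optional[int]   # page the agent cites as provenance
    agent: str            # e.g. "pdf_query" or "search"


@dataclass(frozen=True)
class ReconciledCell:
    """Reconciled cell with provenance and a verification flag (hypothetical type)."""
    field: str
    value: Optional[str]
    page: Optional[int]
    needs_page_check: bool


def _norm(value: Optional[str]) -> Optional[str]:
    # Assumed normalization: compare values case-insensitively, ignoring extra spaces.
    return None if value is None else " ".join(value.lower().split())


def reconcile(field: str, pdf_query: CellCandidate, search: CellCandidate) -> ReconciledCell:
    """Accept the value when both agents agree; otherwise flag for page-level review."""
    agree = pdf_query.value is not None and _norm(pdf_query.value) == _norm(search.value)
    if agree:
        page = pdf_query.page if pdf_query.page is not None else search.page
        # Agreement without any cited page still needs a check, since provenance is missing.
        return ReconciledCell(field, pdf_query.value, page, needs_page_check=page is None)
    # Disagreement (or nothing found): keep a provisional value and force verification.
    preferred = pdf_query if pdf_query.value is not None else search
    return ReconciledCell(field, preferred.value, preferred.page, needs_page_check=True)
```

For example, if one agent reports a hazard ratio of "0.71" on page 6 and the other reports "0.77" on page 3, `reconcile` returns a cell with `needs_page_check=True`, so a reviewer or a verification pass would inspect the cited pages before the value is accepted.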

Abstract

We present EviSearch, a multi-agent extraction system that automates the creation of ontology-aligned clinical evidence tables directly from native trial PDFs while guaranteeing per-cell provenance for audit and human verification. EviSearch pairs a PDF-query agent (which preserves rendered layout and figures) with a retrieval-guided search agent and a reconciliation module that forces page-level verification when agents disagree. The pipeline is designed for high-precision extraction across multimodal evidence sources (text, tables, figures) and for generating reviewer-actionable provenance that clinicians can inspect and correct. On a clinician-curated benchmark of oncology trial papers, EviSearch substantially improves extraction accuracy relative to strong parsed-text baselines while providing comprehensive attribution coverage. By logging reconciler decisions and reviewer edits, the system produces structured preference and supervision signals that bootstrap iterative model improvement. EviSearch is intended to accelerate living systematic review workflows, reduce manual curation burden, and provide a safe, auditable path for integrating LLM-based extraction into evidence synthesis pipelines.
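
To make the idea of logging reconciler decisions and reviewer edits as supervision signals more concrete, here is a small Python sketch. The abstract does not specify a log format, so the `ReviewEvent` record, the `to_preference_pairs` and `to_jsonl` helpers, and the chosen/rejected pair layout are hypothetical. The sketch assumes that any agent proposal differing from the final, reviewer-approved value can be treated as a "rejected" example.

```python
import json
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ReviewEvent:
    """One logged decision for a single cell (hypothetical record layout)."""
    doc_id: str
    field: str
    candidates: Dict[str, str]  # agent name -> value that agent proposed
    final_value: str            # value after reconciliation and any reviewer edit
    page: Optional[int]         # page cited as provenance for the final value


def to_preference_pairs(event: ReviewEvent) -> List[dict]:
    """Emit one chosen/rejected pair per agent proposal that differs from the final value."""
    pairs = []
    for agent, value in event.candidates.items():
        if value != event.final_value:
            pairs.append({
                "doc_id": event.doc_id,
                "field": event.field,
                "chosen": event.final_value,
                "rejected": value,
                "rejected_source": agent,
                "page": event.page,
            })
    return pairs


def to_jsonl(events: List[ReviewEvent]) -> str:
    """Serialize all preference pairs as JSON Lines, one pair per line."""
    return "\n".join(
        json.dumps(pair, sort_keys=True)
        for event in events
        for pair in to_preference_pairs(event)
    )
```

Under these assumptions, an event where the reviewer confirms "0.71" over a competing "0.77" yields a single pair with `chosen="0.71"` and `rejected="0.77"`. Proposals that match the final value produce no pair, since they carry no preference signal in this simplified scheme.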