VerifAI: A Verifiable Open-Source Search Engine for Biomedical Question Answering

arXiv cs.AI / 4/13/2026



Key Points

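VerifAI is an open-source expert system for biomedical QA. It generates answers with RAG and adds a post-hoc claim verification mechanism that decomposes each generated answer into atomic claims and checks them against the supporting evidence.
It is built from three modules: a hybrid biomedical information retrieval (IR) module, a citation-aware generation component, and a verification component that detects subtle hallucinations. The verification component is reported to outperform GPT-4 on the HealthVer benchmark.
Verification with a fine-tuned NLI engine substantially reduces hallucinated citations compared with zero-shot baselines and provides a verifiable lineage of evidence for each claim.
The paper open-sources the full pipeline, including code, models, and datasets, with the aim of promoting trustworthy AI deployment in high-stakes domains.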
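
The decompose-then-verify idea in the points above can be illustrated with a minimal Python sketch. This is not the authors' implementation. The sentence splitter stands in for atomic-claim decomposition, and `toy_entailment_score` is a hypothetical lexical-overlap placeholder where a fine-tuned NLI model would go. The names, the two-label scheme, and the 0.8 threshold are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ClaimVerdict:
    claim: str
    label: str               # "SUPPORTED" or "UNSUPPORTED" (simplified label set)
    evidence: Optional[str]  # closest passage, or None if nothing overlaps
    score: float


def split_into_claims(answer: str) -> List[str]:
    """Hypothetical stand-in for atomic-claim decomposition: one claim per sentence."""
    parts = re.split(r"(?<=[.!?])\s+", answer.strip())
    return [p for p in parts if p]


def _tokens(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Placeholder for an NLI model: share of hypothesis tokens found in the premise."""
    hyp = _tokens(hypothesis)
    if not hyp:
        return 0.0
    prem = set(_tokens(premise))
    return sum(t in prem for t in hyp) / len(hyp)


def verify_answer(
    answer: str,
    passages: List[str],
    score_fn: Callable[[str, str], float] = toy_entailment_score,
    threshold: float = 0.8,  # assumed cut-off, chosen only for this example
) -> List[ClaimVerdict]:
    """Score every claim against every retrieved passage and keep the best match."""
    verdicts = []
    for claim in split_into_claims(answer):
        best_passage, best_score = None, 0.0
        for passage in passages:
            s = score_fn(passage, claim)
            if s > best_score:
                best_passage, best_score = passage, s
        label = "SUPPORTED" if best_score >= threshold else "UNSUPPORTED"
        verdicts.append(ClaimVerdict(claim, label, best_passage, best_score))
    return verdicts


passages = ["Clinical studies show that aspirin reduces fever and pain."]
verdicts = verify_answer("Aspirin reduces fever. Aspirin cures diabetes.", passages)
for v in verdicts:
    print(v.label, "|", v.claim)
```

Because the scorer is passed in as `score_fn`, the overlap heuristic could be swapped for a real entailment model without changing the loop. Each verdict keeps the passage it was checked against, which is one simple way to expose a per-claim evidence trail.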

Abstract

We introduce VerifAI, an open-source expert system for biomedical question answering that integrates retrieval-augmented generation (RAG) with a novel post-hoc claim verification mechanism. Unlike standard RAG systems, VerifAI ensures factual consistency by decomposing generated answers into atomic claims and validating them against retrieved evidence using a fine-tuned natural language inference (NLI) engine. The system comprises three modular components: (1) a hybrid Information Retrieval (IR) module optimized for biomedical queries (MAP@10 of 42.7%), (2) a citation-aware Generative Component fine-tuned on a custom dataset to produce referenced answers, and (3) a Verification Component that detects hallucinations with state-of-the-art accuracy, outperforming GPT-4 on the HealthVer benchmark. Evaluations demonstrate that VerifAI significantly reduces hallucinated citations compared to zero-shot baselines and provides a transparent, verifiable lineage for every claim. The full pipeline, including code, models, and datasets, is open-sourced to facilitate reliable AI deployment in high-stakes domains.
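
For readers unfamiliar with the retrieval metric quoted in the abstract, here is a minimal Python sketch of one common way to compute MAP@10. The abstract does not say which variant the authors use, so the normalization here (dividing by the smaller of the number of relevant documents and k) is an assumption, and the function names, document IDs, and judgments are hypothetical.

```python
from typing import Dict, Sequence, Set


def average_precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """AP@k for one query; assumes ranked_ids contains no duplicate IDs."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this hit's rank
    # Assumed normalization: best achievable number of hits within the top k.
    return precision_sum / min(len(relevant_ids), k)


def mean_average_precision_at_k(
    runs: Dict[str, Sequence[str]],
    qrels: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Mean of AP@k over all queries in `runs`."""
    scores = [average_precision_at_k(runs[q], qrels.get(q, set()), k) for q in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical rankings and relevance judgments for two queries.
runs = {"q1": ["d3", "d1", "d7"], "q2": ["d9", "d2"]}
qrels = {"q1": {"d1", "d7"}, "q2": {"d2"}}
print(round(mean_average_precision_at_k(runs, qrels), 4))
```

In this toy data, q1 has hits at ranks 2 and 3 (AP = (1/2 + 2/3) / 2 = 7/12) and q2 has one hit at rank 2 (AP = 1/2), so the mean is 13/24, roughly 0.54.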


