Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Language Models

arXiv cs.AI / 4/25/2026


Key Points

  • The paper introduces DAVinCI, a Dual Attribution and Verification framework aimed at reducing LLM hallucinations and improving the trustworthiness of generated claims.
  • DAVinCI works in two stages: attributing each claim to internal model components and external sources, then verifying the claim via entailment-based reasoning with confidence calibration.
  • Experiments on datasets such as FEVER and CLIMATE-FEVER show that DAVinCI improves multiple metrics (including classification accuracy and F1) by 5–20% over verification-only baselines.
  • An ablation study identifies key contributors to performance, including evidence span selection, recalibration thresholds, and retrieval quality.
  • The authors also provide a modular implementation that can be integrated into existing LLM pipelines to support auditable and accountable AI systems.
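The two-stage flow summarized above can be sketched as a small pipeline. Everything below is illustrative: the class and function names (`Claim`, `attribute`, `verify`), the placeholder attribution, and the lexical-overlap stand-in for an entailment model are assumptions, not the authors' actual implementation or API.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    internal_components: list = field(default_factory=list)  # e.g. attention heads (placeholder)
    external_sources: list = field(default_factory=list)     # e.g. retrieved passages

def attribute(claim: Claim, retriever) -> Claim:
    """Stage 1 (sketch): attach internal and external attributions to a claim."""
    claim.external_sources = retriever(claim.text)
    claim.internal_components = ["layer_12.head_3"]  # hypothetical internal attribution
    return claim

def verify(claim: Claim, entailment_score, threshold: float = 0.7) -> dict:
    """Stage 2 (sketch): entailment-based verification with a calibrated threshold."""
    scores = [entailment_score(src, claim.text) for src in claim.external_sources]
    best = max(scores, default=0.0)
    return {
        "claim": claim.text,
        "supported": best >= threshold,  # decision via calibrated confidence threshold
        "confidence": best,
        "evidence": claim.external_sources,
    }

# Toy stand-ins for a retriever and an NLI model (not part of the paper).
def toy_retriever(query: str) -> list:
    return ["The Eiffel Tower is in Paris."]

def toy_entailment(premise: str, hypothesis: str) -> float:
    # crude lexical-overlap proxy for an entailment probability
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(p & h) / max(len(h), 1)

claim = attribute(Claim("The Eiffel Tower is in Paris."), toy_retriever)
result = verify(claim, toy_entailment)
```

In a real integration, `toy_retriever` would be a dense or BM25 retriever over a corpus such as the FEVER Wikipedia dump, and `toy_entailment` an NLI model; the returned dict is one plausible shape for an auditable verification record.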

Abstract

Large Language Models (LLMs) have demonstrated remarkable fluency and versatility across a wide range of NLP tasks, yet they remain prone to factual inaccuracies and hallucinations. This limitation poses significant risks in high-stakes domains such as healthcare, law, and scientific communication, where trust and verifiability are paramount. In this paper, we introduce DAVinCI, a Dual Attribution and Verification framework designed to enhance the factual reliability and interpretability of LLM outputs. DAVinCI operates in two stages: (i) it attributes generated claims to internal model components and external sources; (ii) it verifies each claim using entailment-based reasoning and confidence calibration. We evaluate DAVinCI across multiple datasets, including FEVER and CLIMATE-FEVER, and compare its performance against standard verification-only baselines. Our results show that DAVinCI significantly improves classification accuracy, attribution precision, recall, and F1-score by 5–20%. Through an extensive ablation study, we isolate the contributions of evidence span selection, recalibration thresholds, and retrieval quality. We also release a modular DAVinCI implementation that can be integrated into existing LLM pipelines. By bridging attribution and verification, DAVinCI offers a scalable path to auditable, trustworthy AI systems. This work contributes to the growing effort to make LLMs not only powerful but also accountable.
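The abstract's "recalibration thresholds" ablation implies tuning the confidence cutoff used for verification decisions. One common way to do this, sketched below purely as an assumption (the paper's exact calibration procedure is not described here), is to sweep candidate thresholds over held-out (score, label) pairs and keep the one maximizing F1.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 of the binary decision `score >= threshold` against gold labels."""
    preds = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum((not p) and y for p, y in zip(preds, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def calibrate_threshold(scores, labels, grid=None):
    """Pick the threshold on a grid that maximizes dev-set F1 (a simple
    stand-in for whatever recalibration the paper actually uses)."""
    grid = grid or [i / 100 for i in range(101)]
    return max(grid, key=lambda t: f1_at_threshold(scores, labels, t))

# toy dev set: entailment confidences and gold "supported" labels
dev_scores = [0.9, 0.8, 0.4, 0.3, 0.6]
dev_labels = [True, True, False, False, True]
best_t = calibrate_threshold(dev_scores, dev_labels)
```

On this toy data any threshold separating 0.4 from 0.6 achieves perfect F1; in practice the dev set would come from FEVER-style annotations, and the chosen threshold would then be frozen for evaluation.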