A Two-Stage LLM Framework for Accessible and Verified XAI Explanations

arXiv cs.AI / 4/15/2026


Key Points

  • The paper argues that LLM-generated XAI narratives often lack guarantees of accuracy, faithfulness, and completeness, and that existing evaluation is too subjective or post-hoc to protect end-users.
  • It proposes a Two-Stage LLM Meta-Verification Framework in which an Explainer LLM converts XAI outputs into natural language and a Verifier LLM checks the result for faithfulness, coherence, completeness, and hallucination risk.
  • An iterative refeed loop uses the Verifier’s feedback to refine the narratives, aiming to improve reliability rather than only score explanations after the fact.
  • Experiments across five XAI techniques and datasets, using three families of open-weight LLMs, find that verification helps filter unreliable explanations while improving linguistic accessibility versus using raw XAI outputs.
  • The authors analyze Entropy Production Rate (EPR) during refinement and conclude that Verifier feedback increasingly guides the Explainer toward more stable and coherent reasoning.

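The Explainer/Verifier refeed loop described above can be sketched as follows. This is a minimal, illustrative mock, not the authors' implementation: both LLM calls are stand-in functions, and the scoring scheme, acceptance threshold, and round limit are assumptions introduced for the example.

```python
# Sketch of a two-stage verify-and-refeed loop. The "LLMs" here are
# simple stubs; in practice each would be a call to an open-weight model.

def explainer_llm(xai_output: str, feedback: str) -> str:
    # Stand-in Explainer: turn raw XAI output into a narrative,
    # revising it when Verifier feedback is available.
    base = f"The model's top features were: {xai_output}."
    return base + (f" (revised per feedback: {feedback})" if feedback else "")

def verifier_llm(xai_output: str, narrative: str) -> tuple[float, str]:
    # Stand-in Verifier: score the narrative and return feedback.
    # Here we only check that the narrative cites the raw attributions
    # and reward a revision pass; a real Verifier would assess
    # faithfulness, coherence, completeness, and hallucination risk.
    score = 0.5 if xai_output in narrative else 0.0
    if "revised" in narrative:
        score += 0.5
    feedback = "" if score >= 0.8 else "ground every claim in the raw attributions"
    return score, feedback

def refine(xai_output: str, max_rounds: int = 3, threshold: float = 0.8):
    feedback = ""
    score = 0.0
    for _ in range(max_rounds):
        narrative = explainer_llm(xai_output, feedback)
        score, feedback = verifier_llm(xai_output, narrative)
        if score >= threshold:
            return narrative, score  # accepted: passed verification
    return None, score  # filtered out as unreliable

narrative, score = refine("age (+0.42), income (-0.17)")
print(narrative is not None and score >= 0.8)  # → True
```

The key design point is that the Verifier acts as a gate rather than a post-hoc scorer: a narrative that never clears the threshold is filtered out instead of being shown to the end-user.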
Abstract

Large Language Models (LLMs) are increasingly used to translate the technical outputs of eXplainable Artificial Intelligence (XAI) methods into accessible natural-language explanations. However, existing approaches often lack guarantees of accuracy, faithfulness, and completeness. At the same time, current efforts to evaluate such narratives remain largely subjective or confined to post-hoc scoring, offering no safeguards to prevent flawed explanations from reaching end-users. To address these limitations, this paper proposes a Two-Stage LLM Meta-Verification Framework that consists of (i) an Explainer LLM that converts raw XAI outputs into natural-language narratives, (ii) a Verifier LLM that assesses them in terms of faithfulness, coherence, completeness, and hallucination risk, and (iii) an iterative refeed mechanism that uses the Verifier's feedback to refine and improve them. Experiments across five XAI techniques and datasets, using three families of open-weight LLMs, show that verification is crucial for filtering unreliable explanations while improving linguistic accessibility compared with raw XAI outputs. In addition, the analysis of the Entropy Production Rate (EPR) during the refinement process indicates that the Verifier's feedback progressively guides the Explainer toward more stable and coherent reasoning. Overall, the proposed framework provides an efficient pathway toward more trustworthy and democratized XAI systems.
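The abstract does not give the exact definition of the Entropy Production Rate used in the analysis. As a hedged illustration only, the sketch below uses one simple proxy: the per-round change in the Shannon entropy of the narrative's token distribution, so a rate that settles near zero indicates the refinement has stabilized.

```python
# Illustrative proxy for tracking entropy across refinement rounds.
# This is an assumption for demonstration, not the paper's EPR formula.
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    # Shannon entropy (bits) of the whitespace-token distribution.
    tokens = text.split()
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def entropy_production_rate(narratives: list[str]) -> list[float]:
    # Difference in entropy between successive refinement rounds.
    entropies = [shannon_entropy(t) for t in narratives]
    return [b - a for a, b in zip(entropies, entropies[1:])]

rounds = [
    "feature age raised risk income lowered risk maybe maybe unclear",
    "age raised predicted risk while income lowered it",
    "age raised predicted risk while income lowered it",
]
rates = entropy_production_rate(rounds)
print(abs(rates[-1]) < 1e-9)  # → True: identical final rounds, rate is zero
```

Under this reading, the paper's finding that Verifier feedback guides the Explainer toward "more stable and coherent reasoning" would show up as the rate shrinking toward zero over successive refeed iterations.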