Illocutionary Explanation Planning for Source-Faithful Explanations in Retrieval-Augmented Language Models

arXiv cs.CL / 4/9/2026


Key Points

  • The paper evaluates source faithfulness and traceability of LLM-generated explanations in retrieval-augmented generation (RAG) for programming education using 90 Stack Overflow questions grounded in three textbooks, benchmarking six LLMs with source-adherence metrics.
  • Results show that non-RAG models have 0% median source adherence, while baseline RAG achieves only modest median adherence (22–40%), indicating explanations often remain only partially grounded in the cited sources.
  • Building on Achinstein's illocutionary theory of explanation, the authors propose illocutionary macro-planning and implement it via chain-of-illocution (CoI) prompting, which decomposes a query into implicit explanatory sub-questions to better drive retrieval.
  • CoI produces statistically significant improvements in source adherence for most models (up to 63%), though absolute adherence remains moderate and some models see weak or non-significant gains.
  • A user study (165 retained participants) finds that improved source adherence does not reduce user satisfaction, relevance, or perceived correctness, supporting the practical value of the prompting approach.
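The CoI pipeline described above — expand the query into implicit explanatory sub-questions, retrieve evidence for each, then generate one explanation grounded in the pooled passages — can be sketched as follows. The paper's exact prompts and components are not reproduced here; all names (`COI_DECOMPOSE_PROMPT`, the `decompose`/`retrieve`/`answer` callables) are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical prompt template; the paper's actual CoI prompt is not given here.
COI_DECOMPOSE_PROMPT = (
    "List the implicit explanatory sub-questions (what/why/how) a learner "
    "would need answered to understand this question:\n{query}"
)

@dataclass
class Passage:
    source: str  # e.g. textbook title + section, for traceability
    text: str

def chain_of_illocution(
    query: str,
    decompose: Callable[[str], List[str]],         # LLM call returning sub-questions
    retrieve: Callable[[str], List[Passage]],      # retriever over the textbook corpus
    answer: Callable[[str, List[Passage]], str],   # grounded generation step
) -> Dict[str, object]:
    """Sketch of CoI prompting: expand the query into explanatory
    sub-questions, retrieve per sub-question, then generate one
    explanation grounded in the pooled (de-duplicated) evidence."""
    sub_questions = decompose(COI_DECOMPOSE_PROMPT.format(query=query))
    passages: List[Passage] = []
    seen = set()
    for sq in sub_questions:
        for p in retrieve(sq):
            if p.text not in seen:  # drop evidence shared across sub-questions
                seen.add(p.text)
                passages.append(p)
    return {
        "sub_questions": sub_questions,
        "evidence": passages,
        "explanation": answer(query, passages),
    }
```

Because the three stages are injected as callables, the same skeleton works with any LLM backend and retriever; swapping `decompose` for an identity function recovers the baseline single-query RAG setup the paper compares against.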

Abstract

Natural language explanations produced by large language models (LLMs) are often persuasive, but not necessarily scrutable: users cannot easily verify whether the claims in an explanation are supported by evidence. In XAI, this motivates a focus on faithfulness and traceability, i.e., the extent to which an explanation's claims can be grounded in, and traced back to, an explicit source. We study these desiderata in retrieval-augmented generation (RAG) for programming education, where textbooks provide authoritative evidence. We benchmark six LLMs on 90 Stack Overflow questions grounded in three programming textbooks and quantify source faithfulness via source-adherence metrics. We find that non-RAG models have a median source adherence of 0%, while baseline RAG systems still exhibit low median adherence (22–40%, depending on the model). Motivated by Achinstein's illocutionary theory of explanation, we introduce illocutionary macro-planning as a descriptive design principle for source-faithful explanations and instantiate it with chain-of-illocution (CoI) prompting, which expands a query into implicit explanatory questions that drive retrieval. Across models, CoI yields statistically significant gains (up to 63%) in source adherence, although absolute adherence remains moderate and the gains are weak or non-significant for some models. A user study with 165 retained participants (220 recruited) indicates that these gains do not harm satisfaction, relevance, or perceived correctness.