AI Navigate

Breaking the Chain: A Causal Analysis of LLM Faithfulness to Intermediate Structures

arXiv cs.AI / 3/18/2026

Key Points

  • The authors present a causal evaluation protocol to determine whether intermediate structures in schema-guided LLM reasoning causally determine final outputs.
  • Across eight models and three benchmarks, models appear self-consistent with their own intermediate structures but fail to update their predictions after an intervention in up to 60% of cases, showing that apparent faithfulness is fragile once the structure changes.
  • When deriving the final decision from the structure is delegated to an external tool, this fragility largely disappears, suggesting that the model itself lets the structure influence the outcome but does not reliably let it mediate the outcome.
  • Prompts that emphasize the intermediate structure over the original input do not materially close the gap, indicating intermediate structures act as influential context rather than stable causal mediators.

Abstract

Schema-guided reasoning pipelines ask LLMs to produce explicit intermediate structures -- rubrics, checklists, verification queries -- before committing to a final decision. But do these structures causally determine the output, or merely accompany it? We introduce a causal evaluation protocol that makes this directly measurable: by selecting tasks where a deterministic function maps intermediate structures to decisions, every controlled edit implies a unique correct output. Across eight models and three benchmarks, models appear self-consistent with their own intermediate structures but fail to update predictions after intervention in up to 60% of cases -- revealing that apparent faithfulness is fragile once the intermediate structure changes. When derivation of the final decision from the structure is delegated to an external tool, this fragility largely disappears; however, prompts which ask to prioritize the intermediate structure over the original input do not materially close the gap. Overall, intermediate structures in schema-guided pipelines function as influential context rather than stable causal mediators.
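
The intervention test described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the checklist format, the all-items-must-pass rule in `decide`, the single-item flip in `intervene`, and the two toy models are all hypothetical stand-ins chosen to show the shape of the protocol.

```python
from typing import Callable, Dict, List, Tuple

Checklist = Dict[str, bool]                # hypothetical intermediate structure
Model = Callable[[str, Checklist], str]    # (task input, structure) -> decision
Case = Tuple[str, Checklist, str]          # (task input, structure, item to edit)


def decide(checklist: Checklist) -> str:
    """Assumed deterministic rule: accept only if every checklist item passes."""
    return "accept" if all(checklist.values()) else "reject"


def intervene(checklist: Checklist, item: str) -> Checklist:
    """Controlled edit: flip one item on a copy, leaving the original untouched."""
    edited = dict(checklist)
    edited[item] = not edited[item]
    return edited


def update_failure_rate(cases: List[Case], model: Model) -> float:
    """Fraction of interventions where the model's decision differs from
    the decision implied by the edited structure."""
    failures = 0
    for task_input, checklist, item in cases:
        edited = intervene(checklist, item)
        if model(task_input, edited) != decide(edited):
            failures += 1
    return failures / len(cases)


# Toy model 1: delegates the decision to the deterministic rule (the "external tool" setting).
def tool_model(task_input: str, checklist: Checklist) -> str:
    return decide(checklist)


# Toy model 2: answers from the original input and ignores the edited structure.
ORIGINAL_DECISIONS = {"essay A": "accept", "essay B": "reject"}


def sticky_model(task_input: str, checklist: Checklist) -> str:
    return ORIGINAL_DECISIONS[task_input]


cases: List[Case] = [
    ("essay A", {"cites_sources": True, "on_topic": True}, "on_topic"),
    ("essay B", {"cites_sources": True, "on_topic": False}, "on_topic"),
]

print(update_failure_rate(cases, tool_model))
print(update_failure_rate(cases, sticky_model))
```

Because `decide` is deterministic, each edit implies exactly one correct decision, so any mismatch after the edit can be attributed to the model not following the structure. A real evaluation would replace the toy models with LLM calls and the toy rule with the task's actual mapping.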