ICE: Intervention-Consistent Explanation Evaluation with Statistical Grounding for LLMs
arXiv cs.CL, March 20, 2026
Key Points
- The paper introduces ICE (Intervention-Consistent Explanation evaluation), a framework that compares explanations against matched random baselines via randomized tests under multiple intervention operators, yielding win rates with confidence intervals.
- It evaluates 7 LLMs across 4 English tasks, 6 non-English languages, and 2 attribution methods, finding that faithfulness is operator-dependent, with gaps of up to 44 percentage points; deletion inflates estimates on short text but reverses on long text.
- Randomized baselines reveal anti-faithfulness in about one-third of configurations, and faithfulness shows essentially no correlation with human plausibility.
- The study highlights dramatic model-language interactions not explained by tokenization, and the authors release the ICE framework and ICEBench benchmark.
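The core statistical idea described above, scoring an explanation only relative to matched random baselines and reporting a win rate with a confidence interval, can be sketched as follows. This is an illustrative reconstruction, not the authors' released ICE code: the function name, the use of a simple percentile bootstrap, and the input format (paired faithfulness scores for real vs. random explanations) are all assumptions.

```python
import random
import statistics

def win_rate_with_ci(expl_scores, rand_scores, n_boot=2000, alpha=0.05, seed=0):
    """Fraction of paired comparisons where an explanation's faithfulness
    score beats its matched random baseline, plus a percentile-bootstrap
    confidence interval on that win rate.

    expl_scores: per-example faithfulness scores of the real explanations
                 (e.g. prediction drop under some intervention operator).
    rand_scores: scores of matched random explanations on the same examples.
    NOTE: hypothetical sketch; the paper's exact test may differ.
    """
    rng = random.Random(seed)
    # Paired win indicators: 1 if the explanation outperformed its baseline.
    wins = [1.0 if e > r else 0.0 for e, r in zip(expl_scores, rand_scores)]
    point = statistics.mean(wins)
    # Bootstrap resampling of the win indicators to get a CI on the win rate.
    boots = []
    for _ in range(n_boot):
        sample = [wins[rng.randrange(len(wins))] for _ in wins]
        boots.append(statistics.mean(sample))
    boots.sort()
    lo = boots[int((alpha / 2) * n_boot)]
    hi = boots[int((1 - alpha / 2) * n_boot) - 1]
    return point, (lo, hi)
```

Under this framing, a win rate whose confidence interval sits below 0.5 would correspond to the "anti-faithfulness" cases the paper reports, where explanations do worse than random.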