The Query Channel: Information-Theoretic Limits of Masking-Based Explanations

arXiv cs.AI / 4/21/2026


Key Points

  • The paper reinterprets masking-based post-hoc explanation methods (e.g., KernelSHAP, LIME) as a communication problem over a “query channel,” where each masked model evaluation is treated like a channel use.
  • It characterizes the complexity of an explanation through the entropy of the hypothesis class and defines a per-query identification capacity that limits how much information each query can deliver.
  • A strong converse result shows that when the required explanation recovery rate exceeds this capacity, exact recovery becomes impossible: the probability of error goes to one regardless of the explainer/decoder sequence.
  • The authors also provide an achievability theorem, showing that at rates below capacity, a sparse maximum-likelihood decoder attains reliable exact recovery.
  • Experiments and benchmarks (including a Monte Carlo mutual-information estimator) show information-theoretic conditions where explanations are theoretically feasible while common convex surrogates can still fail, and they analyze how resolution/tokenization choices and noise degrade the “channel.”
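The capacity comparison in the bullets above can be made concrete with a toy sketch. The snippet below is an illustrative Monte Carlo mutual-information estimate, not the paper's estimator: it assumes a hypothetical setting where the explanation is a k-sparse support over d features, each query applies a random binary mask, and the model returns the masked sum plus Gaussian noise. The per-query information is then compared against the entropy of the hypothesis class to get a query-budget floor.

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)
d, k, sigma = 8, 2, 0.5  # illustrative sizes, not from the paper

# Hypothesis class: all k-sparse supports over d features (uniform prior)
supports = [np.array([1.0 if i in c else 0.0 for i in range(d)])
            for c in itertools.combinations(range(d), k)]
H_bits = math.log2(len(supports))  # entropy of the hypothesis class

def gauss_logpdf(y, mu, s=sigma):
    return -0.5 * ((y - mu) / s) ** 2 - math.log(s * math.sqrt(2 * math.pi))

# Monte Carlo estimate of the per-query mutual information I(S; Y | M):
# average log-ratio of the conditional likelihood to the class marginal.
N = 20000
total_bits = 0.0
for _ in range(N):
    s = supports[rng.integers(len(supports))]  # draw a latent explanation
    m = rng.integers(0, 2, size=d)             # one random binary mask (query)
    y = float(m @ s) + rng.normal(0.0, sigma)  # noisy masked response
    log_cond = gauss_logpdf(y, float(m @ s))
    log_marg = np.logaddexp.reduce(
        [gauss_logpdf(y, float(m @ sp)) for sp in supports]
    ) - math.log(len(supports))
    total_bits += (log_cond - log_marg) / math.log(2)
I_per_query = total_bits / N

print(f"H(class) = {H_bits:.2f} bits, I per query ≈ {I_per_query:.2f} bits")
print(f"information-theoretic query floor ≈ {H_bits / I_per_query:.1f} queries")
```

The ratio H(class) / I(per query) is the non-asymptotic flavor of benchmark the paper uses: no decoder, convex or otherwise, can reliably identify the explanation with fewer queries than this floor.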

Abstract

Masking-based post-hoc explanation methods, such as KernelSHAP and LIME, estimate local feature importance by querying a black-box model under randomized perturbations. This paper formulates this procedure as communication over a query channel, where the latent explanation acts as a message and each masked evaluation is a channel use. Within this framework, the complexity of the explanation is captured by the entropy of the hypothesis class, while the query interface supplies information at a rate determined by an identification capacity per query. We derive a strong converse showing that, if the explanation rate exceeds this capacity, the probability of error necessarily converges to one for any sequence of explainers and decoders. We also prove an achievability result establishing that a sparse maximum-likelihood decoder attains reliable recovery when the rate lies below capacity. A Monte Carlo estimator of mutual information yields a non-asymptotic query benchmark that we use to compare optimal decoding with Lasso- and OLS-based procedures that mirror LIME and KernelSHAP. Experiments reveal a range of query budgets where information theory permits reliable explanations but standard convex surrogates still fail. Finally, we interpret super-pixel resolution and tokenization for neural language models as a source-coding choice that sets the entropy of the explanation and show how Gaussian noise and nonlinear curvature degrade the query channel, induce waterfall and error-floor behavior, and render high-resolution explanations unattainable.
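The decoder comparison in the abstract can be illustrated with a minimal sketch, assuming the same toy masking setup as above (k-sparse support, random binary masks, Gaussian noise); this is not the paper's experimental protocol. The "sparse maximum-likelihood decoder" here is an exhaustive least-squares search over all k-sparse supports, and the convex surrogate is an OLS fit followed by a top-k threshold, loosely mirroring the KernelSHAP/LIME-style procedures mentioned above.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
d, k, sigma, n_queries, trials = 8, 2, 0.5, 12, 200  # illustrative sizes

# Hypothesis class: index tuples of the k active features
supports = list(itertools.combinations(range(d), k))

def run_trial():
    true = supports[rng.integers(len(supports))]
    s = np.zeros(d)
    s[list(true)] = 1.0
    M = rng.integers(0, 2, size=(n_queries, d)).astype(float)  # random masks
    y = M @ s + rng.normal(0.0, sigma, n_queries)              # noisy responses
    # Sparse maximum-likelihood decoder: exhaustive search over k-sparse
    # supports, picking the one with the smallest squared residual.
    best = min(supports,
               key=lambda sp: float(np.sum((y - M[:, list(sp)].sum(axis=1)) ** 2)))
    ml_ok = best == true
    # OLS surrogate: fit linear weights, keep the top-k coordinates.
    w, *_ = np.linalg.lstsq(M, y, rcond=None)
    ols_ok = set(np.argsort(w)[-k:].tolist()) == set(true)
    return ml_ok, ols_ok

res = np.array([run_trial() for _ in range(trials)])
print(f"exact-recovery rate  ML: {res[:, 0].mean():.2f}  "
      f"OLS top-k: {res[:, 1].mean():.2f}")
```

Sweeping `n_queries` in a sketch like this is one way to expose the gap the paper reports: a band of budgets where the ML decoder recovers the support reliably while the convex surrogate still falls short of exact recovery.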