ShapShift: Explaining Model Prediction Shifts with Subgroup Conditional Shapley Values

arXiv stat.ML / 4/14/2026

Key Points

  • The paper introduces “ShapShift,” a Shapley-value-based method to explain how changes in input distributions shift a model’s average predictions.
  • It attributes prediction shifts to changes in the conditional probabilities of interpretable data subgroups defined by decision-tree structure, first for single decision trees with exact explanations at split nodes.
  • The method is extended to tree ensembles by selecting the most explanatory tree and modeling the residual effects of the remaining trees.
  • A model-agnostic variant uses surrogate trees trained with a new objective, enabling the approach to be applied to non-tree models such as neural networks.
  • Although exact computation can be costly, the authors describe approximation techniques and report that the method yields simple, faithful, near-complete explanations useful for monitoring models in changing environments.
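The single-tree case in the key points can be sketched with a toy Shapley computation. This is a minimal illustration, not the paper's implementation: it assumes the "players" are the tree's split nodes, and that a coalition's value is the expected prediction obtained by using the *new* conditional split probabilities at the nodes in the coalition and the old ones elsewhere. The tiny two-split tree and all numbers are invented for illustration.

```python
from itertools import combinations
from math import factorial

# Hypothetical 2-split tree: node 0 routes to leaf A or to node 1;
# node 1 routes to leaf B or leaf C. Leaf values are the tree's predictions.
LEAF_VALUES = {"A": 0.2, "B": 0.6, "C": 0.9}

def expected_prediction(p_left):
    """Mean prediction given P(go left | reach node) at nodes 0 and 1."""
    p0, p1 = p_left
    return (p0 * LEAF_VALUES["A"]
            + (1 - p0) * p1 * LEAF_VALUES["B"]
            + (1 - p0) * (1 - p1) * LEAF_VALUES["C"])

def shapley_shift(p_old, p_new):
    """Attribute the shift in expected prediction to each split node.

    Coalition S uses the new conditional probabilities at the nodes in S
    and the old ones elsewhere (an assumed reading of the setup, not the
    paper's exact value function).
    """
    n = len(p_old)

    def value(S):
        mix = [p_new[i] if i in S else p_old[i] for i in range(n)]
        return expected_prediction(mix)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (value(set(S) | {i}) - value(set(S)))
    return phi

# Old vs. new conditional left-probabilities at the two split nodes.
p_old, p_new = (0.5, 0.5), (0.3, 0.7)
phi = shapley_shift(p_old, p_new)
total = expected_prediction(p_new) - expected_prediction(p_old)
# Efficiency: the per-node attributions sum to the overall prediction shift.
assert abs(sum(phi) - total) < 1e-12
```

Because the value function multiplies conditionals along root-to-leaf paths, the nodes interact, so the Shapley weighting over coalitions is doing real work; for deep trees or ensembles this exact enumeration is exponential in the number of players, which is where the paper's approximation techniques come in.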

Abstract

Changes in input distribution can induce shifts in the average predictions of machine learning models. Such prediction shifts may impact downstream business outcomes (e.g. a bank's loan approval rate), so understanding their causes can be crucial. We propose ShapShift: a Shapley value method for attributing prediction shifts to changes in the conditional probabilities of interpretable subgroups of data, where these subgroups are defined by the structure of decision trees. We initially apply this method to single decision trees, providing exact explanations based on conditional probability changes at split nodes. Next, we extend it to tree ensembles by selecting the most explanatory tree and accounting for residual effects. Finally, we propose a model-agnostic variant using surrogate trees grown with a novel objective function, allowing application to models like neural networks. While exact computation can be intensive, approximation techniques enable practical application. We show that ShapShift provides simple, faithful, and near-complete explanations of prediction shifts across model classes, aiding model monitoring in dynamic environments.
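The attribution described in the abstract has the shape of a standard Shapley value. A hedged sketch of the general form (the paper's exact value function may differ): let $N$ be the set of subgroup-defining elements (e.g. split nodes), and let $v(S)$ be the model's expected prediction when the elements in $S \subseteq N$ use the shifted conditional probabilities and the rest use the original ones. The contribution of element $i$ to the shift is then

```latex
\phi_i \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(|N|-|S|-1)!}{|N|!}
  \,\bigl( v(S \cup \{i\}) - v(S) \bigr),
```

and the efficiency axiom guarantees the attributions decompose the full shift, $\sum_{i \in N} \phi_i = \mathbb{E}_{Q}[f(X)] - \mathbb{E}_{P}[f(X)]$, where $P$ and $Q$ denote the input distributions before and after the change.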