H-Sets: Hessian-Guided Discovery of Set-Level Feature Interactions in Image Classifiers

arXiv cs.AI / 4/27/2026


Key Points

  • The paper argues that current feature attribution methods mostly capture marginal effects and miss higher-order feature interactions, which are crucial for interpretability in image classifiers.
  • It introduces H-Sets, a two-stage approach that first uses input Hessians to find locally interacting feature pairs and then recursively merges them into semantically coherent feature sets, using Segment Anything (SAM) segmentation as a spatial grouping prior.
  • It also proposes IDG-Vis, a set-level extension of Integrated Directional Gradients that traces directional gradients along pixel-space paths and aggregates contributions using Harsanyi dividends to attribute each discovered set.
  • Although the Hessian-based detection adds extra computation, experiments on VGG, ResNet, DenseNet, and MobileNet across ImageNet and CUB show that H-Sets produces sparser and more faithful saliency maps than prior interaction attribution methods.
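The first stage's core idea, detecting interacting feature pairs from the magnitude of the input Hessian's cross-terms, can be sketched in a few lines. The snippet below is a hypothetical illustration using central finite differences in place of the autograd Hessians the paper presumably uses; the function name and interface are assumptions, not the authors' code.

```python
import numpy as np

def pairwise_interaction_scores(f, x, eps=1e-4):
    """Estimate |d^2 f / dx_i dx_j| for every feature pair of input x.

    Illustrative stand-in for Hessian-based interaction detection:
    the mixed second derivative measures how strongly features i and j
    jointly influence the scalar model output f(x) around x.
    """
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = eps
            e_j = np.zeros(n); e_j[j] = eps
            # Central-difference estimate of the mixed partial derivative.
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps ** 2)
    scores = np.abs(H)
    np.fill_diagonal(scores, 0.0)  # keep only cross-terms (pairwise interactions)
    return scores

# Toy check: f(x) = x0 * x1 has mixed partial 1, so the pair (0, 1) scores ~1.
scores = pairwise_interaction_scores(lambda x: x[0] * x[1], np.array([1.0, 2.0]))
```

In practice an image-sized Hessian is far too large to form densely, which is the "additional compute" the key points mention; the paper restricts detection to local pairs, with SAM segments serving as the spatial grouping prior.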

Abstract

Feature attribution methods explain the predictions of deep neural networks by assigning importance scores to individual input features. However, most existing methods focus solely on marginal effects, overlooking feature interactions, where groups of features jointly influence model output. Such interactions are especially important in image classification tasks, where semantic meaning often arises from pixel interdependencies rather than isolated features. Existing interaction-based methods for images are either coarse (e.g., superpixel-only) or fail to satisfy core interpretability axioms. In this work, we introduce H-Sets, a novel two-stage framework for discovering and attributing higher-order feature interactions in image classifiers. First, we detect locally interacting pairs via input Hessians and recursively merge them into semantically coherent sets; segmentation from Segment Anything (SAM) is used as a spatial grouping prior but can be replaced by other segmentations. Second, we attribute each set with IDG-Vis, a set-level extension of Integrated Directional Gradients that integrates directional gradients along pixel-space paths and aggregates them with Harsanyi dividends. While Hessians introduce additional compute at the detection stage, this targeted cost consistently yields saliency maps that are sparser and more faithful. Evaluations across VGG, ResNet, DenseNet, and MobileNet models on ImageNet and CUB datasets show that H-Sets generates more interpretable and faithful saliency maps compared to existing methods.
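The Harsanyi dividends used to aggregate set-level contributions have a standard closed form: for a set function v over feature coalitions, the dividend of a set S is d(S) = Σ_{T⊆S} (−1)^{|S|−|T|} v(T), i.e., the part of v(S) not already explained by its proper subsets. A minimal sketch of this aggregation (illustrative only; the function name and the toy value function are assumptions, and the paper applies the computation to its discovered feature sets rather than raw subsets):

```python
from itertools import combinations

def harsanyi_dividends(v, features):
    """Compute the Harsanyi dividend d(S) = sum over T subset of S of
    (-1)^(|S|-|T|) * v(T), for every subset S of `features`.

    `v` maps a frozenset of feature indices (a coalition) to the model's
    value when exactly those features are present.
    """
    subsets = [frozenset(c) for r in range(len(features) + 1)
               for c in combinations(features, r)]
    dividends = {}
    for S in subsets:
        d = 0.0
        for r in range(len(S) + 1):
            for T in combinations(S, r):
                # Inclusion-exclusion over all subsets T of S.
                d += (-1) ** (len(S) - r) * v(frozenset(T))
        dividends[S] = d
    return dividends

# Toy value function: features 0 and 1 each contribute 1 alone,
# plus a joint bonus of 2 when both are present.
v = lambda T: len(T & {0}) + len(T & {1}) + (2.0 if {0, 1} <= T else 0.0)
d = harsanyi_dividends(v, [0, 1])
# d[frozenset({0, 1})] isolates the pure interaction effect (the bonus of 2).
```

By construction, the dividends of all subsets of the full feature set sum back to v of the full set, which is what makes them a natural bookkeeping device for attributing each discovered set's joint effect without double-counting marginal contributions.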