Towards interpretable AI with quantum annealing feature selection

arXiv cs.LG / 4/29/2026

Key Points

  • The paper aims to improve interpretability of deep learning models, focusing on explaining image-classification decisions made by convolutional neural networks.
  • It introduces a method that selects the most important feature maps per prediction by formulating the task as a combinatorial optimization problem.
  • The combinatorial problem is encoded as a quantum constrained optimization problem and solved using quantum annealing.
  • Compared with leading explainable AI baselines such as GradCAM and GradCAM++, the method shows better class disentanglement, yielding clearer and more transparent decision boundaries.
  • The authors also analyze the quantum annealing algorithm’s computational behavior (e.g., minimum energy gap and success probability) to explain why the approach works in practice.
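The summary does not give the paper's exact objective, so the following is only a minimal sketch of how selecting feature maps could be written as a quadratic unconstrained binary optimization (QUBO) problem, the form that quantum annealers typically accept. The per-map `importance` scores, the pairwise `redundancy` matrix, the target count `k`, and the `penalty` weight are all hypothetical choices made for illustration, and a brute-force search stands in for the annealer.

```python
import itertools
import numpy as np

def build_qubo(importance, redundancy, k, penalty):
    """Upper-triangular Q so that x @ Q @ x (x binary) equals, up to a constant,
    -sum_i importance[i]*x_i + sum_{i<j} redundancy[i,j]*x_i*x_j
    + penalty*(sum_i x_i - k)**2.
    This objective is an illustrative assumption, not the paper's formulation."""
    n = len(importance)
    Q = np.zeros((n, n))
    for i in range(n):
        # Linear terms sit on the diagonal because x_i**2 == x_i for binary x_i.
        Q[i, i] = -importance[i] + penalty * (1 - 2 * k)
        for j in range(i + 1, n):
            # Pairwise redundancy plus the cross term of the cardinality penalty.
            Q[i, j] = redundancy[i, j] + 2 * penalty
    return Q

def brute_force_minimum(Q):
    """Exhaustively search all bitstrings; a classical stand-in for the annealer."""
    n = Q.shape[0]
    best_x, best_energy = None, np.inf
    for bits in itertools.product((0, 1), repeat=n):
        x = np.array(bits)
        energy = x @ Q @ x
        if energy < best_energy:
            best_x, best_energy = x, energy
    return best_x, best_energy

# Hypothetical scores for four feature maps; maps 0 and 2 are assumed redundant.
importance = np.array([0.9, 0.1, 0.8, 0.5])
redundancy = np.zeros((4, 4))
redundancy[0, 2] = redundancy[2, 0] = 0.6

Q = build_qubo(importance, redundancy, k=2, penalty=5.0)
best_x, best_energy = brute_force_minimum(Q)
selected = [i for i, bit in enumerate(best_x) if bit == 1]
print(selected)
```

With these toy numbers the search keeps maps 0 and 3: map 2 scores higher than map 3 on its own, but its assumed overlap with map 0 makes the pair less attractive. The penalty weight has to be large enough that violating the size constraint never pays off.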

Abstract

Deep learning models are used in critical applications where mistakes can have serious consequences, so it is crucial to understand how and why they generate predictions. This understanding helps to check whether a model is learning the right patterns, detect biases in the data, improve model design, and build systems that can be trusted. This work proposes a new method for interpreting Convolutional Neural Networks in image classification tasks. The approach selects the feature maps that contribute most to each prediction. To solve this combinatorial problem, we encode it as a quantum constrained optimization problem and propose to solve it using quantum annealing. We evaluate our method against state-of-the-art explainable AI techniques, specifically GradCAM and GradCAM++, and observe improved class disentanglement, that is, the model's decision boundaries become more distinct and its reasoning more transparent. This demonstrates that our approach enhances the quality of explanations, making it easier to understand which features the model relies on for specific predictions. In addition, we study the computational behavior of the quantum annealing algorithm. Specifically, we analyze the minimum energy gap of the system during computation and the probability that the algorithm finds the correct solution. These analyses provide theoretical insight into why the method works effectively in practice.
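To make the "minimum energy gap" concrete, here is a small numerical sketch under textbook assumptions rather than the paper's actual setup: a standard annealing Hamiltonian H(s) = (1 - s) H_driver + s H_problem with a transverse-field driver, and a hypothetical two-qubit problem whose four classical energies are invented for illustration. The gap between the two lowest eigenvalues is tracked as s goes from 0 to 1.

```python
import numpy as np

def driver_hamiltonian(n):
    """Transverse-field driver -sum_i X_i on n qubits (a common textbook choice)."""
    X = np.array([[0.0, 1.0], [1.0, 0.0]])
    I = np.eye(2)
    H = np.zeros((2**n, 2**n))
    for i in range(n):
        term = np.array([[1.0]])
        for j in range(n):
            term = np.kron(term, X if j == i else I)
        H -= term
    return H

def minimum_gap(H_driver, H_problem, num_steps=201):
    """Scan s in [0, 1] and return (s, gap) where the spectral gap is smallest."""
    s_values = np.linspace(0.0, 1.0, num_steps)
    gaps = []
    for s in s_values:
        # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
        evals = np.linalg.eigvalsh((1 - s) * H_driver + s * H_problem)
        gaps.append(evals[1] - evals[0])
    idx = int(np.argmin(gaps))
    return s_values[idx], gaps[idx]

# Hypothetical classical energies of the four bitstrings 00, 01, 10, 11.
# The ground state (00) is unique, so the gap stays open at s = 1.
problem_energies = np.array([0.0, 1.0, 1.5, 3.0])
H_problem = np.diag(problem_energies)
H_driver = driver_hamiltonian(2)

s_at_min, min_gap = minimum_gap(H_driver, H_problem)
print(f"minimum gap {min_gap:.4f} at s = {s_at_min:.3f}")
```

Under the adiabatic theorem, the anneal time needed to stay in the ground state grows roughly with the inverse square of this minimum gap, which is why a small gap usually goes with a low probability of finding the correct solution. The dense-matrix approach here only scales to a handful of qubits.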