Accumulated Aggregated D-Optimal Designs for Estimating Main Effects in Black-Box Models

arXiv stat.ML / 4/23/2026

Key Points

  • The paper reframes main-effect estimation for black-box models in explainable ML as an experimental design problem, arguing that existing methods differ only in their choice of evaluation locations.
  • It introduces A2D2E, an estimator that uses accumulated aggregated D-optimal hypercube designs to reduce the variance of main-effect estimates and improve robustness.
  • A2D2E is model-agnostic and does not require predictor differentiability, while providing a closed-form estimator with computational complexity comparable to existing approaches.
  • The authors prove that A2D2E is consistent for the same population target as ALE (accumulated local effects), and extend the result to the setting where only a surrogate model is available.
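  • Extensive simulations across multiple predictive models and dependence settings indicate that A2D2E outperforms ALE-based methods, with the largest gains under high feature correlation. The method targets two practical failure modes: sensitivity to out-of-distribution (OOD) evaluations and instability under correlated features.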
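
For context, the ALE baseline that the key points compare against can be sketched roughly as follows. This is a minimal illustration of the standard first-order ALE idea (quantile bins, finite differences between bin edges, accumulation, centering), not the paper's A2D2E estimator. The helper name `ale_main_effect`, the bin count, and the toy model are assumptions made for this example only.

```python
import numpy as np

def ale_main_effect(predict, X, j, n_bins=10):
    """Sketch of a first-order ALE curve for feature j (hypothetical helper)."""
    x = X[:, j]
    # Quantile-based edges aim to keep every bin populated by observed data.
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)))
    n_cells = len(edges) - 1
    # Map each row to its bin; the maximum value is clipped into the last bin.
    cell = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_cells - 1)
    local = np.zeros(n_cells)
    counts = np.zeros(n_cells)
    for k in range(n_cells):
        rows = X[cell == k]
        if rows.shape[0] == 0:
            continue
        lo, hi = rows.copy(), rows.copy()
        lo[:, j], hi[:, j] = edges[k], edges[k + 1]
        # Average finite difference across the bin, with the other
        # features held at their observed values.
        local[k] = np.mean(predict(hi) - predict(lo))
        counts[k] = rows.shape[0]
    # Accumulate the local effects from the lowest edge upward.
    ale = np.concatenate([[0.0], np.cumsum(local)])
    # Center so the count-weighted average over bins is zero.
    mid = 0.5 * (ale[:-1] + ale[1:])
    ale -= np.sum(mid * counts) / np.sum(counts)
    return edges, ale

# Toy check with a hypothetical model whose main effect in feature 0 is 2 * x0.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 2))
toy_model = lambda Z: 2.0 * Z[:, 0] + Z[:, 1] ** 2
edges, ale = ale_main_effect(toy_model, X_demo, j=0)
print(np.column_stack([edges, ale]))
```

In this sketch the model is only queried at the two edges of each bin. Under the paper's framing, that choice of evaluation locations is the design decision that distinguishes one main-effect estimator from another.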

Abstract

Estimating how individual input variables affect the output of a black-box model is a central task in explainable machine learning. However, existing methods suffer from two key limitations: sensitivity to out-of-distribution (OOD) evaluations, which arises when query points are placed far from the data manifold, and instability under feature correlation, which can lead to unreliable effect estimates in practice. We introduce a unified view of main effect estimation as a design problem, which reveals that all existing methods differ only in their choice of evaluation locations. Building on this formulation, we propose A2D2E, an Estimator based on Accumulated Aggregated D-Optimal Designs, which replaces evaluations with a D-optimal hypercube design to minimize the variance of main effect estimation. A2D2E is model-agnostic, requires no differentiability of the predictor, and admits a closed-form estimator with complexity comparable to existing approaches. We establish that A2D2E is consistent to the same population target as ALE, and extend this result to the realistic setting where only a surrogate model is available. Through extensive simulations across multiple predictive models and dependence settings, we demonstrate that A2D2E outperforms ALE-based methods, with the largest gains under high feature correlation.
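
To make the D-optimality criterion concrete, the sketch below compares two sets of evaluation points inside a hypercube for a linear model with an intercept and main effects only. It is not the A2D2E construction, which this summary does not spell out. It only illustrates the general principle that a D-optimal design maximizes the determinant of the information matrix and so shrinks the variance of the estimated effects. The function name `d_criterion`, the dimension `d = 3`, and the cube `[-1, 1]^d` are assumptions chosen for illustration.

```python
import itertools
import numpy as np

def d_criterion(points):
    """Log-determinant of the normalized information matrix X^T X / n
    for an intercept-plus-main-effects linear model (hypothetical helper)."""
    n = points.shape[0]
    X = np.hstack([np.ones((n, 1)), points])
    sign, logdet = np.linalg.slogdet(X.T @ X / n)
    return logdet if sign > 0 else -np.inf

d = 3
# Two-level full factorial: all 2^d corners of the cube [-1, 1]^d.
corners = np.array(list(itertools.product([-1.0, 1.0], repeat=d)))

# The same number of points drawn uniformly inside the cube.
rng = np.random.default_rng(0)
random_pts = rng.uniform(-1.0, 1.0, size=(corners.shape[0], d))

print("corner design :", d_criterion(corners))
print("random design :", d_criterion(random_pts))
```

For the corner design the normalized information matrix is the identity, so its log-determinant is zero, which is the largest value attainable on this cube for a main-effects model. Points placed in the interior carry less information about each slope, which shows up as a smaller determinant and a larger estimation variance.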