Revisit, Extend, and Enhance Hessian-Free Influence Functions

arXiv stat.ML / 3/24/2026


Key Points

  • Influence functions are reviewed as a way to estimate how individual training samples affect model behavior without expensive retraining, using first-order Taylor approximations.
  • The paper explains why directly applying influence functions to deep, non-convex models is difficult (Hessian inversion can be costly or ill-defined) and revisits TracIn as a practical approximation that replaces the inverse Hessian with an identity matrix.
  • It offers theoretical insight into why TracIn’s simple approximation can work well despite the limitations of Hessian-based methods in deep networks.
  • The authors extend TracIn to new evaluation goals including fairness and robustness, and further improve it via an ensemble strategy.
  • Experiments on synthetic data and large-scale evaluations show TracIn’s effectiveness for noisy label detection, selecting subsets for large language model fine-tuning, and defending against adversarial attacks.
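The TracIn idea summarized above (replace the inverse Hessian with the identity, then sum learning-rate-weighted gradient dot products across saved checkpoints) can be sketched in a few lines. This is a minimal NumPy illustration on a toy squared-error linear model, not the paper's implementation; the checkpoint values, learning rates, and sample points are made up for the example.

```python
import numpy as np

# Toy model: L(w; x, y) = 0.5 * (w @ x - y)**2, so grad_w L = (w @ x - y) * x.
def grad(w, x, y):
    return (w @ x - y) * x

def tracin_score(checkpoints, lrs, z_train, z_test):
    """TracIn influence of z_train on z_test: a sum over saved checkpoints
    of learning-rate-weighted gradient dot products. The inverse Hessian of
    classical influence functions is replaced by the identity matrix."""
    x_tr, y_tr = z_train
    x_te, y_te = z_test
    return sum(
        lr * (grad(w, x_tr, y_tr) @ grad(w, x_te, y_te))
        for w, lr in zip(checkpoints, lrs)
    )

# Illustrative: two checkpoints of a 2-parameter model with constant lr.
ckpts = [np.array([0.5, -0.2]), np.array([0.8, 0.1])]
lrs = [0.1, 0.1]
z_train = (np.array([1.0, 2.0]), 1.0)
z_test = (np.array([1.0, 1.5]), 0.8)
score = tracin_score(ckpts, lrs, z_train, z_test)  # positive → z_train is "proponent" of z_test
```

A positive score means the training sample's gradient aligns with the test sample's gradient at the checkpoints, i.e. training steps on it tended to reduce the test loss; the only requirements beyond ordinary training are saved checkpoints and per-sample gradients.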

Abstract

Influence functions serve as crucial tools for assessing sample influence in model interpretation, training-subset selection, noisy label detection, and more. By employing a first-order Taylor expansion, influence functions can estimate sample influence without expensive model retraining. However, applying influence functions directly to deep models presents challenges, primarily due to the non-convexity of the loss function and the large number of model parameters. This difficulty not only makes computing the inverse of the Hessian matrix costly, but can also render the inverse undefined. Various approaches, including matrix decomposition, have been explored to accelerate and approximate Hessian inversion, with the aim of making influence functions applicable to deep models. In this paper, we revisit a specific, albeit naive, yet effective approximation method known as TracIn, which substitutes the identity matrix for the inverse Hessian. We provide deeper insights into why this simple approximation performs well. Furthermore, we extend its applications beyond measuring model utility to include considerations of fairness and robustness. Finally, we enhance TracIn through an ensemble strategy. To validate its effectiveness, we conduct experiments on synthetic data and extensive evaluations on noisy label detection, sample selection for large language model fine-tuning, and defense against adversarial attacks.
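For contrast, the classical Hessian-based influence function (in the style of Koh & Liang) can be computed exactly on a tiny convex problem, where the Hessian of the average squared-error loss is simply (1/n) XᵀX. This NumPy sketch uses synthetic data of my own construction; it is the small-scale baseline whose Hessian solve becomes costly or undefined at deep-network scale.

```python
import numpy as np

# Synthetic convex example: linear least squares, where the Hessian of the
# average loss is (1/n) X.T @ X and can be inverted exactly. In deep,
# non-convex models this inverse is the costly (possibly undefined)
# quantity that TracIn's identity substitution sidesteps.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

theta = np.linalg.lstsq(X, y, rcond=None)[0]  # empirical risk minimizer
H = X.T @ X / len(X)                          # exact Hessian of the average loss

def grad(x, t):
    # Per-sample gradient of 0.5 * (theta @ x - t)**2 at the optimum.
    return (x @ theta - t) * x

# First-order Taylor estimate of the test-loss change from upweighting a
# training point: I(z, z_test) = -grad(z_test)^T H^{-1} grad(z).
x_te, y_te = X[1], y[1]
influence = -grad(x_te, y_te) @ np.linalg.solve(H, grad(X[0], y[0]))

# Sanity check: self-influence is -g^T H^{-1} g <= 0 since H is positive
# semi-definite (upweighting a point cannot increase its own loss estimate).
self_influence = -grad(X[0], y[0]) @ np.linalg.solve(H, grad(X[0], y[0]))
```

Note that `np.linalg.solve` is used rather than forming the explicit inverse; even so, the cost of the Hessian solve scales poorly with parameter count, which is exactly the bottleneck the Hessian-free approximation avoids.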