How to sketch a learning algorithm

arXiv cs.LG / 4/9/2026


Key Points

  • The paper addresses the “data deletion problem,” aiming to efficiently predict how an AI model’s outputs would change if a subset of training data were removed after some precomputation.
  • It proposes a data deletion scheme that achieves vanishing prediction error ε in the deep learning setting, with precomputation and prediction only polynomially slower than standard training and inference.
  • The method’s compute and storage costs scale with poly(1/ε), requiring storing and using a polynomial number of model sketches.
  • The authors base correctness on a new “stability” assumption and argue it is compatible with training strong AI models, supported by limited experiments using microGPT.
  • The technical approach introduces a method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions, which forward-mode automatic differentiation makes cheap to compute; the code is released.

Abstract

How does the choice of training data influence an AI model? This question is of central importance to interpretability, privacy, and basic science. At its core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error ε in the deep learning setting. Our precomputation and prediction algorithms are only poly(1/ε) factors slower than regular training and inference, respectively. The storage requirements are those of poly(1/ε) models. Our proof is based on an assumption that we call "stability." In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.
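To make the final idea concrete, here is a minimal sketch (not the paper's implementation; all names are illustrative) of how forward-mode automatic differentiation yields higher-order derivatives of an arithmetic circuit along a random complex direction. It uses truncated Taylor ("jet") arithmetic, the standard forward-mode machinery: evaluating f at the jet x + v·t makes the k-th Taylor coefficient equal to the k-th directional derivative divided by k!.

```python
import math
import numpy as np

K = 4  # truncation order: recover directional derivatives up to order K

class Jet:
    """Truncated Taylor series sum_k c[k] * t**k with K+1 complex coefficients."""
    def __init__(self, coeffs):
        self.c = np.zeros(K + 1, dtype=complex)
        self.c[:len(coeffs)] = coeffs

    def _lift(self, other):
        return other if isinstance(other, Jet) else Jet([other])

    def __add__(self, other):
        return Jet(self.c + self._lift(other).c)
    __radd__ = __add__

    def __mul__(self, other):
        # Cauchy product of Taylor coefficients, truncated at order K
        return Jet(np.convolve(self.c, self._lift(other).c)[:K + 1])
    __rmul__ = __mul__

def directional_derivatives(f, x, v):
    """k-th derivatives of t -> f(x + t*v) at t = 0, for k = 0..K."""
    coeffs = f(Jet([x, v])).c  # seed the jet with x + v * t
    return np.array([math.factorial(k) * coeffs[k] for k in range(K + 1)])

# Example "circuit": f(u) = u * u * u, i.e. three multiplication gates.
f = lambda u: u * u * u
rng = np.random.default_rng(0)
v = rng.normal() + 1j * rng.normal()  # random complex direction
d = directional_derivatives(f, 2.0, v)
# Analytically, for f(u) = u^3 at x = 2:
#   d[1] = 3 x^2 v = 12 v,  d[2] = 6 x v^2 = 12 v^2,  d[3] = 6 v^3.
```

Each evaluation of the circuit on a jet costs only a poly(K) factor more than an ordinary evaluation, which is the sense in which forward mode makes these higher-order derivatives cheap; how the paper combines many such directional sketches into its deletion scheme is beyond this illustration.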