AI Navigate

Algorithmic Capture, Computational Complexity, and Inductive Bias of Infinite Transformers

arXiv cs.LG / 3/13/2026


Key Points

  • The paper formally defines Algorithmic Capture (grokking an algorithm) as a neural network’s ability to generalize to arbitrary problem sizes with controllable error and minimal sample adaptation, distinguishing true algorithmic learning from statistical interpolation.
  • It analyzes infinite-width transformers in both the lazy and rich regimes and derives upper bounds on the inference-time computational complexity of the functions these networks can learn.
  • The authors show that transformers, despite universal expressivity, have an inductive bias toward low-complexity algorithms within the Efficient Polynomial Time Heuristic Scheme (EPTHS) class, which prevents them from capturing higher-complexity algorithms while succeeding on simpler tasks like search, copy, and sort.
  • The findings shed light on the limits of transformer-based algorithmic generalization and emphasize how inductive biases constrain which algorithms such networks can effectively acquire.
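To make the first bullet's distinction concrete, here is a minimal illustration (our own toy example, not code from the paper): a true sorting algorithm versus a memorizing "statistical interpolator" that only recalls training pairs. The `Memorizer` class and its interface are hypothetical; the point is that memorization tracks the training distribution but fails at larger, unseen problem sizes T.

```python
import random

def true_sort(xs):
    # A genuine algorithm: correct for any problem size T.
    return sorted(xs)

class Memorizer:
    """Statistical interpolation caricature: stores training pairs,
    recalls exact matches, and returns the input unchanged otherwise."""
    def __init__(self):
        self.table = {}

    def fit(self, inputs):
        for xs in inputs:
            self.table[tuple(xs)] = tuple(sorted(xs))

    def predict(self, xs):
        return list(self.table.get(tuple(xs), tuple(xs)))

random.seed(0)
# Train only on short sequences (T = 4).
train = [[random.randint(0, 9) for _ in range(4)] for _ in range(5000)]
memo = Memorizer()
memo.fit(train)

short = train[0]                    # in-distribution problem size
long_ = list(range(63, -1, -1))     # larger T = 64, never seen

print(memo.predict(short) == true_sort(short))  # True: memorized
print(memo.predict(long_) == true_sort(long_))  # False: no length generalization
```

Algorithmic Capture, as the paper defines it, demands the opposite behavior: accuracy that extends to arbitrary T with only minimal adaptation, which no lookup-table interpolator can provide.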

Abstract

We formally define Algorithmic Capture (i.e., "grokking" an algorithm) as the ability of a neural network to generalize to arbitrary problem sizes (T) with controllable error and minimal sample adaptation, distinguishing true algorithmic learning from statistical interpolation. By analyzing infinite-width transformers in both the lazy and rich regimes, we derive upper bounds on the inference-time computational complexity of the functions these networks can learn. We show that despite their universal expressivity, transformers possess an inductive bias towards low-complexity algorithms within the Efficient Polynomial Time Heuristic Scheme (EPTHS) class. This bias effectively prevents them from capturing higher-complexity algorithms, while allowing success on simpler tasks like search, copy, and sort.
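One way to read the definition is as a uniform-in-T error guarantee under a size-independent adaptation budget. The formalization below is our illustrative sketch, not the paper's exact statement; the symbols $\varepsilon$, $\mathcal{D}_T$, and $m(\varepsilon)$ are our notation.

```latex
% Illustrative sketch (not the paper's exact definition): a network f_\theta
% captures an algorithm \mathcal{A} if, for every tolerance \varepsilon > 0
% and every problem size T, adapting \theta on at most m(\varepsilon) samples
% of size T (with m independent of T) yields parameters \theta' satisfying
\[
\forall \varepsilon > 0,\ \forall T:\qquad
\Pr_{x \sim \mathcal{D}_T}\!\left[\, f_{\theta'}(x) \neq \mathcal{A}(x) \,\right]
\;\le\; \varepsilon,
\]
% where \mathcal{D}_T is the input distribution over instances of size T.
```

Under this reading, "controllable error" is the uniform bound $\varepsilon$, and "minimal sample adaptation" is the requirement that $m(\varepsilon)$ not grow with T; statistical interpolation fails the second condition because its sample needs scale with the instance size.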