Transformers Can Learn Rules They've Never Seen: Proof of Computation Beyond Interpolation
arXiv cs.LG / 3/19/2026
Key Points
- The paper tests whether transformers can infer rules absent from their training data, challenging interpolation-only accounts through two controlled experiments.
- Experiment 1 uses a cellular automaton with an XOR update rule and held-out input patterns: similarity-based predictors fail, while a two-layer transformer learns the rule; circuit extraction identifies an XOR computation, with multi-step constraint propagation proving key.
- Experiment 2 studies symbolic operator chains over integers with one operator pair held out and intermediate-step proofs required; across all 49 holdout pairs, the transformer surpasses every interpolation baseline, and its performance degrades without intermediate-step supervision.
- The work also demonstrates that a standard transformer block can implement exact local Boolean rules, providing an existence proof that transformers can learn and compute unseen rule structures, while leaving open when such behavior arises in large-scale training.
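The paper's exact constructions aren't reproduced in this summary, but the two ingredients the key points name can be sketched: an XOR neighbourhood rule for a cellular automaton, and an exact ReLU realisation of XOR of the kind a transformer's feed-forward sublayer could implement. Function names and the specific update rule (each cell becomes the XOR of its two neighbours, Rule 90 style) are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def xor_ca_step(state):
    # One step of an XOR cellular automaton (illustrative, Rule 90 style):
    # each cell becomes the XOR of its left and right neighbours,
    # with periodic boundary conditions.
    return np.roll(state, 1) ^ np.roll(state, -1)

def xor_via_relu(a, b):
    # Exact XOR from a two-layer ReLU network, the shape of a
    # transformer block's feed-forward sublayer:
    #   XOR(a, b) = relu(a + b) - 2 * relu(a + b - 1)   for a, b in {0, 1}
    relu = lambda x: np.maximum(x, 0)
    s = a + b
    return relu(s) - 2 * relu(s - 1)

state = np.array([0, 1, 1, 0, 1], dtype=int)
print(xor_ca_step(state))          # one automaton step
print([xor_via_relu(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
```

The second function is the existence-proof flavour of the result: a fixed, weight-exact piecewise-linear circuit computes the Boolean rule with no approximation, so "a transformer block can implement it" is a constructive claim, not a statistical one.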