Arithmetic OOD Failure Unfolds in Stages in Minimal GPTs
arXiv cs.CL / 3/31/2026
Key Points
- The paper argues that relying on a single arithmetic held-out score can hide qualitatively different out-of-distribution (OOD) failure modes in addition tasks.
- Using a controlled minimal GPT trained on exhaustive 2-digit addition, the authors show that 3-digit generalization breaks down in distinct stages rather than through a single monolithic failure.
- The first failure stage is a layout barrier: the model's dependence on absolute position collapses under a pure 3-digit layout shift, and mixed-layout exposure is the main intervention that weakens it.
- The second stage concerns carry semantics: once layout is addressed, the hundreds position behaves more like a carry flag than a true semantic hundreds digit, a reading supported by targeted carry probes.
- The final stages involve conditional recomposition and residual tens errors; additional experiments, including a sign-aware tens repair, significantly improve exact match on the hardest thousands-carry suite.
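
The first key point, that one pooled held-out score can hide distinct failure modes, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration and not the paper's code: the zero-padded prompt layout, the suite names, and the helpers `format_example`, `suite_of`, `exact_match_by_suite`, and the toy `carry_flag_model`, which only mimics the "hundreds position as carry flag" behavior described above.

```python
from itertools import product


def format_example(a, b, width):
    """Hypothetical layout: zero-padded operands, answer one digit wider."""
    return f"{a:0{width}d}+{b:0{width}d}=", f"{a + b:0{width + 1}d}"


def two_digit_train():
    """Exhaustive 2-digit addition: every pair (a, b) with 0 <= a, b <= 99."""
    return [format_example(a, b, 2) for a, b in product(range(100), repeat=2)]


def suite_of(a, b):
    """Assign a 3-digit problem to an illustrative diagnostic suite."""
    if a + b >= 1000:
        return "thousands_carry"
    if (a % 100) + (b % 100) >= 100:
        return "hundreds_carry"
    return "no_high_carry"


def exact_match_by_suite(predict, problems):
    """Exact-match accuracy reported per suite instead of one pooled number."""
    totals, hits = {}, {}
    for a, b in problems:
        prompt, target = format_example(a, b, 3)
        suite = suite_of(a, b)
        totals[suite] = totals.get(suite, 0) + 1
        hits[suite] = hits.get(suite, 0) + int(predict(prompt) == target)
    return {suite: hits[suite] / totals[suite] for suite in totals}


def carry_flag_model(prompt):
    """Toy stand-in, not a trained GPT: it adds only the low two digits,
    so its hundreds digit is just the carry out of the tens column."""
    a, b = (int(x) for x in prompt[:-1].split("+"))
    return f"{(a % 100) + (b % 100):04d}"


if __name__ == "__main__":
    problems = [(5, 7), (60, 70), (100, 200), (999, 1)]
    print(exact_match_by_suite(carry_flag_model, problems))
```

Under these assumptions, a stand-in that is perfect on 2-digit-sized problems can still score well on some suites and fail entirely on others, which is the kind of difference a single pooled accuracy would average away.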