Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers
arXiv cs.CL / 4/10/2026
Key Points
- The paper investigates implicit reasoning in transformer models—how they combine rules or knowledge within a single forward pass—highlighting that standard transformers often fail at implicit multi-hop composition.
- It proposes recurrent-depth transformers, which reuse the same transformer layers for iterative computation, and tests two compositional generalization settings: systematic generalization and depth extrapolation.
- In controlled experiments with models trained from scratch, recurrent-depth transformers outperform vanilla transformers on both challenges, showing improved compositional generalization over parametric knowledge.
- The authors find that systematic generalization emerges via a three-stage “grokking” process (moving from memorization to in-distribution generalization and then to systematic generalization), supported by mechanistic analysis.
- For depth extrapolation, the study shows generalization to deeper hop counts can be enabled by increasing inference-time recurrence, but also identifies a key failure mode called “overthinking,” where excessive recurrence harms predictions.
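The core mechanism the key points describe can be sketched in a few lines: one set of weights is reused for a variable number of iterations, and the recurrence count can be raised at inference time to handle deeper hop counts. This is a minimal illustrative sketch, not the paper's architecture — a residual MLP stands in for a full transformer layer, and all names and sizes here are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # hidden width (arbitrary for this sketch)

# One set of weights, shared across every recurrence step (weight tying).
W1 = rng.normal(scale=0.1, size=(D, D))
W2 = rng.normal(scale=0.1, size=(D, D))

def block(h):
    """One recurrent step: a residual MLP standing in for a transformer layer."""
    return h + np.tanh(h @ W1) @ W2

def recurrent_depth_forward(x, n_recur):
    """Apply the same weight-tied block n_recur times.

    Raising n_recur at inference time is the depth-extrapolation knob the
    paper studies; too large a value corresponds to the "overthinking"
    failure mode, where extra recurrence degrades predictions.
    """
    h = x
    for _ in range(n_recur):
        h = block(h)
    return h

x = rng.normal(size=(1, D))
shallow = recurrent_depth_forward(x, n_recur=4)   # training-time recurrence
deep = recurrent_depth_forward(x, n_recur=16)     # inference-time extrapolation
```

The point of the sketch is only that depth becomes an inference-time hyperparameter rather than a fixed property of the weights, since every iteration reuses the same parameters.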