
Math needs thinking time, everyday knowledge needs memory, and a new Transformer architecture aims to deliver both

THE DECODER / 3/22/2026


Key Points

  • German researchers propose a Transformer variant that lets the model decide how long to think about a problem, i.e., it adjusts its number of reasoning steps dynamically.
  • When paired with an external memory component, the architecture outperforms larger models on math tasks.
  • The work argues that combining deliberate thinking time with memory enables better math problem solving than standard Transformer architectures allow.
  • If realized at scale, the approach could lead to more efficient models that achieve comparable accuracy with fewer parameters or less compute.

A German research team lets Transformer models decide for themselves how many reasoning steps to spend on a problem. Combined with an additional external memory, the approach outperforms larger models on math problems.
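To make the two ingredients concrete, here is a minimal, hypothetical sketch of what "deciding how long to think" plus an external memory could look like in PyTorch. It assumes an ACT-style halting signal that re-applies a shared Transformer layer until the model signals it is done, and a learned key-value store queried by attention; all class names, parameters, and design details are illustrative assumptions, not the researchers' actual architecture.

```python
# Illustrative sketch only: adaptive "thinking time" via a halting signal,
# plus a simple external key-value memory. Not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveDepthBlock(nn.Module):
    """Re-applies one shared Transformer layer until a learned halting
    signal crosses a threshold (ACT-style pondering, assumed here)."""

    def __init__(self, d_model=256, nhead=4, max_steps=8, halt_threshold=0.99):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.halt = nn.Linear(d_model, 1)  # per-token halting probability
        self.max_steps = max_steps
        self.halt_threshold = halt_threshold

    def forward(self, x):
        # Accumulate halting probability; stop early once every token
        # has crossed the threshold, i.e., the model has "thought" enough.
        cum_halt = torch.zeros(x.shape[:2], device=x.device)
        for _ in range(self.max_steps):
            x = self.layer(x)
            cum_halt = cum_halt + torch.sigmoid(self.halt(x)).squeeze(-1)
            if (cum_halt >= self.halt_threshold).all():
                break
        return x


class ExternalMemory(nn.Module):
    """A learned key-value store read by attention, standing in for the
    external memory component that holds 'everyday knowledge'."""

    def __init__(self, d_model=256, num_slots=1024):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_slots, d_model))
        self.values = nn.Parameter(torch.randn(num_slots, d_model))

    def forward(self, x):
        attn = F.softmax(x @ self.keys.t() / x.shape[-1] ** 0.5, dim=-1)
        return x + attn @ self.values  # residual read from memory


# Usage: memory lookup supplies stored knowledge, adaptive depth supplies
# variable reasoning steps.
x = torch.randn(2, 16, 256)      # (batch, tokens, d_model)
x = ExternalMemory()(x)
x = AdaptiveDepthBlock()(x)
```

The intuition, as described in the article, is a division of labor: recallable facts come from the memory lookup, while harder math problems get more iterations of the shared layer before the model commits to an answer.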
