A transformer architecture alteration to incentivise externalised reasoning

arXiv cs.AI / 3/24/2026


Key Points

  • The study proposes a new transformer architecture modification (an early-exit mechanism at intermediate layers) and a post-training pipeline for making LLMs more "verbose" reasoners.
  • The model learns to truncate its forward pass early, exiting at shallower layers when the next token can be predicted without deep computation.
  • After a calibration stage, reinforcement learning incentivises the model to exit as early as possible while maintaining task performance.
  • Preliminary results on small reasoning models show the models learn to adaptively reduce computation per token.
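The per-token early-exit behaviour described above can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the layer functions, the shared exit head, and the confidence threshold are all assumptions introduced here for clarity.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a plain list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_forward(hidden, layers, exit_head, threshold=0.9):
    """Run transformer layers in order, but stop as soon as the
    exit head's top-1 probability exceeds `threshold`.

    Returns (probs, depth): the exit distribution and the number
    of layers actually computed for this token.
    """
    for depth, layer in enumerate(layers, start=1):
        hidden = layer(hidden)
        probs = softmax(exit_head(hidden))
        if max(probs) >= threshold:
            # Shallow exit: the next token was easy to predict.
            return probs, depth
    # Full depth: the token genuinely needed deep computation.
    return probs, len(layers)
```

Easy tokens exit after a few layers; hard tokens consume the full stack, which is the adaptive per-token compute reduction the key points describe.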

Abstract

We propose a new architectural change and post-training pipeline for making LLMs more verbose reasoners by teaching a model to truncate forward passes early. We augment an existing transformer architecture with an early-exit mechanism at intermediate layers and train the model to exit at shallower layers when the next token can be predicted without deep computation. After a calibration stage, we incentivise the model to exit as early as possible while maintaining task performance using reinforcement learning. We provide preliminary results to this effect for small reasoning models, showing that they learn to adaptively reduce computation across tokens. We predict that, applied at the right scale, our approach can minimise the amount of excess computation that reasoning models have at their disposal to perform non-myopic planning using their internal activations, reserving this only for difficult-to-predict tokens.
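The RL stage described in the abstract trades task performance against exit depth. The paper does not give a reward formula, but one hypothetical shape is a task reward minus a penalty on the mean relative exit depth across generated tokens (the `penalty` coefficient and the normalisation by total layer count are assumptions for illustration):

```python
def exit_reward(task_reward, exit_depths, num_layers, penalty=0.5):
    """Hypothetical RL reward: preserve task success while
    penalising the average fraction of layers used per token.

    task_reward : scalar task outcome (e.g. 1.0 if correct, 0.0 if not)
    exit_depths : list of per-token exit depths from the rollout
    num_layers  : total depth of the model
    penalty     : assumed trade-off coefficient (not from the paper)
    """
    mean_depth = sum(exit_depths) / len(exit_depths)
    return task_reward - penalty * (mean_depth / num_layers)
```

Under this shaping, a rollout that answers correctly while exiting early scores higher than one that answers correctly at full depth, which is the "exit as early as possible while maintaining task performance" incentive.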