LLM Reasoning as Trajectories: Step-Specific Representation Geometry and Correctness Signals

arXiv cs.CL / 4/8/2026


Key Points

  • The paper models LLM chain-of-thought as a geometric “trajectory” through representation space, where reasoning steps occupy ordered, step-specific subspaces.
  • It finds that such step-structured organization is present in base models, and reasoning training mostly speeds convergence toward termination-related regions rather than creating entirely new representational structure.
  • Correct vs. incorrect solutions diverge systematically at late reasoning stages, enabling mid-reasoning prediction of final-answer correctness with ROC-AUC as high as 0.87.
  • The work proposes trajectory-based steering, an inference-time intervention that nudges hidden states toward derived ideal trajectories, enabling reasoning correction and control over output length.
  • Overall, it positions reasoning trajectories as a framework for interpreting, predicting, and controlling how LLMs perform multi-step reasoning.
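The paper does not specify its probing method here, but the reported mid-reasoning ROC-AUC is the kind of number a simple linear probe on hidden states can produce. The sketch below is purely illustrative, on synthetic data: it scores "hidden states" by projecting them onto a mean-difference direction between correct and incorrect traces, then computes ROC-AUC via the rank (Mann-Whitney) formulation. All dimensions, sample counts, and the separation strength are invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64   # toy hidden-state dimension (assumption, not from the paper)
n = 200  # toy number of reasoning traces per class

# Synthetic late-step hidden states: correct vs. incorrect traces differ
# along one latent direction, plus isotropic noise.
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
h_correct = rng.normal(size=(n, d)) + direction
h_incorrect = rng.normal(size=(n, d)) - direction

# Mean-difference probe: score each state by its projection onto the
# difference of class means (a standard linear-probe baseline).
probe = h_correct.mean(0) - h_incorrect.mean(0)
scores = np.concatenate([h_correct @ probe, h_incorrect @ probe])
labels = np.concatenate([np.ones(n), np.zeros(n)])

# ROC-AUC from ranks: P(score of a correct trace > score of an incorrect one).
order = scores.argsort()
ranks = np.empty(len(scores))
ranks[order] = np.arange(1, len(scores) + 1)
auc = (ranks[labels == 1].sum() - n * (n + 1) / 2) / (n * n)
print(f"ROC-AUC: {auc:.2f}")
```

With well-separated classes the probe's AUC lands well above chance, mirroring the qualitative claim that late-stage representations carry a readable correctness signal.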

Abstract

This work characterizes large language models' chain-of-thought generation as a structured trajectory through representation space. We show that mathematical reasoning traverses functionally ordered, step-specific subspaces that become increasingly separable with layer depth. This structure already exists in base models, while reasoning training primarily accelerates convergence toward termination-related subspaces rather than introducing new representational organization. While early reasoning steps follow similar trajectories, correct and incorrect solutions diverge systematically at late stages. This late-stage divergence enables mid-reasoning prediction of final-answer correctness with ROC-AUC up to 0.87. Furthermore, we introduce trajectory-based steering, an inference-time intervention framework that enables reasoning correction and length control based on derived ideal trajectories. Together, these results establish reasoning trajectories as a geometric lens for interpreting, predicting, and controlling LLM reasoning behavior.
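The abstract describes trajectory-based steering only at a high level. A common way such interventions are implemented in the interpretability literature is additive activation steering: shift a hidden state along a chosen unit direction by a strength coefficient. The minimal sketch below assumes that form; the "termination" direction, the vector sizes, and the strength `alpha` are all hypothetical stand-ins, not the paper's actual implementation.

```python
import numpy as np

def steer(hidden: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Shift a hidden state along a unit steering direction by strength alpha."""
    unit = direction / np.linalg.norm(direction)
    return hidden + alpha * unit

rng = np.random.default_rng(1)
h = rng.normal(size=8)            # toy hidden state
term_dir = rng.normal(size=8)     # hypothetical "termination-subspace" direction
steered = steer(h, term_dir, alpha=3.0)

# By construction, the projection onto the steering direction grows by alpha,
# while components orthogonal to it are untouched.
unit = term_dir / np.linalg.norm(term_dir)
print(round((steered - h) @ unit, 6))
```

Under this additive scheme, length control falls out of the same mechanism: steering toward a termination-related direction earlier (or with larger `alpha`) would bias generation toward wrapping up sooner.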