Short Chains, Deep Thoughts: Balancing Reasoning Efficiency and Intra-Segment Capability via Split-Merge Optimization
arXiv cs.CL / 5/4/2026
Key Points
- The paper argues that long, verbose reasoning chains in Large Reasoning Models cause major latency and compute costs, and proposes addressing redundancy rather than simply limiting token length.
- It introduces CoSMo (Consistency-Guided Split-Merge Optimization), a split-merge algorithm that dynamically merges redundant reasoning segments and splits segments where logical gaps appear, preserving coherence.
- The authors use structure-aligned reinforcement learning with a new segment-level budget to train models to maintain efficient reasoning structures over time.
- Experiments across multiple benchmarks and model backbones show CoSMo improves accuracy by 3.3 points while reducing segment usage by 28.7% on average versus reasoning-efficiency baselines.
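The split-merge idea described above can be illustrated with a toy sketch. This is not the paper's CoSMo algorithm: the merge criterion here is a hypothetical word-overlap (Jaccard) similarity with an arbitrary threshold, standing in for the learned consistency signal, and "split" is modeled simply as keeping the boundary between dissimilar neighbors.

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two segments (a stand-in
    for the paper's consistency signal, not its actual criterion)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def split_merge(segments, merge_thresh=0.6):
    """One greedy pass: merge adjacent segments whose overlap exceeds
    merge_thresh (treated as redundant); keep the boundary between
    dissimilar neighbors (treated as a logical gap)."""
    merged = []
    for seg in segments:
        if merged and jaccard(merged[-1], seg) >= merge_thresh:
            merged[-1] = merged[-1] + " " + seg  # redundant: merge
        else:
            merged.append(seg)  # logical gap: keep segment boundary
    return merged

chain = [
    "compute the sum of 2 and 3",
    "the sum of 2 and 3 is computed",  # near-duplicate of previous step
    "now multiply by 4",               # genuinely new step
]
print(split_merge(chain))  # the two redundant steps collapse into one
```

In the paper, deciding where to merge or split is learned via structure-aligned reinforcement learning under a segment-level budget, rather than thresholded on surface overlap as in this sketch.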