Caterpillar of Thoughts: The Optimal Test-Time Algorithm for Large Language Models

arXiv cs.LG / 2026/3/25


Key Points

The authors show that the optimal test-time strategy always produces a "caterpillar tree," a state tree that reduces to a single path once its leaves are removed. They introduce "Caterpillar of Thoughts (CaT)," which in their experiments achieves a higher success rate than Tree-of-Thoughts (ToT) while generating fewer tokens/states.
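The caterpillar condition described above can be checked mechanically. The following is a minimal sketch, not code from the paper: it assumes the state tree is supplied as a list of undirected edges that already form a tree, and it uses the standard graph-theoretic reading of "leaf" (a vertex of degree 1). The function name `is_caterpillar` is hypothetical.

```python
from collections import defaultdict


def is_caterpillar(edges):
    """Return True if the tree given by `edges` is a caterpillar.

    Assumption: `edges` describes a valid undirected tree (connected, acyclic).
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # The "spine" is what remains after deleting every leaf (degree-1 vertex).
    spine = {v for v in adj if len(adj[v]) > 1}
    if len(spine) <= 1:
        return True  # empty tree, single edge, or star

    # Removing the leaves of a tree leaves a connected subtree, and a tree
    # whose vertices all have degree <= 2 is a path.
    return all(len(adj[v] & spine) <= 2 for v in spine)


# A path with extra leaves hanging off it is a caterpillar.
caterpillar = [(0, 1), (1, 2), (0, "a"), (1, "b"), (1, "c"), (2, "d")]
# Three legs of length two from a common center is not.
spider = [("c", "a1"), ("a1", "a2"), ("c", "b1"), ("b1", "b2"), ("c", "d1"), ("d1", "d2")]
print(is_caterpillar(caterpillar), is_caterpillar(spider))
```

In the spider example, deleting the three outer leaves leaves a center with three neighbors, so the remainder is not a path.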

Abstract

Large language models (LLMs) can often produce substantially better outputs when allowed to use additional test-time computation, such as sampling, chain of thought, backtracking, or revising partial solutions. Despite the growing empirical success of such techniques, there is limited theoretical understanding of how inference time computation should be structured, or what constitutes an optimal use of a fixed computation budget. We model test-time computation as an algorithm interacting with a Markov chain: at any point, the algorithm may resume generation from any previously observed state. That is, unlike standard Markov chains where the states are drawn passively, we allow the algorithm to backtrack to any previously observed state of the Markov chain at any time. Many of the existing test-time algorithms, such as Chain-of-Thought (CoT) (Wei et al., 2023), Tree-of-Thoughts (ToT) (Yao et al., 2023), or Best-of-k (Brown et al., 2024) could be seen as specific algorithms in this model. We prove that while backtracking can reduce the number of generations exponentially, a very limited form of backtracking is theoretically sufficient. Namely, we show that the optimal algorithm always generates a caterpillar tree. That is, if we remove the leaves of the state tree generated by the optimal algorithm, we obtain a path. Motivated by our characterization of the optimal algorithm, we present Caterpillar of Thoughts (CaT), a new test-time computation algorithm, reducing the number of token/state generations. Our empirical evaluation shows that CaT, compared to ToT, achieves a better success rate while also reducing the number of token generations.
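To make the interaction model in the abstract concrete, here is a small illustrative sketch. It is not the paper's formalism or the CaT algorithm: the class `BacktrackOracle`, the toy transition `toy_step`, and the two strategy functions are all hypothetical names chosen for this example. The oracle lets an algorithm resume generation from any previously observed state and counts each generation against a budget; Chain-of-Thought and Best-of-k then appear as two particular ways of choosing where to resume.

```python
import random


class BacktrackOracle:
    """Toy stand-in for the model: sample a successor of any state seen so far."""

    def __init__(self, step, root):
        self.step = step          # step(state) -> randomly sampled next state
        self.parent = {0: None}   # node id -> parent node id (the state tree)
        self.state = {0: root}    # node id -> state value
        self.generations = 0      # number of generations used so far

    def expand(self, node):
        """Resume generation from an observed node and return the new node id."""
        if node not in self.state:
            raise KeyError("can only resume from a previously observed state")
        child = len(self.state)
        self.state[child] = self.step(self.state[node])
        self.parent[child] = node
        self.generations += 1
        return child


def chain_of_thought(oracle, depth):
    """No backtracking: always extend the most recently generated node."""
    node = 0
    for _ in range(depth):
        node = oracle.expand(node)
    return node


def best_of_k(oracle, k, depth):
    """Backtrack only to the root: k independent chains of the same depth."""
    return [chain_of_thought(oracle, depth) for _ in range(k)]


_rng = random.Random(0)


def toy_step(state):
    # Hypothetical chain: advance by one with probability 0.5, otherwise stay.
    return state + 1 if _rng.random() < 0.5 else state


oracle = BacktrackOracle(toy_step, root=0)
finals = best_of_k(oracle, k=3, depth=4)
print(oracle.generations)  # 3 chains x 4 steps = 12 generations
```

Here `oracle.parent` records the state tree the strategy produced. Best-of-k yields k paths sharing only the root, whereas the abstract's result says an optimal strategy never needs a tree more complex than a caterpillar.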

