The Library Theorem: How External Organization Governs Agentic Reasoning Capacity

arXiv cs.AI, 2026-03-24


Key Points

  • The paper studies “externalized” agent reasoning by treating the transformer context window as an I/O page and analyzing how structured retrieval via indexed external memory scales versus sequential scanning.
  • It proves that tool-augmented agents with reasoning-state indexing reduce retrieval cost from linear ($\Omega(N)$) to logarithmic ($O(\log_b N)$) page reads per query, and cumulative cost over $T$ reasoning steps from quadratic to near-linear (see the sketch after this list).
  • Controlled benchmarks across multiple content types and store sizes show the predicted gains, including an abstract-content regime where the indexed agent reads a median of one page regardless of store size.
  • The experiments reveal two failure modes: without an index, models struggle to sustain efficient search at scale (even the stronger model's near-optimal binary search over sorted pages loses to the index by $5\times$), and on familiar encyclopedic content, models bypass the retrieval protocol and answer from parametric memory, driving excessive token usage even when the index works.
  • The authors argue for a separation of concerns: use language models to construct the index, where semantic understanding helps, and deterministic algorithms to traverse it, avoiding the protocol-skipping that semantic familiarity invites.
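To make the per-query gap concrete, here is a minimal Python sketch, not the authors' code: it splits a sorted store into fixed-size "pages", builds a b-ary index bottom-up (the half the paper assigns to the language model), and traverses it deterministically (the half it assigns to algorithms), counting page reads for each strategy. `PAGE_SIZE`, the fanout `B`, and every function name here are illustrative assumptions, not details from the paper.

```python
# Minimal sketch (not the authors' code): count context-window "page" reads
# for three retrieval strategies over a sorted store, mirroring the paper's
# I/O framing. PAGE_SIZE, the fanout B, and all names here are assumptions.
from bisect import bisect_right

PAGE_SIZE = 32  # items per "page" (illustrative)
B = 32          # index branching factor b (illustrative)

def paginate(items):
    """Split a sorted item list into fixed-size pages."""
    return [items[i:i + PAGE_SIZE] for i in range(0, len(items), PAGE_SIZE)]

def scan_reads(pages, key):
    """Sequential scan: read pages in order until the key appears. Omega(N)."""
    for reads, page in enumerate(pages, start=1):
        if key in page:
            return reads
    return len(pages)

def binary_reads(pages, key):
    """Sorted pages, no index: binary search over pages, ~log2(N/PAGE_SIZE)."""
    lo, hi, reads = 0, len(pages) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        reads += 1  # each probe reads one page
        if key < pages[mid][0]:
            hi = mid - 1
        elif key > pages[mid][-1]:
            lo = mid + 1
        else:
            return reads
    return reads

def build_index(pages):
    """The LM's half of the paper's split: build a b-ary index bottom-up.
    Each node is one page of (separator_key, child_position) pairs."""
    keys = [page[0] for page in pages]          # smallest key per leaf
    children = list(range(len(pages)))
    levels = []
    while len(children) > 1:
        nodes = [list(zip(keys[i:i + B], children[i:i + B]))
                 for i in range(0, len(children), B)]
        levels.append(nodes)
        keys = [node[0][0] for node in nodes]
        children = list(range(len(nodes)))
    return levels                                # levels[-1] holds the root

def indexed_reads(levels, pages, key):
    """The deterministic half: walk root-to-leaf, one page read per node,
    plus one read for the leaf. O(log_b N) total."""
    reads, pos = 0, 0
    for nodes in reversed(levels):
        node = nodes[pos]
        reads += 1                               # read this index page
        i = bisect_right([k for k, _ in node], key) - 1
        pos = node[max(i, 0)][1]                 # descend to chosen child
    return reads + 1                             # final leaf read

for n in (50, 500, 5_000):                       # the paper's store-size range
    pages = paginate(list(range(n)))
    levels = build_index(pages)
    key = n - 1                                  # worst case for the scan
    print(f"N={n:>5}: scan={scan_reads(pages, key):>4}  "
          f"binary={binary_reads(pages, key):>2}  "
          f"indexed={indexed_reads(levels, pages, key)}")
```

At store sizes 50/500/5,000 this prints scan reads of 2/16/157, binary reads of 2/5/8, and indexed reads of 2/2/3: the linear, logarithmic, and near-constant shapes the theorem predicts, though the constants differ from the paper's setup, whose indexed agent reports a median of one read, presumably because the index root already sits in the agent's context.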

Abstract

Externalized reasoning is already exploited by transformer-based agents through chain-of-thought, but structured retrieval -- indexing over one's own reasoning state -- remains underexplored. We formalize the transformer context window as an I/O page and prove that tool-augmented agents with indexed external memory achieve exponentially lower retrieval cost than agents restricted to sequential scanning: $O(\log_b N)$ versus $\Omega(N)$ page reads per query, and $O(T \log_b T)$ versus $\Theta(T^2)$ cumulative cost over $T$ reasoning steps -- a gap that widens as deliberation deepens. We test these predictions on a controlled lookup benchmark across three content types -- random hashes, ordered integers, and encyclopedia entries -- varying store size from 50 to 5,000 items, and replicate key conditions across two model generations (GPT-4o-mini and GPT-5.4). On abstract content, the indexed agent achieves a median of 1 page read regardless of store size, confirming the $O(1)$ prediction. Sorted pages without an index fail to close the gap: the weaker model cannot sustain binary search at scale, and the stronger model achieves near-optimal $\log_2 N$ search but still loses to the index by $5\times$. On familiar content (encyclopedia entries), a competing failure mode emerges: the model recognizes the domain, bypasses the retrieval protocol, and generates answers from parametric memory, producing catastrophic token expenditure even when the index is sound. This parametric memory competition dissociates the two cognitive operations that indexing combines: understanding content (where language models excel) and following navigational protocols (where they fail when understanding tempts them to shortcut). The result argues for a separation of concerns: use language models for index construction, where semantic understanding helps, and deterministic algorithms for index traversal, where it hurts.
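The cumulative gap follows in one step, under the assumption (implicit in the abstract's framing) that the reasoning store grows roughly linearly with steps, so the query at step $t$ searches $\Theta(t)$ items:

$$\text{sequential: } \sum_{t=1}^{T} \Theta(t) = \Theta(T^2), \qquad \text{indexed: } \sum_{t=1}^{T} O(\log_b t) = O(T \log_b T).$$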
