Why AI Coding Agents Waste Half Their Context Window

Reddit r/LocalLLaMA / 3/12/2026

💬 Opinion · Ideas & Deep Analysis · Tools & Practical Usage

Key Points

  • The author observes that AI coding agents waste a large portion of their context window on orientation tasks (searching for routes, middleware, and types) before they start coding.
  • They reference Liu et al.'s "Lost in the Middle" to explain that models perform best early in the context window, so lengthy orientation degrades later coding quality.
  • The problem is framed as a hill-climbing process where the agent must incrementally gather knowledge, since it cannot know what it doesn’t know.
  • The proposed fix is restructuring codebase documentation into a three-layer hierarchy (task-to-doc index, intent-based directories, and appropriately sized reference material) to enable 1-3 tool calls for orientation instead of 20.
  • The result is a reduction of orientation context usage from about 20-40% down to under 10%, with the author offering to discuss setup details.

I've been running AI coding agents on a large codebase for months and noticed something that bugged me. Every time I gave an agent a task like "add a new API endpoint," it would spend 15-20 tool calls just figuring out where things are: grepping for routes, reading middleware files, checking types, reading more files. By the time it actually started writing code, it had already burned through a huge chunk of its context window.

Then I found out how much context position really matters. There's research (Liu et al., "Lost in the Middle") showing that models like Llama and Claude reason much more strongly over content near the start of their context window. So all that searching and file reading happens while the model is sharpest, and the actual coding happens later, after attention has degraded. I've seen the same model produce noticeably worse code after 20 orientation calls than after 3.

I started thinking about this as a hill-climbing problem from optimization theory. The agent starts at the bottom with zero context, takes one step (grep), evaluates, takes another step (read file), evaluates again, and repeats until it has enough understanding to act. It can't skip steps because it doesn't know what it doesn't know.
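The loop above can be sketched in a few lines. This is a minimal illustration, not the actual agent code; the function names, the "enough context" predicate, and the toy step sequence are all invented for the example:

```python
# Hypothetical sketch of the hill-climbing orientation loop: the agent
# starts with zero context and takes one tool call at a time -- grep,
# read file, re-evaluate -- until it judges it knows enough to act.

def orient(task, take_step, enough_context, max_calls=20):
    """Accumulate context one step at a time; return (context, calls used)."""
    context = []
    for calls in range(1, max_calls + 1):
        context.append(take_step(task, context))  # one tool call (grep/read)
        if enough_context(context):               # evaluate: can we act yet?
            return context, calls
    return context, max_calls

# Toy run: each "step" yields one fact; here we pretend 3 facts are enough.
facts = iter(["route table found", "middleware read", "types checked"])
ctx, used = orient("add endpoint", lambda t, c: next(facts), lambda c: len(c) >= 3)
```

The key property is that `take_step` can only be chosen from what is already in `context`, which is why the agent can't jump straight to the answer: each step only becomes visible after the previous one lands.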

I was surprised that the best fix wasn't better prompts or agent configs. It was restructuring the codebase documentation into a three-layer hierarchy that an agent can navigate in 1-3 tool calls instead of 20: an index file that maps tasks to docs, searchable directories organized by intent, and right-sized reference material at each depth.
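To make the first layer concrete, here is a hypothetical sketch of what the task-to-doc index buys you. The index contents and file paths are invented for illustration; the point is that orientation becomes a single lookup plus one doc read, rather than an open-ended search:

```python
# Layer 1: an index file mapping task keywords to the right doc.
# All paths and keys below are made up for the example.
TASK_INDEX = {
    "add api endpoint": "docs/how-to/add-endpoint.md",
    "add middleware":   "docs/how-to/add-middleware.md",
    "change db schema": "docs/reference/schema.md",
}

def orient_via_index(task):
    """Resolve a task to its doc in one lookup instead of ~20 greps."""
    for key, doc in TASK_INDEX.items():  # tool call 1: read the index
        if key in task.lower():
            return doc                   # tool call 2: read this one doc
    return "docs/README.md"              # fall back to the top-level map

doc = orient_via_index("Add API endpoint for user export")
```

Layers 2 and 3 (intent-based directories like `how-to/` vs `reference/`, and docs sized to be readable in one call) keep that second read cheap, so the whole orientation fits in 1-3 calls.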

I've gone from 20-40% of context spent on orientation to under 10%, consistently.

Happy to answer questions about the setup or local model specific details.

submitted by /u/notadamking