Why AI Coding Agents Waste Half Their Context Window

Reddit r/LocalLLaMA / 3/12/2026

💬 Opinion · Ideas & Deep Analysis · Tools & Practical Usage

Key Points

  • The author observes that AI coding agents waste a large portion of their context window on orientation tasks (searching for routes, middleware, and types) before they start coding.
  • They reference Liu et al.'s "Lost in the Middle" to explain that models perform best early in the context window, so lengthy orientation degrades later coding quality.
  • The problem is framed as a hill-climbing process where the agent must incrementally gather knowledge, since it cannot know what it doesn’t know.
  • The proposed fix is restructuring codebase documentation into a three-layer hierarchy (task-to-doc index, intent-based directories, and appropriately sized reference material) to enable 1-3 tool calls for orientation instead of 20.
  • The result is a reduction of orientation context usage from about 20-40% down to under 10%, with the author offering to discuss setup details.

I've been running AI coding agents on a large codebase for months and noticed something that bugged me. Every time I gave an agent a task like "add a new API endpoint," it would spend 15-20 tool calls just figuring out where things are: grepping for routes, reading middleware files, checking types, reading more files. By the time it actually started writing code, it had already burned through a huge chunk of its context window.

Then I found out how much context position really matters. There's research (Liu et al., "Lost in the Middle") showing that models like Llama and Claude reason much more strongly over content near the start of their context window. So all that searching and file reading happens while the model is sharpest, and the actual coding happens later, after attention has degraded. I've seen the same model produce noticeably worse code after 20 orientation calls than after 3.

I started thinking about this as a hill-climbing problem from optimization theory. The agent starts at the bottom with zero context, takes one step (grep), evaluates, takes another step (read file), evaluates again, and repeats until it has enough understanding to act. It can't skip steps because it doesn't know what it doesn't know.
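The loop above can be sketched in a few lines. This is a minimal illustration, not the actual agent code; the function names, the "enough context" predicate, and the toy step sequence are all invented for the example:

```python
# Hypothetical sketch of the hill-climbing orientation loop: the agent
# starts with zero context and takes one tool call at a time -- grep,
# read file, re-evaluate -- until it judges it knows enough to act.

def orient(task, take_step, enough_context, max_calls=20):
    """Accumulate context one step at a time; return (context, calls used)."""
    context = []
    for calls in range(1, max_calls + 1):
        context.append(take_step(task, context))  # one tool call (grep/read)
        if enough_context(context):               # evaluate: can we act yet?
            return context, calls
    return context, max_calls

# Toy run: each "step" yields one fact; here we pretend 3 facts are enough.
facts = iter(["route table found", "middleware read", "types checked"])
ctx, used = orient("add endpoint", lambda t, c: next(facts), lambda c: len(c) >= 3)
```

The key property is that `take_step` can only be chosen from what is already in `context`, which is why the agent can't jump straight to the answer: each step only becomes visible after the previous one lands.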

I was surprised that the best fix wasn't better prompts or agent configs. It was restructuring the codebase documentation into a three-layer hierarchy that an agent can navigate in 1-3 tool calls instead of 20: an index file that maps tasks to docs, searchable directories organized by intent, and right-sized reference material at each depth.
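To make the first layer concrete, here is a hypothetical sketch of what the task-to-doc index buys you. The index contents and file paths are invented for illustration; the point is that orientation becomes a single lookup plus one doc read, rather than an open-ended search:

```python
# Layer 1: an index file mapping task keywords to the right doc.
# All paths and keys below are made up for the example.
TASK_INDEX = {
    "add api endpoint": "docs/how-to/add-endpoint.md",
    "add middleware":   "docs/how-to/add-middleware.md",
    "change db schema": "docs/reference/schema.md",
}

def orient_via_index(task):
    """Resolve a task to its doc in one lookup instead of ~20 greps."""
    for key, doc in TASK_INDEX.items():  # tool call 1: read the index
        if key in task.lower():
            return doc                   # tool call 2: read this one doc
    return "docs/README.md"              # fall back to the top-level map

doc = orient_via_index("Add API endpoint for user export")
```

Layers 2 and 3 (intent-based directories like `how-to/` vs `reference/`, and docs sized to be readable in one call) keep that second read cheap, so the whole orientation fits in 1-3 calls.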

I've gone from 20-40% of context spent on orientation to under 10%, consistently.

Happy to answer questions about the setup or local model specific details.

submitted by /u/notadamking