One night I hit the token limit with Codex and realized most of the cost was coming from context reloading, not actual work.
So I started experimenting with a small context engine around it:

- persistent memory
- context planning
- failure tracking
- task-specific memory
- and eventually domain “mods” (UX, frontend, etc.)

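
To give a rough idea of how the memory, failure tracking, and context planning pieces can fit together, here is a minimal Python sketch. It is not the code from the repo: `ContextEngine`, `Note`, `estimate_tokens`, and the 4-characters-per-token estimate are all made-up names and assumptions, just to illustrate the idea.

```python
import json
from dataclasses import dataclass, field, asdict


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    # A real tokenizer would give a more accurate count.
    return max(1, len(text) // 4)


@dataclass
class Note:
    task: str          # which task this note belongs to (task-specific memory)
    text: str
    priority: int = 0  # higher priority is packed into the context first


@dataclass
class ContextEngine:
    notes: list = field(default_factory=list)
    failures: dict = field(default_factory=dict)  # task -> list of failure notes

    def remember(self, task, text, priority=0):
        self.notes.append(Note(task, text, priority))

    def record_failure(self, task, what):
        self.failures.setdefault(task, []).append(what)

    def plan(self, task, budget):
        """Pick notes for `task` that fit in `budget` tokens, highest priority first."""
        # Past failures go first so the same mistake is not repeated.
        candidates = [
            Note(task, "FAILED BEFORE: " + f, priority=100)
            for f in self.failures.get(task, [])
        ]
        candidates += [n for n in self.notes if n.task == task]
        candidates.sort(key=lambda n: -n.priority)

        chosen, used = [], 0
        for n in candidates:
            cost = estimate_tokens(n.text)
            if used + cost <= budget:  # skip anything that would blow the budget
                chosen.append(n.text)
                used += cost
        return chosen

    def save(self, path):
        # Persist memory between sessions as plain JSON.
        with open(path, "w", encoding="utf-8") as f:
            json.dump(
                {"notes": [asdict(n) for n in self.notes], "failures": self.failures},
                f,
            )

    @classmethod
    def load(cls, path):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        return cls([Note(**n) for n in data["notes"]], data["failures"])
```

The point of `plan` is that only notes for the current task, capped by a token budget, get sent with each request, instead of reloading everything every time.
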
By the end, it felt less like using an assistant and more like working with a small dev team.
The article goes through all the iterations (some of them a bit chaotic, not gonna lie).
Curious to hear how others here are dealing with context / token usage when vibe coding.
Repo here if anyone wants to dig into it: here
