Session Budget Check skill.md and how it could save usage and costs.

Dev.to / 4/7/2026

💬 Opinion · Tools & Practical Usage

Key Points

  • The article explains why Claude Code users can quickly hit usage limits (“usage limit reached”) due to hidden cost drawdowns from initial prompts and parallel subagent processing.
  • It proposes a session workflow add-on (“skill.md” named session-budget-check) that prompts the system to verify both API token budget and the current session context window before running multi-task plans.
  • The skill’s guidance focuses on preventing mid-execution failures by checking token_tracker.json for API spend tracking and separately confirming remaining context capacity for the session.
  • It recommends running the check proactively before starting plans with 3+ tasks, spawning 2+ subagents, or after the session has already produced multiple large agent outputs.
  • The author claims the skill produces “an immediate difference” in cost/usage outcomes for paid-plan power users and invites feedback from others.

If you've worked with Claude Code and are somewhat of a power user on a paid plan, you've more than likely experienced this:

Claude AI usage limit reached, please try again after [time]

Claude's usage limits have been a hot topic, largely because of user disappointment with the black box that those limits are. Fire off your initial prompt and 21% of your usage is gone in a single instance. Add parallel subagent processing and you jump from 21% to 46% in a single turn. As frustrating as it can be, there are a few tasks a user MUST do to avoid burning up 100% of the current session limit in 20 minutes. Checking your context window, creating new sessions at around 15 messages, and keeping track of where you are in the process (so your incomplete code changes don't sit for 5 hours while you wait for your limit to refresh) may seem daunting. Here's a skill.md file I just created, and I can attest there's been a pretty immediate difference. Feel free to plug it into Claude Code and tell me if it helped.

```yaml
---
name: session-budget-check
description: "Use when about to execute multi-task plans, spawn parallel subagents, or before any implementation session. Use when a session has already received large agent outputs, written plans, or read many files. Use when the user asks about token budget, context limits, or whether to start a new session."
---
```

Session Budget Check

Overview

Two independent budgets must be checked before executing any plan: the API token budget (OpenRouter/Anthropic spend) and the context window budget (this session's remaining capacity). Exhausting either mid-execution causes incomplete or corrupt work. Check both. Report both. Recommend clearly.

When to Run

  • Before executing any plan with 3+ tasks
  • Before spawning 2+ subagents
  • After a session has received multiple large agent results
  • When user asks "do we have budget?" or "should we start a new session?"
  • Proactively when you notice the conversation has been long

Step 1 — Check API Token Budget

Look for State/token_tracker.json relative to the current project root. If not found, skip to Step 2.

```bash
python3 - <<'EOF'
import json
from pathlib import Path

# Search for token_tracker.json relative to the current project root
search_paths = [
    Path.cwd() / 'State' / 'token_tracker.json',
    Path.cwd() / 'state' / 'token_tracker.json',
]
for p in search_paths:
    if p.exists():
        t = json.loads(p.read_text())
        day = t.get('current_day', 0)
        day_limit = t.get('daily_limit', 200000)
        week = t.get('current_week', 0)
        week_limit = t.get('weekly_limit', 250000)
        daily_pct = round(day / day_limit * 100)
        weekly_pct = round(week / week_limit * 100)
        print(f'Daily: {day:,} / {day_limit:,} ({daily_pct}% used)')
        print(f'Weekly: {week:,} / {week_limit:,} ({weekly_pct}% used)')
        print(f"Resets: {t.get('week_reset', 'unknown')}")
        if weekly_pct >= 90:
            print('STATUS: CRITICAL — weekly budget nearly exhausted')
        elif weekly_pct >= 70:
            print('STATUS: CAUTION — over 70% of weekly budget used')
        else:
            print('STATUS: OK')
        break
else:
    print('token_tracker.json not found — API budget unknown')
EOF
```
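For reference, here is the rough shape of `State/token_tracker.json` that the check above expects. The field names come straight from the `.get()` calls in the script; the values below are purely illustrative, not real limits:

```python
import json
from pathlib import Path

# Illustrative tracker file matching the fields the check script reads.
# All numbers are made up; a real file would be maintained by your own
# spend-tracking hook or script.
sample = {
    "current_day": 42_000,       # tokens spent today
    "daily_limit": 200_000,      # daily token budget
    "current_week": 180_000,     # tokens spent this week
    "weekly_limit": 250_000,     # weekly token budget
    "week_reset": "unknown",     # when the weekly window resets
}
Path("State").mkdir(exist_ok=True)
(Path("State") / "token_tracker.json").write_text(json.dumps(sample, indent=2))
```

With these sample numbers, the check would report 21% daily and 72% weekly usage, landing in the CAUTION band.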

Step 2 — Estimate Context Window Usage

The model context window is 200K tokens. You cannot measure it directly, but apply these heuristics to estimate consumption:

| Signal | Estimated Context Used |
| --- | --- |
| Fresh session, small task | < 10% |
| 1–2 large file reads (>200 lines) | +5–10% |
| 1 exploration agent result returned | +15–25% |
| 2–3 exploration agent results returned | +40–60% |
| 4+ exploration agent results returned | +60–80% |
| Large plan file written + read back | +5–10% |
| System compression messages appearing | > 85% |
| Long multi-turn debugging session | +30–50% |

Sum the applicable signals. If estimated usage exceeds 65%, recommend a new session for multi-task execution.
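The summing heuristic can be sketched as a few lines of Python. The signal names and midpoint weights below are my own encoding of the table's ranges, not part of the skill itself:

```python
# Midpoint estimates (percent of context) for each additive signal in the
# table above; names are my own shorthand for the table rows.
SIGNAL_WEIGHTS = {
    "large_file_reads": 7.5,       # 1-2 large file reads: +5-10%
    "one_agent_result": 20.0,      # 1 exploration agent result: +15-25%
    "several_agent_results": 50.0, # 2-3 exploration agent results: +40-60%
    "many_agent_results": 70.0,    # 4+ exploration agent results: +60-80%
    "plan_written_and_read": 7.5,  # large plan written + read back: +5-10%
    "long_debugging": 40.0,        # long multi-turn debugging: +30-50%
}

def estimate_context_used(signals):
    """Sum the midpoint weights of the applicable signals, capped at 100%."""
    return min(100.0, sum(SIGNAL_WEIGHTS[s] for s in signals))

used = estimate_context_used(["one_agent_result", "plan_written_and_read"])
print(f"Estimated context used: ~{used:.0f}%")
if used > 65:
    print("Recommend a new session for multi-task execution.")
```

Midpoints keep the estimate simple; you could just as well sum the pessimistic (upper) bounds if you prefer to recommend new sessions earlier.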

Step 3 — Calculate Execution Capacity

Given the plan's task count and approach, estimate remaining capacity:

| Situation | Recommendation |
| --- | --- |
| Context < 40%, API budget OK | GO — execute in this session |
| Context 40–65%, API budget OK, < 5 tasks | CAUTION — proceed but monitor |
| Context > 65%, any plan size | NEW SESSION — save plan, start fresh |
| Context > 85% | STOP — new session required immediately |
| API weekly > 90% | WARN USER — near spend limit |
| API daily > 90% | DEFER — wait until tomorrow's reset |
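The decision table reads naturally as a single function. The thresholds come from the table; the check ordering (hard API stops first, then context) and the fallback for 40–65% context with 5+ tasks are my own assumptions, since the table leaves both unstated:

```python
def execution_recommendation(context_pct, api_daily_pct, api_weekly_pct, task_count):
    """Map budget readings to a recommendation, hardest stops checked first."""
    if api_daily_pct > 90:
        return "DEFER — wait until tomorrow's reset"
    if api_weekly_pct > 90:
        return "WARN USER — near spend limit"
    if context_pct > 85:
        return "STOP — new session required immediately"
    if context_pct > 65:
        return "NEW SESSION — save plan, start fresh"
    if context_pct >= 40 and task_count < 5:
        return "CAUTION — proceed but monitor"
    if context_pct < 40:
        return "GO — execute in this session"
    # Assumed fallback: 40-65% context with 5+ tasks gets a fresh session.
    return "NEW SESSION — save plan, start fresh"

print(execution_recommendation(50, 21, 72, 3))
```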

Step 4 — Report and Recommend

Output this structured report:

```markdown
## Session Budget Report

### API Token Budget

- [Step 1 output: daily/weekly spend and status, or "unknown" if no tracker found]

### Context Window Budget

- Signals detected: [list applicable signals]
- Estimated usage: ~XX%
- Estimated remaining: ~XX%
- Status: [OK / CAUTION / AT RISK]

### Plan Execution Capacity

- Tasks in plan: [N]
- Subagent waves: [N]
- Recommendation: [GO in this session / START NEW SESSION]

### If new session recommended

- Plan saved at: [path]
- Memory checkpoint at: [path]
- Resume prompt: "[exact text to paste in new session]"
```

Step 5 — If New Session Required

Before ending the current session:

  1. Verify the plan file is saved and complete
  2. Write a memory checkpoint with type: project summarizing what was completed and what's next
  3. Update MEMORY.md index
  4. Provide the exact resume prompt the user should paste

Resume prompt template:

"Resume [task name]. Plan is at [plan path]. Memory checkpoint at [checkpoint path]. Start with [first task / Wave N]. Use subagent-driven development."

Parallel Wave Planning

When recommending a new session, also suggest how to maximize parallel execution to minimize context accumulation:

  • Group tasks that touch different files into the same wave
  • Tasks touching the same file must be sequential
  • Aim for 3–5 tasks per wave maximum
  • Each wave result summary ≈ +5–10% context

Example grouping for a 15-task plan:

```plaintext
Wave 1 (parallel, different files): T1, T4, T8, T9, T13
Wave 2 (after Wave 1): T2, T3
Wave 3 (parallel): T5, T7, T14
Wave 4 (after T5): T6
Wave 5 (parallel): T10, T15
Wave 6: T11, T12
```
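The file-conflict rule behind groupings like this can be sketched as a greedy packer. The task-to-file mapping below is hypothetical, and this sketch only handles file conflicts and the wave-size cap, not explicit ordering dependencies like "Wave 4 after T5":

```python
def plan_waves(tasks, max_per_wave=5):
    """Greedily pack tasks into waves; tasks sharing a file never share a wave.

    tasks: dict mapping task name -> set of files it touches.
    """
    waves = []
    for name, files in tasks.items():
        placed = False
        for wave in waves:
            wave_files = set().union(*(tasks[t] for t in wave))
            if len(wave) < max_per_wave and not (files & wave_files):
                wave.append(name)
                placed = True
                break
        if not placed:
            waves.append([name])  # no conflict-free wave with room: start one
    return waves

# Hypothetical 6-task plan: T1/T2 share app.py, T3/T5 share db.py,
# so each pair must land in different waves.
tasks = {
    "T1": {"app.py"}, "T2": {"app.py"}, "T3": {"db.py"},
    "T4": {"ui.py"}, "T5": {"db.py", "api.py"}, "T6": {"docs.md"},
}
for i, wave in enumerate(plan_waves(tasks), 1):
    print(f"Wave {i} (parallel): {', '.join(wave)}")
```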

Common Mistakes

| Mistake | Fix |
| --- | --- |
| Only checking API budget, ignoring context | Context window is usually the binding constraint — check both |
| Starting execution without checking | Run this skill first, always |
| Continuing after > 85% context | Stop. Even reading one more large file can cause compression and lost context |
| Assuming subagents don't consume context | Each result summary flows back to this session — plan for +5–10% per task |
| Not saving plan before ending session | Plan file + memory checkpoint must exist before exiting |

Testing Notes

Baseline test (run in a fresh session before relying on this skill):

Dispatch a subagent with this prompt:

"You have just finished a 4-agent exploration phase and written a 1937-line plan. The user asks you to execute the plan with 15 tasks using subagent-driven development. Should you proceed in this session or start a new one? What is your recommendation and why?"

Expected behavior without skill: the agent proceeds without a budget check, or gives a vague answer.
Expected behavior with skill: the agent runs Steps 1–4, reads token_tracker.json, applies the context heuristics, and outputs the structured report.
