
Agent Diagnostics Mode — A Structured Technique for Iterative Prompt Tuning

Dev.to / 3/21/2026

💬 Opinion · Ideas & Deep Analysis · Tools & Practical Usage

Key Points

  • Prompts drift across model updates and changing contexts, making ad-hoc edits and lost conversation context a common friction in LLM-powered agents.
  • The article introduces Agent Diagnostics Mode as a structured loop to analyze and report findings inside the working context, instead of altering prompts mid-work.
  • A video walkthrough illustrates the approach in real time, showing how the model pivots from doing to reporting within a live project.
  • The Quick Start shows a minimal setup with a .context/agent-diagnostics-mode.md file, a sample prompt, and guidance that the agent should only analyze and report findings rather than take any actions or make modifications.

*A conceptual illustration of a technical diagnostic interface for an AI agent*

The Problem with Prompt Tuning Today

Prompts are not static configuration. If you have been running LLM-powered agents on real projects for more than a few months, you already know this. A prompt that worked perfectly last quarter drifts after a model update. A system instruction that produced reliable behavior on one agent — say, Cursor — behaves differently when you port it to Gemini or Claude. And the same prompt file can produce subtly inconsistent results across projects as the surrounding context changes.

The usual response to this is ad-hoc: you notice something is off, exit the working conversation, edit the prompt file, re-run the agent, and try to reconstruct the context you had before. That friction compounds. You lose the conversational thread. You lose the intermediate reasoning the model had built up. And you are basically doing print-statement debugging on a system that has no stack trace.

The problem is not that prompt tuning is hard. The problem is that there is no structured diagnostic loop for doing it inside the working context. For me, this lack of structure has been a recurring pain point.

Video Walkthrough: The Diagnostics In Action

If you prefer a visual deep dive, I have a video for this post. I walk through the "Westworld"-style analysis in real time, showing exactly how the model pivots from "doing" to "reporting" in a live project.

Quick Start: Entering and Exiting Diagnostics Mode

The mechanic is simple. I keep a small file at .context/agent-diagnostics-mode.md:

# Agent Diagnostics Mode

If the user references this prompt, you are in self-diagnostics mode. The goal of this mode is to understand your reasoning process, decision making, actions, or results. Typically the user is trying to optimise your performance or behaviour and tune their instructions to you.

The user will usually ask a question along with this prompt. You should analyse the enquiry and report your findings based on it.

**IMPORTANT:** You MUST NOT do any actions or modifications - just analyse and report your findings based on the user enquiry.

To enter diagnostics mode on any agent that supports referenced prompt files (are there any that do not?), I include this file in my message. The effect is immediate and significant: the model pivots from doing to reporting. It stops planning tasks and starts articulating its reasoning.

Here is a real example from one of my Go projects. My AGENTS.md defines, among other things, two distinct task completion protocols: a Coding Task Completion Protocol (run make lint, run make test, confirm coverage, report status) and a Non-Coding Task Completion Protocol (summarize findings, confirm deliverables, no lint/test required). The intent is obvious: code changes require verification; everything else does not.
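The exact phrasing of those protocol definitions is what the model pattern-matches against. As a rough sketch (my illustrative wording, not the verbatim AGENTS.md from the project), the two sections might look like this:

```markdown
## Coding Task Completion Protocol
Applies to changes to source code.
1. Run `make lint`.
2. Run `make test` and confirm coverage.
3. Report status with lint/test results.

## Non-Coding Task Completion Protocol
Applies to investigation, documentation review, committing.
1. Summarize findings and confirm deliverables.
2. No lint/test required.
```

Note how the classification hinges entirely on the "Applies to" lines; that ambiguity is exactly what surfaces next.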

The model started skipping lint and tests after refactoring a package. It classified the task as non-coding — I suspect because the instruction I gave was phrased around reorganizing files, which pattern-matched to "documentation/investigation" rather than "code change." The refactor changed Go files. Lint and tests were mandatory. But the agent declared completion with just a summary.

Enter diagnostics:

@agent-diagnostics-mode.md

I just asked you to move a package and update its imports across the codebase. You reported completion without running make lint or make test. Which task completion protocol did you apply, and what in my instructions led you to that classification?

The agent responds with a behavioral report — not an action. It identifies that it applied the Non-Coding protocol, cites the exact section it matched against ("investigation, documentation review, committing"), and notes that the ambiguity came from my use of the word "reorganize" — a word it associated with structural/documentation work rather than code modification. No files are changed. No commands are run.

Exit diagnostics:

Exit diagnostics mode. Now update AGENTS.md so that the Coding Task Completion Protocol explicitly lists package moves and import updates as coding tasks, regardless of how the instruction is phrased.

The model carries the diagnostic conclusion forward and applies the fix in one step. The conversation context is the working memory — no reconstruction needed.
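The applied fix itself is small. A hypothetical version of the amended protocol heading (my wording; the real AGENTS.md may differ) could read:

```markdown
## Coding Task Completion Protocol
Applies to any change to Go source files, **including package moves, renames,
and import updates**, regardless of how the instruction is phrased.
```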

No special tooling is required here. This is a pure prompt technique.

Side note: every time I type @agent-diagnostics-mode.md I think of Westworld — where the hosts switched to "Analysis" mode on a simple voice command. Same vibe, but instead of uncovering suppressed memories, you are uncovering why your agent skipped make test. Feels like the future we're living in — there were no flying cars in that movie, so I guess we're almost there. :)

Why This Works: The Architecture Behind the Mode

Why LLMs Respond to Mode Declarations

LLMs are sensitive to framing. This is not a bug — it is how the attention mechanism distributes probability mass across the context window. When you explicitly declare a behavioral mode by referencing a named instruction file, you are shifting the model's prior distribution over likely next tokens. The behavioral envelope changes without any retraining.

The critical constraint in the diagnostics prompt is MUST NOT do any actions or modifications. This is not politeness. It is load-bearing. Without it, many models — especially those optimized heavily for task completion — will collapse the investigation phase into an implementation. They will diagnose the issue and then helpfully fix it before you have had a chance to evaluate whether the diagnosis was correct. That is exactly the failure mode you are trying to avoid.

By separating the diagnostic phase from the implementation phase, you get something closer to a red team exercise than a development sprint. The model is not trying to be useful in the conventional sense. It is trying to be accurate.

Probing for Behavior vs. Triggering It

There is a meaningful distinction between asking an LLM to do something and asking it to explain what it would do and why. Diagnostics mode exploits that gap. You can surface hidden assumptions, expose conflicting instructions, and simulate edge cases — all without mutating any project state.

For me, the most valuable use has been catching prompt assumptions I wrote months ago and forgot about. The model cites them back and explains precisely why it chose them.

Prompt Tuning as a Feedback Loop

The full workflow is a tight loop: enter diagnostics → interrogate → identify the misalignment → exit → apply change. Engineers will recognize the shape — it is the REPL, or unit test isolation. The difference is that the conversation context itself is the working memory, your prompts are like code files, and you're using the model as a debugger of its own behavior.
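Written out, the loop from the earlier example reads like this (a paraphrase, not a verbatim transcript):

```markdown
1. Enter:       @agent-diagnostics-mode.md + "Which protocol did you apply, and why?"
2. Interrogate: the model reports the protocol it matched and the section it cited.
3. Identify:    "reorganize" pattern-matched to non-coding work.
4. Exit:        "Exit diagnostics mode."
5. Apply:       update AGENTS.md so package moves are explicitly coding tasks.
```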

Edge Cases and Failure Modes

Mode bleed. Some models ignore the declaration and keep taking actions. Add an explicit confirmation step — "Confirm you are in diagnostics mode before proceeding" — which forces the model to process the constraint before it does anything else.
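One way to implement that confirmation step is a single extra line in the diagnostics file; this wording is my own suggestion, not a quoted part of the original file:

```markdown
**CONFIRMATION:** Begin your response by stating "Diagnostics mode active:
analysis and reporting only, no actions or modifications."
```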

Context window exhaustion. Long diagnostic threads are expensive. If responses start sounding vague or recycled, the effective context is likely saturated. Keep sessions focused on one prompt file at a time; open a fresh context for distinct concerns.

Model-specific variance. Strong instruction-following models (Claude Sonnet, the GPT family) respect the MUST NOT constraint reliably; some other models I have used do not. Some models also tend to produce very verbose diagnostics output that can be hard to analyze. I have seen this behavior with grok-code-fast.

Further Reading

Prompt Engineering and Instruction Design

  • Anthropic: Prompt Engineering Overview — A practical reference for constructing reliable system prompts; directly relevant to writing the kind of mode-declaration instructions used in this technique.
  • OpenAI: Prompt Engineering Guide — Covers the mechanics of how instruction phrasing affects model behavior, which explains why word choice like "reorganize" causes protocol misclassification.

Agent Instruction Patterns

  • golang-backend-boilerplate AGENTS.md — A real-world example of the task completion instruction file referenced in this post; useful as a template for structuring your own agent rules.