Your Agent Isn't Reflecting. It's Performing Reflection.

Dev.to / 4/26/2026


Key Points

  • Many agent frameworks appear to “reflect” by generating an initial answer and then producing a corrected one, but this is typically just sampling the same model twice with critique-shaped prompting, not true metacognition.
  • Because the reviewer is the same network with no privileged information and no separate critic, the reflection step repeats the same blind spots and can produce confident justifications of wrong answers at extra cost.
  • Reflection-style chains can be genuinely useful only when conditioning changes between rounds—such as decomposing tasks, adding new tool/retrieval outputs, or using different temperatures/system prompts to alter distributions.
  • The most reliable error-catching comes from asymmetric criticism: a different model/scaffold, or a verifier with real signals like tests, search results, or clarifying user input.
  • If an agent’s reflect() step doesn’t change its inputs, the article recommends deleting it and measuring the token/cost difference: quality is unlikely to drop, and spend will.

Watch any modern agent framework long enough and you'll see it: the model produces output, then "reflects" on the output, then "corrects" itself, then ships a final answer.

It looks like metacognition. It isn't.

It's the same model, with the same weights, sampled twice, with the second sample conditioned on the first. There is no separate critic. There is no privileged vantage point. The reviewer and the reviewed are the same network — and the reviewer cannot see anything the original didn't already encode.

This is reflection theatre.

What actually happens

When you prompt "now critique your previous answer," the model does not consult a deeper layer of itself. It re-decodes from the same distribution, with a critique-shaped prefix. The output looks like self-correction because the prompt biases it toward correction-shaped tokens.

If the original answer was wrong because the model lacked the relevant fact, the reflection step also lacks the fact. You get fluent confidence about a wrong answer, then fluent confidence about why that wrong answer was right.

More tokens. Same blind spots. Higher bill.
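
For concreteness, here is roughly what the loop reduces to. This is a minimal sketch, not any framework's real API: `call_model` is a hypothetical stand-in for whatever chat-completion client you use.

```python
def call_model(messages: list[dict], temperature: float = 0.7) -> str:
    """Hypothetical wrapper around a single chat model. Same weights every call."""
    raise NotImplementedError("wire up your provider's client here")


def reflect_theatre(question: str) -> str:
    # Round one: the model answers from whatever its weights already encode.
    draft = call_model([{"role": "user", "content": question}])

    # Round two: the *same* model, re-decoding with a critique-shaped prefix.
    # No new facts enter the context; only the prompt shape changes.
    return call_model([
        {"role": "user", "content": question},
        {"role": "assistant", "content": draft},
        {"role": "user", "content": "Critique your previous answer and correct any errors."},
    ])  # correction-shaped tokens, same blind spots
```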

Where it actually helps

Reflection-style chains do help in narrow cases:

  • When the task is decomposable and the model can re-attack a sub-step (e.g. arithmetic with a working pad).
  • When you change the input between rounds — adding tool output, retrieval, a different role.
  • When the model is sampled with different temperature or different system prompts to force genuinely different distributions.

In all three cases, the gain comes from the change in conditioning, not from "reflection" as a capability.
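
A minimal sketch of the useful variant, reusing the hypothetical `call_model` stub from above: the second round is conditioned on new evidence (here a `retrieve` hook, also hypothetical), so the distribution genuinely changes between passes.

```python
def retrieve(query: str) -> str:
    """Hypothetical retrieval hook: search index, tool call, docs lookup."""
    raise NotImplementedError


def revise_with_new_evidence(question: str) -> str:
    draft = call_model([{"role": "user", "content": question}])

    # The conditioning change: fetch evidence the first pass never saw.
    evidence = retrieve(question)

    return call_model([
        {"role": "user", "content": question},
        {"role": "assistant", "content": draft},
        {"role": "user", "content": f"Here is retrieved evidence:\n{evidence}\n"
                                    "Revise your answer where the evidence contradicts it."},
    ])
```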

If nothing changes between round one and round two except the word "reflect," you are watching a more expensive way to produce the same answer.

The honest pattern

What actually catches errors is asymmetric criticism: a different model, a different prompt scaffold, a verifier with a real signal (tests passing, a search result, a user clarification). The reviewer needs information the original didn't have.
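
For a coding agent, that pattern might look like the sketch below, again using hypothetical stubs (`call_model` as before, plus a `run_tests` verifier you would wire to pytest or whatever harness fits your task). The point is that the accept/reject signal comes from outside the model's own distribution.

```python
def run_tests(code: str) -> tuple[bool, str]:
    """Hypothetical verifier: execute the project's test suite against the
    candidate code and return (passed, failure_report)."""
    raise NotImplementedError


def generate_until_verified(task: str, max_rounds: int = 3) -> str:
    code = call_model([{"role": "user", "content": task}])
    for _ in range(max_rounds):
        passed, report = run_tests(code)
        if passed:
            return code
        # The critique is grounded in a signal the model did not already
        # encode: actual test failures, not a second pass of itself.
        code = call_model([
            {"role": "user", "content": task},
            {"role": "assistant", "content": code},
            {"role": "user", "content": f"These tests failed:\n{report}\nFix the code."},
        ])
    return code  # best effort after max_rounds; the caller decides what to do
```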

"Same model, second pass" is the cheapest possible critic, and you get what you pay for.

If your agent loop has a reflect() step that doesn't change the inputs, delete it and price-check the difference. You will probably not lose quality. You will definitely save tokens.
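
One rough way to run that price check, assuming a hypothetical `total_tokens` harness that sums the per-call usage your provider already reports (prompt plus completion tokens; field names vary by API):

```python
from typing import Callable

def total_tokens(agent: Callable[[str], str], eval_set: list[str]) -> int:
    """Hypothetical harness: run each task and sum the provider-reported
    token usage across every call the agent makes."""
    raise NotImplementedError


def price_check(with_reflect: Callable[[str], str],
                without_reflect: Callable[[str], str],
                eval_set: list[str]) -> None:
    a = total_tokens(with_reflect, eval_set)
    b = total_tokens(without_reflect, eval_set)
    print(f"reflect() overhead: {a - b} tokens ({a / max(b, 1):.2f}x)")
```

Pair this with whatever quality metric you already trust on the same eval set; if the score holds while the token count drops, the reflect() step was theatre.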