- "Almost right" will make it past reviews.
- "Almost right" will pass tests and linters
- "Almost right" will make it in your codebase, and wait for the right mix of reasons to create a potential catastrophe.
Yes, AI tools enhance your work and empower you to "punch above your weight". But you also need discipline and practice, and you should give yourself permission to slow down and learn what is actually happening at a deeper level.
While the industry pushes relentlessly toward handing control over to “agents,” I propose a more measured approach: the default mode when working with LLMs should always be scrutiny and skepticism. Trust needs to be earned, not granted.
When LLMs work in areas where the training data is robust and plentiful, and the request is clearly architected with proper context, their accuracy rate is fairly high. Nevertheless, the real work happens in the nuance and the details, and these models are notorious for introducing application-breaking issues through seemingly innocuous additions or changes. Every response deserves a “trust, but verify” approach.
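To make that concrete, here's a minimal, hypothetical sketch (the function names and the scenario are mine, not taken from any real model output) of the kind of "almost right" code that can sail through a quick review:

```python
# Hypothetical example of "almost right" code an LLM might produce.
# It passes a one-off unit test and often a default linter run, yet hides
# a classic Python pitfall: the mutable default argument is created once,
# at definition time, so state silently leaks between unrelated calls.

def collect_tags(tag, tags=[]):          # BUG: the [] is shared across calls
    tags.append(tag)
    return tags

def collect_tags_fixed(tag, tags=None):  # Correct: a fresh list per call
    if tags is None:
        tags = []
    tags.append(tag)
    return tags

if __name__ == "__main__":
    # A naive single-call test sees identical, passing behavior...
    assert collect_tags("a") == ["a"]
    assert collect_tags_fixed("a") == ["a"]
    # ...but a second, independent call exposes the leaked state.
    print(collect_tags("b"))        # ['a', 'b']  <- surprise
    print(collect_tags_fixed("b"))  # ['b']       <- expected
```

A single-call test is green for both versions; only the second, independent call reveals the shared state, which is exactly why verification has to go deeper than "the tests pass".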
Anthropic themselves support this approach:
A security engineer highlighted the importance of experience when Claude proposed a solution that was “really smart in the dangerous way, the kind of thing a very talented junior engineer might propose.” That is, it was something that could only be recognized as problematic by users with judgment and experience.
Only by knowing how to code, and by practicing your coding (and your debugging, which I'm starting a course on) on a regular basis, will you build the skills to catch the "almost right" solutions these models produce, vet them properly, and make sure you're not pushing a time bomb to your repo! 💣