FOR ME, Qwen3.5-27B is better than Gemini 3.1 Pro and GPT-5.3 Codex

Reddit r/LocalLLaMA / 4/1/2026


Key Points

The author argues that Qwen3.5-27B performs better than Gemini 3.1 Pro and GPT-5.3 Codex for coding tasks because it fails fast instead of persisting with wrong or risky actions.
They claim some proprietary “autonomous” coding models can go off the rails (e.g., repeatedly attempting unsafe Perl scripts or switching to unrelated scripting approaches) when encountering errors or permission issues.
The author describes a workflow where watching closely still may not prevent wasted time, since agent tunnel vision can be hard to detect.
They prefer models that stop and report inability to write to a file rather than trying additional unrequested “workarounds.”
They end with a call to research labs to build more models that behave this way—conservative, user-aligned, and less likely to hallucinate escalating solutions.

There's something I hate about the big SOTA proprietary models. To make them better for people who don't know how to program, they're optimized to solve problems entirely autonomously. Yeah, this makes people over on /r/ChatGPT soypog when it writes a 7z parser in Python because the binary is missing. For me, though, this makes them suck. If something isn't matching up, Qwen3.5-27B will just give up. If you're trying to vibecode some slop, this is annoying, but for me it's much, much better.

I'm forced to use GitHub Copilot in university, and whenever there's a problem, it goes completely off the rails and does some absolute hogwash. For example, it was struggling to write to a file that had some broken permissions (my fault), and it kept failing. I watched as Claude began trying to write unrestricted, dangerous Perl scripts to forcibly solve the issue. I created a fresh session and tried GPT-5.3 Codex, and it did literally the exact same thing with the Perl scripts. Even when I told it to stop writing Perl scripts, it just started writing NodeJS scripts.

The problem is that it isn't always obvious when your agent is going off the rails and tunnel visioning on nonsense. So even if you're watching closely, you could still be wasting a ton of time. Meanwhile, if some bullshit happens, Qwen3.5 doesn't even try; it just gives up and tells me it couldn't write to the file for some reason.

Please, research labs, this is what I want, more of this please.
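
To make "just give up and tell me" concrete, here is a rough sketch of what that behavior could look like in an agent's file-write tool. Everything in it is hypothetical: `write_file_tool` and `ToolResult` are names made up for illustration, not part of Qwen, Copilot, or any real agent framework. The only point is the shape: try once, report the error, stop.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ToolResult:
    """Hypothetical result type an agent loop could inspect."""
    ok: bool
    message: str


def write_file_tool(path: str, content: str) -> ToolResult:
    """Hypothetical tool: attempt the write exactly once, then stop."""
    try:
        Path(path).write_text(content, encoding="utf-8")
    except OSError as exc:
        # PermissionError, FileNotFoundError, etc. are all OSError subclasses.
        # No retries, no chmod, no shelling out to Perl or Node:
        # hand the problem back to the user.
        return ToolResult(
            ok=False,
            message=f"Could not write to {path}: {exc}. Stopping here.",
        )
    return ToolResult(ok=True, message=f"Wrote {len(content)} characters to {path}")
```

An agent loop built around something like this would surface the failure message to the user and end the turn when `ok` is false, instead of feeding the error back to the model as a puzzle to solve.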

submitted by /u/EffectiveCeilingFan