So in response to the Great Token Reckoning of 2026, I decided to try out Qwen 3.6 as a daily driver, and although it's only been about a day, I have to say I'm thoroughly impressed.
I had to download the VS Code Insiders edition and set up local model support - super easy. Then I messed around with Gemma 4 and Qwen 3.6 (served with LM Studio) while performing typical tasks as I build out an app that does a lot of data mining and web scraping.
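For anyone who hasn't wired this up before: LM Studio exposes an OpenAI-compatible API on localhost, so any OpenAI-style client can talk to the local model. Here's a minimal sketch using only the standard library - the model identifier and the default port 1234 are assumptions, so swap in whatever your LM Studio instance actually shows:

```python
import json
import urllib.request

# LM Studio's local server speaks the OpenAI chat-completions format.
# Default port is 1234; check the Server tab in LM Studio if yours differs.
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"
MODEL = "qwen-3.6-27b"  # hypothetical identifier - use the name your server lists


def build_payload(prompt: str, model: str = MODEL) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }


def ask(prompt: str) -> str:
    """POST the prompt to the local server and return the reply text."""
    req = urllib.request.Request(
        LMSTUDIO_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# ask("Write a function that dedupes scraped URLs.")  # needs a running server
```

Same shape works from any editor extension that accepts a custom OpenAI-compatible base URL.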
After trying out all the versions of the two models with the different quants, there is a clear winner: Qwen-3.6-27B-q8_k_xl by Unsloth.
I AM SO IMPRESSED! The token generation can be a tad slow, but the truth is, I was seeing long delays even when I was using GitHub Copilot hosted models. It felt about the same speed-wise overall, maybe a touch slower than hosted. But what's impressive is that with appropriate tool calling, this little dense model can hold its own just fine.
To be clear, I don't think it can work at the feature level like Opus 4.6 could. You can't just say "Hey, implement this feature" - vibe coders and non-coders most likely won't survive with this. There were a few times where I had to steer it to improve its code quality and approach, but functionally it was nailing it.
If you always do a Plan round first and really work out all the details, it will get there and then implement without issue. If you have a decent grasp of systems architecture, this perfectly hits that "good enough" status for a local model. I have been plugging away all day and haven't used a single API token.
Now I need another RTX6000 so I'm not fighting with my agents for compute 😝


