I have a modest rig that lets me run Qwen 3.5 27B or even 35B via Ollama. Qwen has been amazing to work with, and I've been fine with the slow-drip trade-off.
Then Google released Gemma4.
It's fast - like 4B or 9B fast. Accuracy- and confidence-wise, it reminds me of that first release of Gemini Pro that could actually produce code that would run.
As a "local guy", this shift in usability and confidence for a small self-hosted LLM reminded me of what Deepseek brought to the table years ago with its thinking capability.
Give it a go when you have a chance, and apply the settings that Google recommends - it does make a difference (slightly slower, but better).
I tried a few releases, and this one worked the best across all the tests I threw at it: law interpretation, Python, brainstorming & problem solving.
bjoernb/gemma4-26b-fast:latest (not affiliated with whoever made this)
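If you want to bake the recommended sampler settings into the model itself, a custom Ollama Modelfile is one way to do it. A minimal sketch - the parameter values below are the sampling settings Google has published for earlier Gemma releases (temperature 1.0, top_k 64, top_p 0.95), so treat them as placeholders and check the current model card for whatever Google actually recommends for this release:

```
# Hypothetical Modelfile wrapping the tag above with pinned sampler settings
FROM bjoernb/gemma4-26b-fast:latest

# Values taken from earlier Gemma recommendations - verify against the model card
PARAMETER temperature 1.0
PARAMETER top_k 64
PARAMETER top_p 0.95
```

Then `ollama create gemma4-tuned -f Modelfile` gives you a local tag with the settings applied on every run, instead of passing them per session.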
In the next few days I'll start checking the abliterated versions to see how they stack up against Qwen on pentest & sysec tasks.