Just tested Gemma 4 2B locally on an old RTX 2060 with 6 GB VRAM. I've used Qwen3.5 in all sizes intensively in customer projects before.
First impression of Gemma 4 2B: it's better, faster, and uses less memory than Qwen3.5 2B. More agentic, better Mermaid charts, better chat output, better structured output.
It seems like either the Qwen3.5 models are benchmaxed (although they really were much better than the competition) or Google is playing it down. Gemma 4 2B "seems" / "feels" more like Qwen3.5 9B to me.



