Layman's comparison on Qwen3.6 35b-a3b and Gemma4 26b-a4b-it

Reddit r/LocalLLaMA / 4/21/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage

Key Points

  • The post offers a layman-style comparison of two LLMs (Qwen3.6-35b-a3b and Gemma4 26b-a4b-it), describing Gemma 4 as reliable and Qwen3.6 as a stronger performer with extra expressive capacity.
  • The author reports that, on a 16GB VRAM GPU and using Windows LM Studio with recommended inference settings, both models run at comparable speed.
  • Specific quantized model variants are referenced (unsloth/gemma-4-26B-A4B-it-UD-Q4_K_S and AesSedai/Qwen3.6-35B-A3B IQ4_XS) to ground the performance comparison.
  • An edit notes that the author was using Gemma 4 incorrectly at first, and that improved results came after applying a better system prompt from another commenter.
  • The author invites readers to point out any strong disagreements, framing the post as community-driven evaluation rather than a definitive benchmark.

Gemma 4 26b-a4b-it is basically a solid B student that gets the job done.

Qwen3.6-35b-a3b is an A+ student with plenty of energy left after finishing the assignment to add some flair.

On my 16GB VRAM video card, both models run at comparable speed in Windows LM Studio using the recommended inference settings. Models used:

unsloth/gemma-4-26B-A4B-it-UD-Q4_K_S

AesSedai/Qwen3.6-35B-A3B IQ4_XS
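
If you want to poke at both models the same way, here is a minimal sketch that queries a model loaded in LM Studio through its local OpenAI-compatible server (LM Studio's default port is 1234). The model identifier, prompt, and temperature below are placeholders rather than the exact settings I used; check LM Studio's model list for the name of your loaded quant. The system message slot is also where a custom system prompt, like the one mentioned in the edit below, would go.

```python
# Minimal sketch: query a model loaded in LM Studio via its local
# OpenAI-compatible server (default port 1234; the api_key can be
# any non-empty string for a local server).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    # Hypothetical identifier; use whatever name LM Studio shows
    # for the loaded quant (e.g. the Qwen3.6 or Gemma 4 GGUF).
    model="qwen3.6-35b-a3b",
    messages=[
        # Swap in your preferred system prompt here.
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet in three sentences."},
    ],
    # Placeholder; use the model card's recommended inference settings.
    temperature=0.7,
)
print(response.choices[0].message.content)
```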

Any strong disagreements?

Edit: Apparently I've been using Gemma 4 wrong. Sadman782's comment and his system prompt really helped unlock some of Gemma 4's potential!

submitted by /u/LocalAI_Amateur