Gemma 4 vs Qwen3.5: benchmarking quantized local LLMs on Go coding

Reddit r/LocalLLaMA / 4/11/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage · Models & Research

Key Points

  • The post compares local coding performance of two quantized mixture-of-experts (MoE) LLMs—Gemma 4 and Qwen 3.5—specifically under constrained laptop hardware conditions.
  • It focuses on benchmarking models under 40B parameters that fit within the laptop's limited memory bandwidth and compute budget.
  • The author notes that an additional baseline, GPT-OSS-20B, performed surprisingly well despite the same local-usage constraints.
  • The overall theme is practical evaluation of quantized local LLMs for Go coding workflows rather than large-scale deployment.
  • Results are presented as an ongoing hands-on experiment for improving how developers can run and test smaller local models for real coding tasks.

I'm continuing to play around with local LLMs on my Framework 13 laptop.

So, limited memory bandwidth and processing power means exploring quantized MoE models below 40B params.

Surprisingly (for me), gpt-oss-20B did pretty well.

submitted by /u/m3thos