Gemma 4 26b is the perfect all around local model and I'm surprised how well it does.

Reddit r/LocalLLaMA / 4/5/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A user testing local LLMs on a 64GB RAM Mac found that Gemma 4 26B was fast and consistently able to generate a working Doom-style raycaster in HTML/JavaScript after only a few prompts.
  • Compared with Qwen 3 Coder (especially 4-bit), the user reports Gemma 4 was less likely to overload the system, less prone to tool-call parameter loops, and more reliable at completing the task.
  • The user also reports that Qwen 3.5 (around 30B MoE) failed to complete the same test, getting stuck in extended “thinking” loops and repeatedly rewriting the same file without finishing.
  • The overall takeaway is a strong impression that Gemma 4 26B is a practical "all-around" local model for coding workflows, and that local models are becoming increasingly competitive over time.
  • The post is framed as experiential/benchmark-by-task rather than a formal evaluation, focusing on responsiveness, stability, and code completion on consumer hardware.

I got a 64GB memory Mac about a month ago and I've been trying to find a model that is reasonably quick, decently good at coding, and doesn't overload my system. The test I've been running is having it create a Doom-style raycaster in HTML and JS.
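For context on what this test asks a model to produce: the core of a Doom-style raycaster is a per-screen-column distance calculation against a grid map, usually done with a DDA (digital differential analyzer) march. A minimal sketch of that core in plain JavaScript follows; the map layout, function names, and coordinates here are illustrative assumptions, not taken from the post, and a real version would loop over columns and draw scaled wall slices to a canvas.

```javascript
// Minimal grid-based raycaster core (DDA): cast one ray from the
// player's position and return the distance to the nearest wall cell.
// 1 = wall, 0 = empty floor. Map indexed as MAP[y][x].
const MAP = [
  [1, 1, 1, 1, 1],
  [1, 0, 0, 0, 1],
  [1, 0, 0, 0, 1],
  [1, 1, 1, 1, 1],
];

function castRay(px, py, angle) {
  const dx = Math.cos(angle), dy = Math.sin(angle);
  let mapX = Math.floor(px), mapY = Math.floor(py);
  // Distance the ray travels to cross one full grid cell in x / y.
  const deltaX = Math.abs(1 / dx), deltaY = Math.abs(1 / dy);
  let stepX, stepY, sideX, sideY;
  if (dx < 0) { stepX = -1; sideX = (px - mapX) * deltaX; }
  else        { stepX = 1;  sideX = (mapX + 1 - px) * deltaX; }
  if (dy < 0) { stepY = -1; sideY = (py - mapY) * deltaY; }
  else        { stepY = 1;  sideY = (mapY + 1 - py) * deltaY; }
  // March cell by cell, always advancing along the nearer boundary,
  // until the ray enters a wall tile.
  let side;
  while (true) {
    if (sideX < sideY) { sideX += deltaX; mapX += stepX; side = 0; }
    else               { sideY += deltaY; mapY += stepY; side = 1; }
    if (MAP[mapY][mapX] !== 0) break;
  }
  // Perpendicular distance (not euclidean) avoids the fisheye effect.
  return side === 0 ? sideX - deltaX : sideY - deltaY;
}

// Looking along +x from (1.5, 1.5): the wall column is at x = 4,
// so the ray travels 2.5 cells.
console.log(castRay(1.5, 1.5, 0)); // → 2.5
```

In a full renderer, each screen column gets a ray at a slightly different angle across the field of view, and the returned distance sets the height of the wall slice drawn for that column — which is why a working one-file HTML/JS version makes a decent quick test of a coding model.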

I've been told Qwen 3 Coder Next was the king, and while it's good, the 4-bit variant always put my system near the edge. I don't know if it was because of the 4-bit quantization, but it would also miss tool uses and get stuck in a loop guessing the right params. In the Doom test it would usually get there and make something decent, but only after getting stuck in a loop of bad tool calls for a while.

Qwen 3.5 (the near-30B MoE variant) could never do it in my experience. It always got stuck in a thinking loop and then became so unsure of itself that it would just rewrite the same file over and over and never finish.

But Gemma 4 just crushed it, making something that worked after only 3 prompts. It was very fast too. It also limited its thinking and didn't get too lost in details; it just did it. It's the first time I've run a local model and been actually surprised that it worked great, without any weirdness.

It makes me excited about the future of local models, and I wouldn't be surprised if in 2-3 years we'll be able to use very capable local models that can compete with the sonnets of the world.

submitted by /u/pizzaisprettyneato