Still waiting for Artificial Analysis to publish an intelligence index, but from what I can see it's good. Gemma 26b a4b runs at the same speed on a Mac Studio M1 Ultra as Qwen3.5 35b a3b (~1000 tok/s prompt processing, ~60 tok/s generation at 20k context length, llama.cpp). And in my short test it behaves way, way better than Qwen, not even close. Gemma's chain of thought is concise, helpful and coherent, while Qwen does a lot of inner gaslighting and also loops a lot on default settings. Visual understanding is very good, and multilingual performance seems good as well. Tested Q4_K_XL on both.
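For a rough sense of what those rates mean end to end, here's a back-of-the-envelope calc. Assumes the ~1000/~60 tok/s figures above hold flat across the whole window, which they won't exactly (both degrade as context fills), so treat it as a lower bound:

    # Rough end-to-end latency from the measured rates above.
    # Assumption: rates stay constant over the context window.
    PP_TOK_S = 1000  # prompt processing, tokens/sec
    TG_TOK_S = 60    # token generation, tokens/sec

    def latency_s(prompt_tokens: int, output_tokens: int) -> float:
        return prompt_tokens / PP_TOK_S + output_tokens / TG_TOK_S

    # 20k-token prompt, 1k-token reply:
    print(f"{latency_s(20_000, 1_000):.1f} s")  # ~36.7 s total

Which is exactly why the prompt-caching question below matters: without cache reuse you pay that 20 s prefill on every turn.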
I wonder if mlx-vlm properly handles prompt caching for Gemma (it doesn't work for Qwen 3.5).
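I haven't verified the mlx-vlm path myself, but with plain text-only mlx-lm the pattern is straightforward; a minimal sketch, assuming the usual mlx-lm API (load, generate, make_prompt_cache) and a hypothetical model path:

    # Prompt-cache sketch with mlx-lm (text-only; the mlx-vlm vision
    # path is what's actually in question here and may behave differently).
    from mlx_lm import load, generate
    from mlx_lm.models.cache import make_prompt_cache

    # Hypothetical model repo, just for illustration.
    model, tokenizer = load("mlx-community/some-gemma-model-4bit")

    cache = make_prompt_cache(model)

    # First call pays the full prompt-processing cost and fills the cache.
    generate(model, tokenizer, prompt="<long 20k-token context here>",
             max_tokens=256, prompt_cache=cache)

    # Follow-up call reuses the cached prefix; only new tokens get processed.
    generate(model, tokenizer, prompt="Follow-up question...",
             max_tokens=256, prompt_cache=cache)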
Too bad its KV cache is gonna be monstrous, as the architecture doesn't implement any tricks to reduce it. Hopefully TurboQuant will help with that soon.
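To put a number on "monstrous": a full-attention fp16 KV cache grows linearly with layers × KV heads × head dim × context. A sketch with assumed hyperparameters (I haven't checked Gemma's actual config, so the numbers are illustrative only):

    # KV cache size for full attention, no MLA/sliding-window/KV-quant tricks.
    # Hyperparameters below are ASSUMED for illustration, not Gemma's real config.
    n_layers   = 48
    n_kv_heads = 8
    head_dim   = 128
    ctx_len    = 20_000
    bytes_per  = 2  # fp16/bf16 cache

    # 2x for the separate K and V tensors.
    kv_bytes = 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per
    print(f"{kv_bytes / 1024**3:.1f} GiB")  # ~3.7 GiB at these settings

And it keeps scaling linearly from there, so a full 128k context would be several times that.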
I expect censorship to be dogshit; I saw that e4b loves to refuse any and all medical advice. Maybe good prompting will mitigate that, since "heretic" and "abliterated" versions seem to damage performance in many cases.
No formatting because this is handwritten by a human for a change.