Hey r/LocalLLaMA, we conducted KL Divergence benchmarks for Gemma 4 26B-A4B GGUFs across providers to help you pick the best quant.
For HQ versions of the graphs (Reddit mobile compresses them), see: Gemma 4 Benchmarks and Qwen3.6 Benchmarks. We also updated our MLX quants to be more dynamic, with better layer selection (there are limitations due to MLX): see here.
Gemma 4 GGUFs: https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF Qwen3.6 GGUFs: https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF
Gemma 4 26B-A4B GGUF Benchmarks
Reddit r/LocalLLaMA / 4/20/2026
📰 News · Signals & Early Trends · Tools & Practical Usage · Models & Research
Key Points
- The post reports KL-Divergence benchmarks comparing Gemma 4 26B-A4B GGUF quantizations across providers to help users choose the best quantization.
- Mean KL Divergence results indicate that nearly all Unsloth GGUFs lie on the Pareto frontier, suggesting strong fidelity to the original BF16 output distribution.
- Unsloth is characterized as top-performing in 21 out of 22 sizes, with similarly strong trends observed at the 99.9th-percentile KLD.
- Unsloth updated several quant variants (e.g., Q6_K was made more dynamic, with similar updates for Qwen3.6) and notes that the newer versions may be slightly larger, though the prior ones remain usable without re-downloading.
- A new UD-IQ4_NL_XL quant option (14.6GB) is introduced to fit within 16GB VRAM for Gemma 4 (and similarly for Qwen3.6), positioned between smaller and larger UD-IQ4 variants.
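The benchmark metric above, mean KL divergence, measures how closely a quantized model's next-token distribution tracks the BF16 reference: lower is better, and 0 means identical outputs. The post does not publish its evaluation code, so the following is a minimal sketch of how such a metric is typically computed, assuming you have per-token logits from both models on the same prompts (the function names and array shapes here are illustrative, not Unsloth's actual harness):

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl(bf16_logits, quant_logits, eps=1e-12):
    """Mean KL(P_bf16 || P_quant) across token positions.

    Both inputs are (num_tokens, vocab_size) arrays of logits from the
    BF16 reference model and the quantized model on identical prompts.
    """
    p = softmax(bf16_logits)
    q = softmax(quant_logits)
    # Per-token KL divergence, summed over the vocabulary.
    kl_per_token = (p * (np.log(p + eps) - np.log(q + eps))).sum(axis=-1)
    return float(kl_per_token.mean())

# Toy check: identical logits give ~0 KL; perturbed logits give KL > 0.
rng = np.random.default_rng(0)
ref = rng.normal(size=(8, 32))
assert mean_kl(ref, ref) < 1e-9
assert mean_kl(ref, ref + rng.normal(scale=0.5, size=ref.shape)) > 0
```

A percentile of `kl_per_token` (e.g., the 99.9th) captures worst-case divergence on rare tokens, which is why digests like this one often report it alongside the mean.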