https://huggingface.co/unsloth/gemma-4-E2B-it-GGUF
https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF
by u/danielhanchen:
We just updated them again in response to:
- kv-cache : support attention rotation for heterogeneous iSWA https://github.com/ggml-org/llama.cpp/pull/21513
- CUDA: check for buffer overlap before fusing - CRITICAL fixes for `<unused24>` tokens https://github.com/ggml-org/llama.cpp/pull/21566
- vocab : add byte token handling to BPE detokenizer for Gemma4 https://github.com/ggml-org/llama.cpp/pull/21488
- convert : set "add bos" == True for Gemma 4 https://github.com/ggml-org/llama.cpp/pull/21500
- common : add gemma 4 specialized parser https://github.com/ggml-org/llama.cpp/pull/21418
- llama-model: read final_logit_softcapping for Gemma 4 https://github.com/ggml-org/llama.cpp/pull/21390
- llama: add custom newline split for Gemma 4 https://github.com/ggml-org/llama.cpp/pull/21406
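One of the PRs above reads the `final_logit_softcapping` hyperparameter from the model file. The soft-cap itself is the Gemma-family trick of squashing final logits through a scaled tanh so no logit can exceed the cap; the formula below follows the published Gemma 2 recipe (`cap * tanh(x / cap)`) and is a hedged sketch, not the llama.cpp source:

```python
import math

def softcap(logits: list[float], cap: float) -> list[float]:
    # Gemma-style final-logit soft-capping: cap * tanh(x / cap).
    # Near zero it is ~identity (tanh(x) ~ x), while large logits
    # are smoothly bounded to the open interval (-cap, cap).
    return [cap * math.tanh(x / cap) for x in logits]

capped = softcap([0.0, 1.5, 500.0], 30.0)
```

Small logits pass through almost unchanged, while extreme logits saturate just below the cap, which keeps sampling stable without the hard clipping that would flatten relative probabilities.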



