llama.cpp's Preliminary SM120 Native NVFP4 MMQ Is Merged

Reddit r/LocalLLaMA / 4/29/2026

📰 News · Developer Stack & Infrastructure · Signals & Early Trends · Models & Research

Key Points

  • llama.cpp has merged a “preliminary” implementation of SM120-native NVFP4 MMQ (quantized matrix multiplication), opening a new low-level optimization path on supported NVIDIA GPUs; SM120 corresponds to consumer Blackwell cards such as the RTX 50 series (see the sketch after this list).
  • The change is introduced via a specific upstream pull request (PR #22196) in the llama.cpp repository.
  • GGUF model files in NVFP4-compatible quantizations already appear to be emerging on Hugging Face, including Gemma-4 and Nemotron variants.
  • This rapid community follow-through on the new quantization format may accelerate local inference experimentation.
  • While labeled preliminary, the merge indicates active development momentum toward broader GPU-native performance features in llama.cpp.
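
For a concrete sense of what NVFP4 stores, here is a minimal C++ sketch of dequantizing one NVFP4 micro-block: 16 E2M1 (4-bit float) values packed two per byte, multiplied by a per-block scale. The nibble packing order and the plain-float scale are illustrative assumptions (the real format uses an FP8 E4M3 block scale), and llama.cpp's actual SM120 MMQ kernels run on tensor-core fragments rather than a scalar loop like this.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>

// E2M1 magnitude table: the eight non-negative values representable by a
// 4-bit float with 2 exponent bits and 1 mantissa bit.
static const float kE2M1[8] = {0.0f, 0.5f, 1.0f, 1.5f, 2.0f, 3.0f, 4.0f, 6.0f};

// Decode one 4-bit code: high bit is the sign, low 3 bits index the table.
static float decode_fp4(uint8_t code) {
    float mag = kE2M1[code & 0x7];
    return (code & 0x8) ? -mag : mag;
}

// Dequantize a 16-element micro-block: 8 packed bytes, two FP4 codes per
// byte (low nibble first -- an assumption), all scaled by block_scale.
static std::array<float, 16> dequant_block(const uint8_t packed[8],
                                           float block_scale) {
    std::array<float, 16> out{};
    for (int i = 0; i < 8; ++i) {
        out[2 * i + 0] = decode_fp4(packed[i] & 0x0F) * block_scale;
        out[2 * i + 1] = decode_fp4(packed[i] >> 4)   * block_scale;
    }
    return out;
}

int main() {
    // Example: each byte packs codes 0x1 (+0.5) and 0x9 (-0.5); with a
    // block scale of 2.0 the decoded values alternate +1.0 / -1.0.
    uint8_t packed[8];
    for (auto &b : packed) b = 0x91;
    for (float v : dequant_block(packed, 2.0f)) printf("%g ", v);
    printf("\n");
}
```

The lookup table above is the entire value range an E2M1 element can take, which is why the per-block scale carries so much of the format's accuracy; the appeal of an SM120-native MMQ path is doing this decode-and-multiply inside the tensor-core pipeline instead of dequantizing to a wider type first.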