unsloth - MiniMax-M2.7-GGUF is BROKEN (UD-Q4_K_XL) --> avoid usage

Reddit r/LocalLLaMA / 4/13/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage

Key Points

  • The Reddit poster claims that unsloth's published "MiniMax-M2.7-GGUF (UD-Q4_K_XL)" is most likely broken, based on a PPL measurement.
  • The poster argues that numerical problems such as NaN point to a rushed, unverified error in the quantization or the backend kernels, and criticizes unsloth for not validating before release.
  • For comparison, equivalent quants from other HF providers (aessedai/MiniMax-M2.7-Q5_K_M, ubergarm/MiniMax-M2.7-IQ5_K) showed no such errors.
  • The poster calls for validation steps such as "--validate-quants" and for transparent publication of PPL/KLD figures, as accepted in the GGUF quanting community.
  • The poster frames the problem as a lack of QA and pre-release checks, rather than topics like unsloth's "poisoned CUDA".
unsloth - MiniMax-M2.7-GGUF is BROKEN (UD-Q4_K_XL) --> avoid usage

I am already tired of this approach (from unsloth and others) of "let's be first, because we know people are starving for new models," while never caring - like most other quant creators - to prove that their quants are any good: checking PPL for catastrophic faults like "NaN", and measuring and publishing PPL and KLD figures.

The latest proof of this rush is their "UD-Q4_K_XL" quant of MiniMax-M2.7-GGUF, where a simple PPL measurement shows the model to be broken.

For the people asking what "NaN" means in a quant PPL measurement: it normally points to numerical issues in the backend kernels or in the quant itself. In this case, it's a rushed, never-checked quant error.
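The failure mode is easy to see in miniature: perplexity is the exponentiated mean negative log-likelihood over tokens, so a single NaN token score poisons the entire reported figure. A toy sketch (hypothetical `perplexity` helper, not llama.cpp code):

```python
import math

def perplexity(nll_values):
    """Exponentiated mean negative log-likelihood.
    NaN propagates through sum() and exp(), so one bad token
    (e.g. from a corrupted quant block producing garbage logits)
    turns the whole PPL result into NaN."""
    return math.exp(sum(nll_values) / len(nll_values))

# Healthy chunk: all token NLLs finite.
print(perplexity([2.1, 1.8, 2.4]))            # ≈ 8.17

# One NaN token NLL makes the reported PPL NaN.
print(perplexity([2.1, float("nan"), 2.4]))   # nan
```

This is why a NaN in the PPL log is a hard red flag rather than just a "slightly worse" quant: the measurement cannot recover once any token produces non-finite values.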

I have checked similar quants from other HF providers (aessedai/MiniMax-M2.7-Q5_K_M --> 157.226 GiB (5.906 BPW) and ubergarm/MiniMax-M2.7-IQ5_K --> 157.771 GiB (5.926 BPW)), and no such error is present.

But this is not about backend kernels, nor about unsloth's much-hyped "poisoned CUDA 13.2".

There are ways to catch these faults before publishing quants in a rush (like "--validate-quants", which checks and shows whether you've got "0" blocks in your quant).
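As a rough illustration of what such a check does (a toy sketch, not llama.cpp's actual `--validate-quants` implementation), a scan over fixed-size blocks of a dequantized tensor can flag any block that is entirely zero or contains non-finite values:

```python
import numpy as np

def find_zero_blocks(weights: np.ndarray, block_size: int = 32):
    """Illustrative validation sketch: walk a weight tensor in
    fixed-size blocks and return indices of blocks that are
    all-zero or contain NaN/Inf values (the kind of corruption
    a pre-publish check is meant to catch)."""
    flat = weights.ravel()
    bad = []
    for i in range(0, len(flat) - block_size + 1, block_size):
        block = flat[i:i + block_size]
        if not np.all(np.isfinite(block)) or np.all(block == 0):
            bad.append(i // block_size)
    return bad

# Toy tensor: 4 blocks of 32 values, with block 2 zeroed out.
w = np.ones(128, dtype=np.float32)
w[64:96] = 0.0
print(find_zero_blocks(w))  # [2]
```

A check like this costs seconds per tensor and would have caught the corrupted blocks before upload.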

Please, Unsloth: get in line with QA, follow the practices already accepted by the GGUF quanting community on HF, and transparently provide PPL and KLD data. At least do it internally as a hygiene measure to avoid such flops. Rush it not!
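For readers unfamiliar with the requested KLD metric: it compares the quantized model's next-token distribution against the reference model's, token by token. A minimal sketch with a hypothetical helper, assuming raw logits from both models (llama.cpp's `llama-perplexity` has its own built-in KLD mode):

```python
import numpy as np

def kl_divergence(p_logits: np.ndarray, q_logits: np.ndarray) -> float:
    """KL(P || Q) for one token position, where P is the reference
    model's distribution and Q the quantized model's, both given as
    raw logits. Softmax is computed with max-subtraction for
    numerical stability."""
    p = np.exp(p_logits - p_logits.max()); p /= p.sum()
    q = np.exp(q_logits - q_logits.max()); q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

# Identical logits -> KLD of exactly 0; a perturbed quantized
# distribution -> small positive KLD. Averaged over a corpus,
# this is the per-quant quality figure the post asks for.
ref = np.array([2.0, 1.0, 0.5, -1.0])
quant = ref + np.array([0.05, -0.05, 0.02, 0.0])
print(round(kl_divergence(ref, ref), 6))   # 0.0
print(kl_divergence(ref, quant) >= 0.0)    # True
```

Published mean KLD (plus PPL deltas against the full-precision model) is what lets users compare quants from different providers instead of trusting file sizes alone.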

~/llms/llama.cpp/build/bin/llama-perplexity -m ~/models/gguf/unsloth/MiniMax-M2.7-UD-Q4_K_XL/MiniMax-M2.7-UD-Q4_K_XL-00001-of-00004.gguf -f ~/models/wikitext-2-raw/wiki.test.raw -fa 1 -ctk f16 -c 512 -ngl 99 -b 512 -ub 512 --seed 1337 --chunks 250

https://preview.redd.it/aibi9wexnxug1.png?width=2553&format=png&auto=webp&s=fa33c0dca73a7903857c04329d1b009050e0fe6f

VS

~/llms/llama.cpp/build/bin/llama-perplexity -m ~/workbench/aessedai/MiniMax-M2.7-Q5_K_M/MiniMax-M2.7-Q5_K_M-00001-of-00005.gguf -f ~/models/wikitext-2-raw/wiki.test.raw -fa 1 -ctk f16 -c 512 -ngl 99 -b 512 -ub 512 --seed 1337 --chunks 250

https://preview.redd.it/r8uw2kj6oxug1.png?width=2553&format=png&auto=webp&s=cb3a88d929272b48f702f8831592bb4b9db9b767

submitted by /u/One-Macaron6752