For anyone interested in building their own GGUF quants, I’ve put together the GGUF-Tool-Suite docs and a simple web UI to make the process easier.
- Docs: https://github.com/Thireus/GGUF-Tool-Suite/tree/main/docs
- Web UI: https://gguf.thireus.com/quant_assign.html
The goal is to let anyone benchmark and automatically produce GGUFs of any size for ik_llama.cpp and llama.cpp, either through the web UI or the CLI.
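For anyone new to this: a GGUF produced by the suite loads like any other GGUF. A minimal sketch using llama.cpp's `llama-cli` (the model filename here is a placeholder, not an actual output of the tool):

```shell
# Run a locally produced quant with llama.cpp's CLI.
# ./my-model-Q4_K_M.gguf is a hypothetical output filename.
./llama-cli -m ./my-model-Q4_K_M.gguf -p "Hello" -n 64
```

ik_llama.cpp builds a binary with the same basic invocation, though recipes tuned for its extra quant types won't load in upstream llama.cpp.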
The tool suite has already been adopted by a number of enthusiasts looking for better GGUF quality and more flexibility to fit their hardware optimally. In my testing it has also produced higher-quality GGUFs than other popular releases, especially when using ik_llama.cpp recipes.
Benchmarking for Kimi-K2.5 and GLM-5.1 is coming soon, and the tool already supports quite a few models that have been benchmarked.