Qwen3.6-27B KLDs - INTs and NVFPs
Reddit r/LocalLLaMA / 4/23/2026

"Will do more, but here's a start, as you're choosing your models. Remember, USE-CASE is important. As more come online I will add more to the graph. The more you know about the right quant for you, the better the odds you grab it the first time!"
💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage
Key Points
- The post shares initial KL-divergence (KLD) comparisons across quantization variants of Qwen3.6-27B, emphasizing that the right choice of quant depends heavily on the intended use case.
- It highlights that the THoTD NVFP variant is larger because it uses an NVFP4A16 configuration versus NVFP4(A4), and suggests NVFP4(A4) may perform better under batching since it stays in 4-bit throughout.
- It notes a significant size jump for Cyan when moving from INT4 to BF16-INT4, raising a trade-off question between mixed-precision accuracy gains and increased memory/context cost.
- The author indicates they will add more data to the graph as additional variants become available, encouraging readers to pick the correct quant the first time.
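The size differences the points above describe come down to average bits per weight. A rough back-of-envelope sketch (illustrative only; the post does not give exact file sizes, and real quant files add scales, zero-points, and mixed-precision layers on top of the raw bit-width):

```python
# Approximate weight-storage footprint for a 27B-parameter model at
# different average bit-widths. Hypothetical illustration -- not the
# actual Qwen3.6-27B file sizes from the post.
PARAMS = 27e9  # parameter count

def weight_gib(bits_per_weight: float) -> float:
    """Weights-only storage in GiB at a given average bit-width."""
    return PARAMS * bits_per_weight / 8 / 2**30

# INT4 quants often average slightly above 4 bpw once per-block
# scales are counted; 4.5 here is an assumed figure for illustration.
for name, bpw in [("BF16", 16), ("INT8", 8), ("INT4", 4.5), ("NVFP4(A4)", 4.0)]:
    print(f"{name:>10}: ~{weight_gib(bpw):.1f} GiB")
```

Halving the bit-width roughly halves weight memory, which is why a jump from INT4 to a BF16-INT4 mix (keeping some layers at 16-bit) shows up as a noticeable size increase, and why VRAM saved on weights is VRAM available for context.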