AI Navigate

How bad is 1-bit quantization but on a big model?

Reddit r/LocalLLaMA / 3/11/2026

📰 News · Tools & Practical Usage

Key Points

  • The user is interested in running the Qwen3.5-397B-A17B model and noticed the IQ1_S and IQ1_M quantized versions are significantly smaller in size.
  • They are questioning the performance degradation or quality loss due to 1-bit quantization on such a large model compared to the original full-precision model.
  • They are also curious whether these 1-bit quantized models are comparable in effectiveness to smaller Qwen models like the 122B or 35B parameter versions.
  • The inquiry relates to practical considerations for deploying very large language models with quantization techniques that reduce model size and resource requirements.
  • This reflects ongoing community interest in balancing model size, quantization, and inference quality for large-scale AI models.
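The size gap the poster noticed follows from simple arithmetic: a model file is roughly parameter count times average bits per weight, divided by eight. A minimal sketch of that estimate, assuming the 397B total parameter count implied by the model name and rough average bits-per-weight figures for common llama.cpp quant types (real GGUF files mix quant types per tensor, so these are approximations):

```python
# Back-of-envelope: on-disk size ≈ parameters * bits-per-weight / 8,
# ignoring metadata overhead. The bpw values below are rough averages
# for llama.cpp quant types (assumption, not exact per-file figures).

def quant_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

n_params = 397e9  # total parameter count implied by "397B" in the name

# (name, approximate average bits per weight) — rough assumptions
for name, bpw in [("FP16", 16.0), ("Q8_0", 8.5), ("Q4_K_M", 4.8),
                  ("IQ2_XXS", 2.1), ("IQ1_M", 1.75), ("IQ1_S", 1.56)]:
    print(f"{name:8s} ~{quant_size_gb(n_params, bpw):6.0f} GB")
```

At roughly 1.56 bits per weight, IQ1_S would shrink a 397B model from ~794 GB at FP16 to under 80 GB, which explains both its appeal and why quality loss is the open question.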

I'm planning on running Qwen3.5-397B-A17B and saw that the IQ1_S and IQ1_M quants are quite small in size. How bad are they compared to the original, and are they comparable to, say, Qwen3.5 122B or 35B?

submitted by /u/FusionBetween