Are i-Quants overrated?

Reddit r/LocalLLaMA / 4/14/2026

💬 Opinion

Key Points

  • The post argues that i-Quants (using an imatrix for quantization) can make lower-bit models (e.g., Q4_K_XL) behave more like higher-bit variants (e.g., Q6_K) on many English-focused tasks.

We all know the modern "intelligent" quantization that uses an imatrix to make a Q4_K_XL model feel like a Q6_K.
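For anyone who hasn't looked under the hood: the imatrix records how strongly the calibration data activates each weight column, and the quantizer then minimizes error weighted by those statistics, so rounding error on rarely-activated weights is cheap. A minimal Python sketch of that idea (a toy of my own, not llama.cpp's actual kernels; the grid search and the diagonal importance vector are deliberate simplifications):

```python
import numpy as np

def quantize_block(w, importance, bits=4, n_grid=32):
    """Round-to-nearest quantization of one weight block, picking the
    scale that minimizes IMPORTANCE-weighted squared error. `importance`
    plays the role of the per-column activation statistics an imatrix
    stores (a diagonal simplification, for illustration only)."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for signed 4-bit
    base = np.max(np.abs(w)) / qmax       # naive abs-max scale
    best_s, best_err = base, np.inf
    for k in range(n_grid):               # search shrinking scales
        s = base * (1.0 - 0.5 * k / n_grid)
        q = np.clip(np.round(w / s), -qmax - 1, qmax)
        err = np.sum(importance * (w - s * q) ** 2)
        if err < best_err:
            best_s, best_err = s, err
    return best_s, np.clip(np.round(w / best_s), -qmax - 1, qmax)

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)
# Importance concentrated on a few channels, the way an English-heavy
# corpus would produce it: channels the calibration text never
# exercises get low weight, so their rounding error is ignored.
imp = np.ones(256, dtype=np.float32)
imp[:32] *= 50.0
s, q = quantize_block(w, imp)
print(f"scale {s:.4f}, weighted err {np.sum(imp * (w - s*q)**2):.2f}")
```

The catch described below falls straight out of this: the scale is tuned for whatever the calibration data activated, so inputs that light up the down-weighted channels see the full, unoptimized rounding error.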

But here is what I notice: while this works well on most English tasks, the effect can be reversed on other languages or niche tasks.

The reason is quite simple, and you will see it quickly when you look at what is behind the imatrix file: roughly 80% English, mostly basic tasks and some code. Few imatrix files are the product of thoughtful engineering.
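Worth noting: the imatrix .dat file itself is binary activation statistics, so what you actually read is the calibration corpus it was computed from. If you want to sanity-check the language mix of such a corpus yourself, a crude heuristic is the share of plain-ASCII characters (the filename is a placeholder; English prose sits near 1.0, non-Latin scripts pull it far down):

```python
def ascii_ratio(path: str) -> float:
    """Share of plain-ASCII characters in a text file; a rough proxy
    for how English/Latin-heavy a calibration corpus is."""
    text = open(path, encoding="utf-8", errors="replace").read()
    return sum(c.isascii() for c in text) / max(len(text), 1)

# "calibration.txt" is a placeholder for whatever text the imatrix
# was actually computed from.
print(f"ASCII share: {ascii_ratio('calibration.txt'):.1%}")
```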

That's why I mostly use classic Q4_K_M again these days.

There's one exception, of course:
When you go all the way down to Q1 or Q2, even a poor imatrix is better than no calibration at all, because the air gets very thin down there and the models are usually only usable in English anyway.
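To make that concrete, here is a continuation of the toy sketch from above (it reuses quantize_block, w, imp, and rng, so it is not standalone): score two scale choices against the "true" importance, one picked with uniform importance (no calibration) and one picked with a deliberately noisy importance vector (a poor imatrix). The exact numbers depend on the seed; the point is only that at 2-bit the levels are so coarse that even a sloppy estimate tends to help:

```python
# Continues the sketch above. "Poor imatrix" = noisy estimate of the
# true channel importance; "no calibration" = uniform importance.
# Both resulting scales are scored against the TRUE importance `imp`.
noisy_imp = imp * rng.lognormal(sigma=1.0, size=imp.shape)
for bits in (4, 2):
    s_u, q_u = quantize_block(w, np.ones_like(w), bits=bits)
    s_n, q_n = quantize_block(w, noisy_imp, bits=bits)
    err_u = np.sum(imp * (w - s_u * q_u) ** 2)
    err_n = np.sum(imp * (w - s_n * q_n) ** 2)
    print(f"{bits}-bit  no calibration: {err_u:9.2f}  poor imatrix: {err_n:9.2f}")
```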

What do you guys think? Similar or different experience?

submitted by /u/PromptInjection_