Are there any coding benchmarks for quantized models?

Reddit r/LocalLLaMA / 4/8/2026

💬 Opinion · Signals & Early Trends · Models & Research

Key Points

  • The author asks whether there are credible, up-to-date coding benchmarks specifically for quantized (low-bit) LLMs used with coding agents.
  • They report that while recent dynamic quantization can improve speed, different quantization methods/levels can cause inconsistent or “odd” coding/agentic behavior across models.
  • They want leaderboard-style evaluation of quantized models on common coding benchmarks (e.g., the SWE-Bench family, LiveCodeBench V6) rather than generic metrics like KLD, perplexity, or MMLU.
  • They note that accessible alternatives like HumanEval are less suitable because it is open-loop and not truly agentic, reinforcing the need for benchmark setups that reflect agent behavior.
  • The post claims they have only found outdated or incomplete benchmark data, suggesting a gap in the community’s measurement and reporting for quantized coding performance.

I tinker a lot with local LLMs and the coding agents that use them. Some models I want to use are either too big to run on my HW (I'm looking at you, MiniMax-M2.5) or too slow to be practical (<50 tok/s is painful), so I'm picking low-bit quants. Recent dynamic quants seem to perform rather well and can be fast, but sometimes I see odd behaviour when I get them to code. Different models at different quantization methods and levels seem to have their agentic coding abilities affected differently.

It would be great to see some kind of leaderboard for the major coding benchmarks (the SWE-Bench family, LiveCodeBench V6, that sort of thing), not just KLD, perplexity, and MMLU. I'd even take HumanEval, albeit begrudgingly, as it's open-loop, not agentic.
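For context, the "open-loop" distinction can be made concrete with a minimal pass@1-style harness sketch: each quant gets exactly one shot per task, and the completion is executed against fixed tests with no feedback or retry (everything below is illustrative, not any official benchmark harness):

```python
import subprocess
import sys
import tempfile

def run_candidate(completion: str, test_code: str, timeout: float = 5.0) -> bool:
    """Execute one model completion plus its unit tests in a subprocess.
    Returns True if the tests pass (exit code 0) within the timeout."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(completion + "\n\n" + test_code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path], capture_output=True, timeout=timeout
        )
        return proc.returncode == 0
    except subprocess.TimeoutExpired:
        return False

def pass_at_1(completions: list[str], tests: list[str]) -> float:
    """Fraction of tasks whose single completion passes its tests.
    Open-loop: one generation per task, no agentic tool use or iteration."""
    assert len(completions) == len(tests)
    passed = sum(run_candidate(c, t) for c, t in zip(completions, tests))
    return passed / len(completions)
```

Running the same task set through this for, say, a Q4 and a Q8 quant of the same model would give directly comparable numbers, but it deliberately captures none of the multi-turn, tool-driven behaviour where the odd quantization artifacts tend to show up.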

All I could find (and FWIW I also asked ChatGPT to do Deep Research for me) are some outdated and patchy numbers. Surely lots of people are scratching their heads over the same question as me, so why isn't there a leaderboard for quants?

submitted by /u/mr_il