Major drop in intelligence across most major models.

Reddit r/LocalLLaMA / 4/15/2026

💬 Opinion · Signals & Early Trends · Models & Research

Key Points

  • A Reddit user reports that, as of mid-April 2026, multiple major LLM services (including Claude, Gemini, z.ai, and Grok) appear to have degraded in reasoning ability and instruction-following, with slower responses and shallower, shorter outputs.
  • The post claims the models behave like they are in a “grumpy” mode—ignoring basic instructions, struggling with simple tasks, and producing shortened, shallower outputs.
  • To test the issue, the user compares GLM 5 performance on a rented H100 GPU versus the same model accessed via z.ai, stating the H100-hosted instance answered correctly while z.ai did not.
  • The user speculates that model quality may have been reduced operationally, for example by serving more aggressively quantized weights (perhaps around Q2) or making other changes behind the scenes.
  • The post suggests workarounds such as running locally, renting GPUs, or using services that allow selecting quantization levels to regain output quality.

As of mid-April 2026, I have noticed every model has had a major intelligence drop.

And no, I'm not just talking about ChatGPT.

Everything from Claude (even Sonnet, along with Opus) to Gemini, z.ai, and Grok seems to ignore basic instructions, struggle with simple tasks, take very long to respond, and produce output that feels deliberately shortened and shallow. Almost like it's in a "grumpy" mode. I tried this in incognito mode, so it's not my customization or memory influencing this.

It's like they deliberately want you to stop using their service. I guess our data is no longer needed. Just two weeks back these models were much smarter than this.

To test this I rented an H100 and tried GLM 5 with the same prompt (the drive-to-the-car-wash one) on both: the GLM 5 running on the rented GPU answered it correctly, while the one on z.ai did not.
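
For anyone who wants to repeat the comparison, here's a minimal sketch. It assumes both the self-hosted model (e.g. vLLM or llama.cpp serving GLM on the rented H100) and the hosted provider expose OpenAI-compatible chat endpoints; the base URLs, API key, and model ID below are placeholders, not real values.

```python
# Sketch: send one prompt to two OpenAI-compatible endpoints and eyeball the answers.
# Base URLs, API keys, and the model ID are placeholders -- point them at your own
# self-hosted server (e.g. vLLM on the rented H100) and the hosted provider.
from openai import OpenAI

PROMPT = "..."  # the same reasoning prompt, e.g. the car-wash one

endpoints = {
    "rented H100": OpenAI(base_url="http://<h100-ip>:8000/v1", api_key="not-needed"),
    "hosted API":  OpenAI(base_url="https://<provider>/v1", api_key="<your-key>"),
}

for name, client in endpoints.items():
    resp = client.chat.completions.create(
        model="glm-5",  # whatever model ID each endpoint actually serves
        messages=[{"role": "user", "content": PROMPT}],
        temperature=0,  # keep sampling randomness out of the comparison
    )
    print(f"--- {name} ---")
    print(resp.choices[0].message.content)
```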

Have they dropped the quantization really low, maybe down to Q2?

I guess going local, renting a GPU, or using a monthly AI service that lets you pick a quant level is the way to go.
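
If the quant level is the worry, one option is to load the exact quant you want locally, so nobody can change it under you. A sketch below, assuming llama-cpp-python and a GGUF file you downloaded yourself; the file name is illustrative, not a real release artifact.

```python
# Sketch, assuming llama-cpp-python and a GGUF you downloaded yourself.
# The file name is illustrative; pick the quant (Q8_0, Q4_K_M, Q2_K, ...) explicitly
# so the precision is under your control rather than the provider's.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/model-Q8_0.gguf",  # swap in the quant you want to test
    n_ctx=8192,        # context window
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Your test prompt here"}],
    temperature=0,
)
print(out["choices"][0]["message"]["content"])
```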

submitted by /u/DepressedDrift