Anthropic admits to having made hosted models dumber, proving the importance of open-weight, local models

Reddit r/LocalLLaMA / 4/24/2026

💬 Opinion · Signals & Early Trends · Ideas & Deep Analysis · Industry & Market Moves

Key Points

  • Anthropic says it adjusted Claude Code’s default reasoning effort from “high” to “medium” to reduce long latency and UI freezes, but reverted the change after users said they’d prefer a higher-intelligence default with the option to opt into lower effort for simple tasks.
  • A later update intended to clear old “thinking” from idle sessions reduced resume latency, but a bug caused the behavior to repeat every turn and made Claude seem repetitive; Anthropic fixed it.
  • Anthropic also added a system prompt to reduce verbosity, but the combination with other prompt changes degraded coding quality, leading to a rollback.
  • The changes affected multiple Claude 4.x variants (Sonnet 4.6 and Opus 4.6 in every case, plus Opus 4.7 in the last) and are framed as tradeoffs that reduced quality in ways end users could not control.
  • The takeaway highlighted by commenters is that reliability and control may favor open-weight models that users can host themselves (or pay others to host) over hosted models whose behavior can change opaquely.

TL;DR:

On March 4, we changed Claude Code's default reasoning effort from high to medium to reduce the very long latency—enough to make the UI appear frozen—some users were seeing in high mode. This was the wrong tradeoff. We reverted this change on April 7 after users told us they'd prefer to default to higher intelligence and opt into lower effort for simple tasks. This impacted Sonnet 4.6 and Opus 4.6.

On March 26, we shipped a change to clear Claude's older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem forgetful and repetitive. We fixed it on April 10. This affected Sonnet 4.6 and Opus 4.6.

On April 16, we added a system prompt instruction to reduce verbosity. In combination with other prompt changes, it hurt coding quality and was reverted on April 20. This impacted Sonnet 4.6, Opus 4.6, and Opus 4.7.

In each of these cases, they made conscious choices to lower server load at the cost of quality, completely outside the end user's control and without informing their paying customers of the changes.

For me, this proves that if you depend on an AI model for your service or to do your job, the only sane choice is to pick an open-weight model that you can host yourself, or that you can pay someone to host for you.
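One practical upside of the self-hosting route the OP describes: tools like llama.cpp's `llama-server` and vLLM expose an OpenAI-compatible `/chat/completions` endpoint, so your application talks to weights you control and nothing changes unless you change it. A minimal sketch below, assuming a hypothetical local server on port 8080 and an illustrative model name (both are placeholders, not recommendations from the post):

```python
# Minimal sketch: calling a self-hosted open-weight model through an
# OpenAI-compatible endpoint (as exposed by llama.cpp's `llama-server`
# or vLLM). BASE_URL and the model name are assumptions for illustration.
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # hypothetical local endpoint


def build_chat_request(prompt: str, model: str = "local-model") -> dict:
    """Build the JSON payload for a /chat/completions call.

    Because the weights run locally, sampling settings and any system
    prompt are pinned by you -- no silent server-side changes.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }


payload = build_chat_request("Refactor this function to be iterative.")
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment once a local server is running
```

The point isn't the specific client code; it's that every knob the TL;DR describes Anthropic tuning server-side (reasoning effort, context trimming, system prompt) would live in your own config here.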

submitted by /u/spaceman_