Claude cannot be trusted to perform complex engineering tasks

Reddit r/artificial / 4/13/2026

💬 Opinion · Signals & Early Trends · Ideas & Deep Analysis · Industry & Market Moves

Key Points

  • AMD’s AI director analyzed 6,852 Claude Code sessions and concluded that Claude “cannot be trusted” for complex engineering work, citing reduced reasoning (“thinking depth dropped 67%”), weaker code-reading behavior, and the model editing files without reading them first.
  • Reported violations increased (e.g., stop-hook violations from zero to about 10 per day), and Anthropic allegedly changed defaults from “high” to “medium” effort while adding “adaptive thinking” that can result in zero reasoning tokens on some turns.
  • Anthropic reportedly confirmed via shared transcripts that some turns allocated zero thinking tokens, and those same turns were associated with hallucinations.
  • AMD’s team said they already switched to another provider after the silent update, highlighting the operational risk of vendor lock-in when AI tooling changes without adequate notice.
  • The author argues the broader lesson is to avoid single-provider dependency by staying multi-model, using interfaces that support multiple vendors, and testing alternatives regularly as model capabilities shift quickly.

AMD’s AI director just analyzed 6,852 Claude Code sessions, 234,760 tool calls, and 17,871 thinking blocks.

Her conclusion: “Claude cannot be trusted to perform complex engineering tasks.”

Thinking depth dropped 67%. Code reads before edits fell from 6.6 to 2.0. The model started editing files it hadn’t even read.

Stop-hook violations went from zero to 10 per day.

Anthropic admitted they silently changed the default effort level from “high” to “medium” and introduced “adaptive thinking” that lets the model decide how much to reason.

No announcement. No warning.

When users shared transcripts, Anthropic’s own engineer confirmed the model was allocating ZERO thinking tokens on some turns.

The turns with zero reasoning? Those were the ones hallucinating.

AMD’s team has already switched to another provider.

But here’s what most people are missing.

This isn’t just a Claude story.

AMD had 50+ concurrent sessions running on one tool.

Their entire AI compiler workflow was built around Claude Code. One silent update broke everything.

That’s vendor lock-in. And it will keep happening.

→ Every AI company will optimize for their margins, not your workflow

→ Today’s best model is tomorrow’s second choice

→ If your workflow can’t survive a provider switch, you don’t have a workflow. You have a dependency

The fix is simple: stay multi-model.

→ Use tools like Perplexity that let you swap between Claude, GPT, and Gemini in one interface

→ Learn prompt engineering that works across models, not tricks tied to one

→ Test alternatives monthly because the rankings shift fast
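The “stay multi-model” advice above boils down to one architectural move: put a thin routing layer between your workflow and any single vendor, so switching providers is a config change rather than a rewrite. Here is a minimal sketch of that idea; the `ModelRouter` name and the stub providers are hypothetical, standing in for real vendor SDK clients wrapped behind a common callable signature.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class ModelRouter:
    """Hypothetical provider-agnostic router: each entry maps a model
    name to a callable taking a prompt and returning a completion.
    Real implementations would wrap each vendor's SDK behind this
    signature."""
    providers: Dict[str, Callable[[str], str]]
    default: str

    def complete(self, prompt: str, model: Optional[str] = None) -> str:
        # Fall back to the default model; swapping vendors means
        # changing one string, not rebuilding the workflow.
        name = model or self.default
        if name not in self.providers:
            raise KeyError(f"unknown model: {name}")
        return self.providers[name](prompt)

# Stub providers stand in for real clients in this sketch.
router = ModelRouter(
    providers={
        "claude": lambda p: f"[claude] {p}",
        "gpt": lambda p: f"[gpt] {p}",
        "gemini": lambda p: f"[gemini] {p}",
    },
    default="claude",
)

print(router.complete("refactor this function"))         # routed to default
print(router.complete("refactor this function", "gpt"))  # one-arg switch
```

If a provider silently degrades, the monthly testing step becomes trivial: point `default` at an alternative and rerun the same workflow.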

Laurenzo said it herself: “6 months ago, Claude stood alone. Anthropic is far from alone at the capability tier Opus previously occupied.”

submitted by /u/Infinite-pheonix