Researchers ran 25,000 AI scientist experiments and discovered something that needs attention.
AI scientists are producing results without doing science.
68% of the time, the AI gathered evidence and then completely ignored it. In 71% of runs, the AI never updated its beliefs at all. Not once. Only 26% of the time did the AI revise a hypothesis when confronted with contradictory data.
A human scientist adapts. You approach a chemistry identification problem differently than you approach a simulation workflow. The AI doesn't. It runs the same undisciplined loop every time.
The researchers also showed that the most popular proposed fix, better scaffolding, does not work.
Everyone building AI research agents has focused on engineering better prompting frameworks, better tool routing, better agent architectures. ReAct, structured tool-calling, chain-of-thought, all of it.
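For anyone unfamiliar with the jargon: "scaffolding" means the control loop wrapped around the model, not the model itself. Here is a minimal sketch of a ReAct-style loop, with a stubbed-out model and a hypothetical `lookup` tool standing in for real LLM and search calls:

```python
# Minimal sketch of a ReAct-style scaffold: the agent alternates
# Thought -> Action -> Observation until it emits a final answer.
# fake_model and lookup are hypothetical stand-ins, not a real API.

def fake_model(history):
    # Stand-in for an LLM: looks something up once, then answers.
    # A real scaffold would send the full history to the model.
    if "Observation" in history:
        return "Final: boiling point of water is 100 C"
    return "Action: lookup('boiling point of water')"

def lookup(query):
    # Hypothetical tool; a real agent might call a search API here.
    return "Water boils at 100 C at sea level."

def react_loop(question, max_steps=5):
    history = f"Question: {question}"
    for _ in range(max_steps):
        step = fake_model(history)
        if step.startswith("Final:"):
            return step[len("Final:"):].strip()
        # Parse the action, run the tool, append the observation.
        query = step.split("('")[1].split("')")[0]
        history += f"\nObservation: {lookup(query)}"
    return "gave up"

print(react_loop("What is the boiling point of water?"))
```

The point of the study is that none of this loop machinery forces the agent to actually use the observations it collects, which is exactly the failure the numbers above describe.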