Competing with AI Scientists: Agent-Driven Approach to Astrophysics Research

arXiv cs.AI / 4/14/2026

Key Points

  • The paper proposes an agent-driven, multi-agent workflow (Cmbagent) that autonomously generates ideas, writes and executes code, evaluates outputs, and iteratively refines scientific parameter-inference pipelines.
  • Using the FAIR Universe Weak Lensing Uncertainty Challenge as a time-constrained case study, the authors show that fully autonomous runs initially lag expert performance but improve significantly with human intervention.
  • With semi-autonomous operation plus human guidance, the team achieved a first-place result in the competition, indicating agentic systems can compete with expert solutions in practice.
  • The final pipeline combines parameter-efficient convolutional neural networks with likelihood calibration across a known parameter grid and multiple regularization techniques.
  • The authors conclude that semi-autonomous agentic research workflows can scale to rapidly explore and construct inference pipelines for scientific problems.
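One way to picture the "likelihood calibration across a known parameter grid" step is the minimal sketch below. It is not the authors' pipeline. It assumes that a network has already compressed each simulated map into a low-dimensional summary vector, that the summaries at each grid point are roughly Gaussian, and that the prior over the grid is flat. The function names, array shapes, and synthetic data are all hypothetical.

```python
import numpy as np


def calibrate_grid(summaries_by_point):
    """Estimate the summary mean and covariance at every grid point.

    summaries_by_point has shape (n_grid, n_sims, n_dim): for each known
    parameter value, the summaries of n_sims simulations (assumed layout).
    """
    means = summaries_by_point.mean(axis=1)
    covs = np.stack(
        [np.atleast_2d(np.cov(s, rowvar=False)) for s in summaries_by_point]
    )
    return means, covs


def grid_posterior(observed, means, covs):
    """Normalized posterior weights over the grid (Gaussian likelihood, flat prior)."""
    log_like = np.empty(len(means))
    for i, (mu, cov) in enumerate(zip(means, covs)):
        diff = observed - mu
        _, logdet = np.linalg.slogdet(cov)
        log_like[i] = -0.5 * (diff @ np.linalg.solve(cov, diff) + logdet)
    log_like -= log_like.max()  # subtract the max for numerical stability
    weights = np.exp(log_like)
    return weights / weights.sum()


def posterior_mean_std(grid, weights):
    """Weighted mean and standard deviation of each parameter over the grid."""
    mean = weights @ grid
    var = weights @ (grid - mean) ** 2
    return mean, np.sqrt(var)


# Synthetic demo: a 5 x 5 grid of two parameters, summary = parameters + noise.
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 5)
grid = np.stack(np.meshgrid(axis, axis, indexing="ij"), axis=-1).reshape(-1, 2)
sims = grid[:, None, :] + 0.05 * rng.standard_normal((len(grid), 200, 2))

means, covs = calibrate_grid(sims)
weights = grid_posterior(grid[12], means, covs)  # "observe" the central grid point
print(posterior_mean_std(grid, weights))
```

The point of the design is that the grid of known parameters turns uncertainty estimation into a lookup: the spread of the network's outputs is measured directly at each grid point instead of being predicted by the network itself.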

Abstract

We present an agent-driven approach to the construction of parameter inference pipelines for scientific data analysis. Our method leverages a multi-agent system, Cmbagent (the analysis system of the AI scientist Denario), in which specialized agents collaborate to generate research ideas, write and execute code, evaluate results, and iteratively refine the overall pipeline. As a case study, we apply this approach to the FAIR Universe Weak Lensing Uncertainty Challenge, a competition under time constraints focused on robust cosmological parameter inference with realistic observational uncertainties. While the fully autonomous exploration initially did not reach expert-level performance, the integration of human intervention enabled our agent-driven workflow to achieve a first-place result in the challenge. This demonstrates that semi-autonomous agentic systems can compete with, and in some cases surpass, expert solutions. We describe our workflow in detail, including both the autonomous and semi-autonomous exploration by Cmbagent. Our final inference pipeline utilizes parameter-efficient convolutional neural networks, likelihood calibration over a known parameter grid, and multiple regularization techniques. Our results suggest that agent-driven research workflows can provide a scalable framework to rapidly explore and construct pipelines for inference problems.
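The generate, execute, evaluate, and refine cycle described in the abstract can be sketched as a plain loop. This is a schematic under stated assumptions, not Cmbagent's API: `refine_loop`, `Attempt`, the `propose` and `execute` callables, and the optional `human_hint` argument are all hypothetical names, and the toy scores are invented placeholders rather than results from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    idea: str
    score: float


def refine_loop(
    propose: Callable[[List[Attempt], Optional[str]], str],
    execute: Callable[[str], float],
    n_rounds: int,
    human_hint: Optional[str] = None,
) -> Tuple[Attempt, List[Attempt]]:
    """Run propose -> execute -> evaluate for n_rounds and keep the best attempt."""
    history: List[Attempt] = []
    best: Optional[Attempt] = None
    for _ in range(n_rounds):
        idea = propose(history, human_hint)  # idea-generation step
        score = execute(idea)                # write/run code and score the result
        attempt = Attempt(idea, score)
        history.append(attempt)              # later proposals can see past attempts
        if best is None or score > best.score:
            best = attempt
    if best is None:
        raise ValueError("n_rounds must be at least 1")
    return best, history


# Toy stand-ins (placeholder scores, purely illustrative).
TOY_SCORES = {"idea A": 0.4, "idea B": 0.6, "idea C": 0.9}


def toy_propose(history: List[Attempt], hint: Optional[str]) -> str:
    tried = {a.idea for a in history}
    if hint is not None and hint not in tried:
        return hint  # a human suggestion takes priority
    for idea in ("idea A", "idea B"):
        if idea not in tried:
            return idea
    return "idea B"


def toy_execute(idea: str) -> float:
    return TOY_SCORES[idea]


best_auto, _ = refine_loop(toy_propose, toy_execute, n_rounds=2)
best_semi, _ = refine_loop(toy_propose, toy_execute, n_rounds=2, human_hint="idea C")
print(best_auto.idea, best_semi.idea)
```

In this toy setup the autonomous run only ever explores the ideas its proposer knows about, while the semi-autonomous run can be steered to an option outside that set, which mirrors the distinction between the two modes the abstract draws.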