ResearchEVO: An End-to-End Framework for Automated Scientific Discovery and Documentation

arXiv cs.AI / 4/8/2026

Key Points

  • ResearchEVO proposes an end-to-end pipeline that automates scientific discovery and then retrospectively explains and documents the result, mirroring the two-stage pattern (undirected discovery, then explanation) seen in many scientific breakthroughs.
  • In the Evolution phase, LLM-guided bi-dimensional co-evolution searches the space of code implementations by fitness alone, optimizing both algorithmic logic and overall architecture without requiring any understanding of the solutions it produces.
  • In the Writing phase, the system generates publication-ready research papers using sentence-level retrieval-augmented generation with anti-hallucination verification and automated experiment design.
  • The framework is claimed to be the first to jointly cover principled algorithm evolution and literature-grounded scientific writing in a single pipeline.
  • Experiments on quantum error correction (with real Google quantum hardware data) and physics-informed neural networks reportedly yielded newly discovered, human-interpretable mechanisms and produced compilable LaTeX manuscripts with zero fabricated citations.
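
The fitness-only search described in the Evolution phase can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the paper's method: each candidate is reduced to a pair of integers standing in for the "logic" and "architecture" dimensions, `toy_mutate` stands in for an LLM proposing a code change, and `toy_fitness` is an invented objective. All names here are hypothetical.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical stand-in for a code variant: (logic_gene, architecture_gene).
Candidate = Tuple[int, int]


def evolve(
    fitness: Callable[[Candidate], float],
    mutate: Callable[[Candidate, random.Random], Candidate],
    seed_population: List[Candidate],
    generations: int = 50,
    survivors: int = 4,
    seed: int = 0,
) -> Candidate:
    """Fitness-only search: candidates are ranked by score, never inspected."""
    rng = random.Random(seed)
    population = list(seed_population)
    for _ in range(generations):
        # Keep the top scorers unchanged (elitism), then refill with mutated copies.
        population.sort(key=fitness, reverse=True)
        parents = population[:survivors]
        n_children = len(population) - len(parents)
        children = [mutate(rng.choice(parents), rng) for _ in range(n_children)]
        population = parents + children
    return max(population, key=fitness)


def toy_mutate(candidate: Candidate, rng: random.Random) -> Candidate:
    """Assumed stand-in for an LLM proposal: perturb one of the two dimensions."""
    logic, arch = candidate
    if rng.random() < 0.5:
        return (logic + rng.choice([-1, 1]), arch)
    return (logic, arch + rng.choice([-1, 1]))


def toy_fitness(candidate: Candidate) -> float:
    """Invented objective whose optimum sits at (7, -3)."""
    logic, arch = candidate
    return -float((logic - 7) ** 2 + (arch + 3) ** 2)


best = evolve(toy_fitness, toy_mutate, [(0, 0)] * 8)
print(best, toy_fitness(best))
```

Because the best candidates are carried over unchanged each generation, the returned score can never fall below that of the starting population. The loop only ever calls `fitness`, which is the sense in which the search needs no understanding of what a candidate does.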

Abstract

An important recurring pattern in scientific breakthroughs is a two-stage process: an initial phase of undirected experimentation that yields an unexpected finding, followed by a retrospective phase that explains why the finding works and situates it within existing theory. We present ResearchEVO, an end-to-end framework that computationally instantiates this discover-then-explain paradigm. The Evolution Phase employs LLM-guided bi-dimensional co-evolution -- simultaneously optimizing both algorithmic logic and overall architecture -- to search the space of code implementations purely by fitness, without requiring any understanding of the solutions it produces. The Writing Phase then takes the best-performing algorithm and autonomously generates a complete, publication-ready research paper through sentence-level retrieval-augmented generation with explicit anti-hallucination verification and automated experiment design. To our knowledge, ResearchEVO is the first system to cover this full pipeline end to end: no prior work jointly performs principled algorithm evolution and literature-grounded scientific documentation. We validate the framework on two cross-disciplinary scientific problems -- Quantum Error Correction using real Google quantum hardware data, and Physics-Informed Neural Networks -- where the Evolution Phase discovered human-interpretable algorithmic mechanisms that had not been previously proposed in the respective domain literatures. In both cases, the Writing Phase autonomously produced compilable LaTeX manuscripts that correctly grounded these blind discoveries in existing theory via RAG, with zero fabricated citations.
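
One simple form that sentence-level citation checking could take is sketched below. The abstract does not describe the actual verification procedure, so everything here is an assumption: the draft is taken to be LaTeX using `\cite{...}`, the retrieved literature is a dictionary keyed by citation key, and a citation counts as unverified when its key is absent from that dictionary. The keys `ref_a`, `ref_b`, and `ref_missing` are placeholders, not real references.

```python
import re
from typing import Dict, List, Set

# Matches \cite{key} or \cite{key1, key2}; the LaTeX form is an assumption.
CITE_PATTERN = re.compile(r"\\cite\{([^}]*)\}")


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break at whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def unverified_citations(draft: str, retrieved: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each sentence to the citation keys that are absent from the retrieved set."""
    problems: Dict[str, Set[str]] = {}
    for sentence in split_sentences(draft):
        keys = {
            key.strip()
            for group in CITE_PATTERN.findall(sentence)
            for key in group.split(",")
        }
        missing = {key for key in keys if key and key not in retrieved}
        if missing:
            problems[sentence] = missing
    return problems


# Placeholder retrieval results: citation key -> source snippet.
retrieved = {"ref_a": "snippet from source A", "ref_b": "snippet from source B"}
draft = (
    r"Prior work studies this setting \cite{ref_a}. "
    r"A later claim cites two sources \cite{ref_b, ref_missing}."
)
problems = unverified_citations(draft, retrieved)
for sentence, missing in problems.items():
    print(sorted(missing), "in:", sentence)
```

A check like this only catches citation keys with no retrieved source behind them. Confirming that a cited source actually supports the sentence would need a further comparison step, which this sketch does not attempt.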