FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration

arXiv cs.AI / 4/1/2026


Key Points

  • FlowPIE is presented as a tightly coupled retrieval-generation framework for scientific idea generation that co-evolves literature exploration and idea generation rather than using a static retrieval-then-generation pipeline.
  • It expands literature trajectories with a flow-guided Monte Carlo Tree Search (MCTS) inspired by GFlowNets, and builds an initial idea population using idea-quality scores from an LLM-based generative reward model (GRM) as the guiding signal.
  • FlowPIE then evolves ideas at test time through selection, crossover, and mutation, using GRM-based fitness scores and an isolation-island paradigm to incorporate cross-domain knowledge and reduce homogeneity.
  • The work reports evaluations showing consistently higher novelty, feasibility, and diversity than strong LLM-based and agent-based baselines, and claims the approach supports reward scaling during test time.
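
To make the evolution step in the key points more concrete, here is a minimal sketch of a generic island-model evolutionary loop in Python. It is not FlowPIE's implementation: the `Idea` representation (a list of tokens), the `evolve_islands` function, the one-point crossover, the ring migration schedule, and every parameter name are assumptions chosen for illustration, and the `fitness` callable only stands in for a GRM score.

```python
import random
from typing import Callable, List

Idea = List[str]  # toy stand-in (assumption): an "idea" is a list of concept tokens


def evolve_islands(
    islands: List[List[Idea]],
    fitness: Callable[[Idea], float],  # hypothetical stand-in for a GRM score
    vocabulary: List[str],
    generations: int = 10,
    migrate_every: int = 5,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> List[List[Idea]]:
    """Evolve several non-empty populations separately, with rare migration."""
    rng = random.Random(seed)
    # Copy so the caller's populations are left untouched.
    islands = [[list(idea) for idea in pop] for pop in islands]

    for gen in range(1, generations + 1):
        for i, pop in enumerate(islands):
            # Selection: the fitter half survives (at least two parents).
            ranked = sorted(pop, key=fitness, reverse=True)
            parents = ranked[: max(2, len(pop) // 2)]
            children: List[Idea] = []
            while len(parents) + len(children) < len(pop):
                a, b = rng.sample(parents, 2)
                # Crossover: one-point splice of two parents.
                cut = rng.randint(0, min(len(a), len(b)))
                child = a[:cut] + b[cut:]
                # Mutation: occasionally swap one token for a random one.
                if child and rng.random() < mutation_rate:
                    child[rng.randrange(len(child))] = rng.choice(vocabulary)
                children.append(child)
            islands[i] = parents + children

        # Migration: islands stay isolated except every `migrate_every`
        # generations, when each island's best idea is copied to its
        # neighbour in a ring, replacing that neighbour's weakest idea.
        if gen % migrate_every == 0 and len(islands) > 1:
            bests = [max(pop, key=fitness) for pop in islands]
            for i, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
                pop[worst] = list(bests[i - 1])
    return islands
```

Keeping the populations separate for most generations lets each one drift toward different regions of the search space, and occasional migration then mixes material across islands. In an LLM setting, crossover and mutation would presumably be prompt-driven rewrites rather than token splices; the token version is used here only so the sketch runs without a model.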

Abstract

Scientific idea generation (SIG) is critical to AI-driven autonomous research, yet existing approaches are often constrained by a static retrieval-then-generation paradigm, leading to homogeneous and insufficiently divergent ideas. In this work, we propose FlowPIE, a tightly coupled retrieval-generation framework that treats literature exploration and idea generation as a co-evolving process. FlowPIE expands literature trajectories via a flow-guided Monte Carlo Tree Search (MCTS) inspired by GFlowNets, using the quality of current ideas assessed by an LLM-based generative reward model (GRM) as a supervised signal to guide adaptive retrieval and construct a diverse, high-quality initial population. Based on this population, FlowPIE models idea generation as a test-time idea evolution process, applying selection, crossover, and mutation with the isolation island paradigm and GRM-based fitness computation to incorporate cross-domain knowledge. It effectively mitigates the information cocoons arising from over-reliance on parametric knowledge and static literature. Extensive evaluations demonstrate that FlowPIE consistently produces ideas with higher novelty, feasibility and diversity compared to strong LLM-based and agent-based frameworks, while enabling reward scaling during test time.
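
The abstract does not spell out how the flow-guided search selects which literature to expand. One common reading of "inspired by GFlowNets" is that branches are sampled in proportion to an estimated reward flow instead of always taking the highest-scoring branch. The sketch below illustrates only that general idea under stated assumptions: the `Node` class, `sample_child`, `backup`, `explore`, the running-mean flow estimate, and the `reward_fn` callable (a stand-in for a GRM score of ideas derived from a trajectory) are all hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    paper_id: str                 # hypothetical identifier for a retrieved paper
    flow: float = 1.0             # optimistic initial estimate of downstream reward
    visits: int = 0
    children: List["Node"] = field(default_factory=list)


def sample_child(node: Node, rng: random.Random) -> Node:
    # Flow-proportional sampling: better branches are visited more often,
    # but weaker branches keep a non-zero chance of being explored.
    weights = [max(child.flow, 1e-8) for child in node.children]
    return rng.choices(node.children, weights=weights, k=1)[0]


def backup(path: List[Node], reward: float) -> None:
    # Update each node on the path with a running mean of observed rewards.
    for node in path:
        node.visits += 1
        node.flow += (reward - node.flow) / node.visits


def explore(
    root: Node,
    reward_fn: Callable[[List[Node]], float],  # assumption: scores a full trajectory
    iterations: int,
    seed: int = 0,
) -> None:
    rng = random.Random(seed)
    for _ in range(iterations):
        # Walk from the root to a leaf, sampling by flow at each step.
        path = [root]
        while path[-1].children:
            path.append(sample_child(path[-1], rng))
        backup(path, reward_fn(path))
```

Compared with a purely greedy or UCB-style argmax, sampling in proportion to flow trades some exploitation for coverage, which matches the stated goal of a diverse initial population. A full tree search would also grow the tree by retrieving new papers at the leaves; that step is omitted here because the source gives no detail on it.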