
MoRI: Learning Motivation-Grounded Reasoning for Scientific Ideation in Large Language Models

arXiv cs.CL / 3/20/2026


Key Points

  • MoRI introduces a framework that enables LLMs to explicitly learn the reasoning process from research motivations to scientific methodologies for ideation.
  • The approach first applies supervised fine-tuning so the model generates a research motivation from a given context, then trains with reinforcement learning under a composite reward: an entropy-aware information gain term elicits high-complexity technical details grounded in ground-truth methodologies, and a contrastive semantic gain term keeps the reasoning aligned with scientifically valid solutions.
  • Empirical results show that MoRI significantly outperforms strong commercial LLMs and complex agentic baselines across multiple dimensions, including novelty, technical rigor, and feasibility.
  • The authors will release the code on GitHub to support reproducibility and broader adoption.

Abstract

Scientific ideation aims to propose novel solutions within a given scientific context. Existing LLM-based agentic approaches emulate human research workflows yet model scientific reasoning inadequately, producing surface-level conceptual recombinations that lack technical depth and scientific grounding. To address this issue, we propose MoRI (Motivation-grounded Reasoning for Scientific Ideation), a framework that enables LLMs to explicitly learn the reasoning process from research motivations to methodologies. The base LLM is initialized via supervised fine-tuning to generate a research motivation from a given context, and is subsequently trained under a composite reinforcement learning reward that approximates scientific rigor: (1) an entropy-aware information gain encourages the model to uncover and elaborate high-complexity technical details grounded in ground-truth methodologies, and (2) a contrastive semantic gain constrains the reasoning trajectory to remain conceptually aligned with scientifically valid solutions. Empirical results show that MoRI significantly outperforms strong commercial LLMs and complex agentic baselines across multiple dimensions, including novelty, technical rigor, and feasibility. The code will be made available on GitHub (https://github.com/ECNU-Text-Computing/IdeaGeneration).
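
To make the composite reward concrete, here is a minimal toy sketch of how its two terms might combine. This is an illustration only, not the paper's actual reward: the function names, the token-entropy proxy for "information gain," the Jaccard-overlap proxy for semantic similarity, and the `alpha`/`beta` weights are all assumptions for demonstration; the real system would score model outputs against ground-truth methodologies with learned or embedding-based signals.

```python
import math
from collections import Counter

def entropy(text: str) -> float:
    """Shannon entropy over whitespace tokens (a crude complexity proxy)."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity: a cheap stand-in for semantic similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def composite_reward(idea: str, gold: str, distractor: str,
                     alpha: float = 0.5, beta: float = 0.5) -> float:
    """Toy composite reward (hypothetical, for illustration):
    - info_gain: the idea's token entropy, gated by its overlap with the
      ground-truth methodology (rewards elaborated, grounded detail);
    - semantic_gain: similarity to the gold methodology minus similarity
      to a contrastive, scientifically invalid alternative."""
    info_gain = entropy(idea) * jaccard(idea, gold)
    semantic_gain = jaccard(idea, gold) - jaccard(idea, distractor)
    return alpha * info_gain + beta * semantic_gain
```

Under this sketch, an idea that elaborates terms from the gold methodology scores above one that drifts toward the distractor, mirroring the intended pull of the two reward terms.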