Mol-Debate: Multi-Agent Debate Improves Structural Reasoning in Molecular Design

arXiv cs.AI / 4/23/2026


Key Points

  • Mol-Debate addresses a gap in text-guided molecular design by improving how systems align sequential natural-language instructions with non-linear molecular structures while respecting strict chemical constraints.
  • Instead of relying on a mostly one-shot generation pipeline, it uses an iterative generate–debate–refine loop with multi-perspective critique to reconcile semantic intent and structural feasibility.
  • The method introduces perspective-oriented orchestration to handle issues such as developer–debater conflict, global–local structural reasoning, and integrating static and dynamic information during refinement.
  • Experiments on ChEBI-20 and S$^2$-Bench show state-of-the-art results, reporting 59.82% exact match on ChEBI-20 and a 50.52% weighted success rate on S$^2$-Bench.
  • The authors provide an open-source implementation at the linked GitHub repository.
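
As a rough illustration of how an exact-match percentage like the ChEBI-20 figure above can be computed, here is a minimal sketch. It is not the authors' evaluation script: the function name `exact_match_rate` and the `canonicalize` parameter are hypothetical, and the default canonicalizer only strips whitespace. A real evaluation would presumably swap in a cheminformatics canonicalizer so that different spellings of the same molecule compare equal.

```python
from typing import Callable, Sequence


def exact_match_rate(
    predictions: Sequence[str],
    references: Sequence[str],
    canonicalize: Callable[[str], str] = str.strip,  # placeholder, not chemistry-aware
) -> float:
    """Return the percentage of predictions whose canonical form equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Count positions where the canonicalized strings are identical.
    hits = sum(
        canonicalize(pred) == canonicalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * hits / len(references)
```

With this definition, a score of 59.82% would mean that roughly six in ten generated molecules are identical to the reference molecule after canonicalization.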

Abstract

Text-guided molecular design is a key capability for AI-driven drug discovery, yet it remains challenging to map sequential natural-language instructions onto non-linear molecular structures under strict chemical constraints. Most existing approaches, including RAG, CoT prompting, and fine-tuning or RL, emphasize a small set of ad-hoc reasoning perspectives implemented in a largely one-shot generation pipeline. In contrast, real-world drug discovery relies on dynamic, multi-perspective critique and iterative refinement to reconcile semantic intent with structural feasibility. Motivated by this, we propose Mol-Debate, a generation paradigm that enables such dynamic reasoning through an iterative generate-debate-refine loop. We further characterize the key challenges in this paradigm, namely developer-debater conflict, global-local structural reasoning, and static-dynamic integration, and address them through perspective-oriented orchestration. Experiments demonstrate that Mol-Debate achieves state-of-the-art performance against strong general and chemical baselines, reaching 59.82% exact match on ChEBI-20 and a 50.52% weighted success rate on S$^2$-Bench. Our code is available at https://github.com/wyuzh/Mol-Debate.
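
The generate-debate-refine loop described in the abstract can be sketched as follows. This is a minimal illustration of the control flow only, not the Mol-Debate implementation: `Critique`, `generate_debate_refine`, `max_rounds`, and the toy generator and critics are all hypothetical names introduced here, and in a real system the generator and debaters would be model calls rather than string checks.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Critique:
    perspective: str  # which viewpoint produced this critique
    accept: bool      # True if this debater has no objection
    feedback: str     # what should change, if anything


# (instruction, feedback from the previous round) -> candidate molecule string
Generator = Callable[[str, List[str]], str]
# (instruction, candidate) -> critique from one perspective
Debater = Callable[[str, str], Critique]


def generate_debate_refine(
    instruction: str,
    generate: Generator,
    debaters: Sequence[Debater],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Return the final candidate and the number of refinement rounds used."""
    candidate = generate(instruction, [])  # initial one-shot proposal
    for round_idx in range(max_rounds):
        # Debate: every perspective critiques the current candidate.
        critiques = [debate(instruction, candidate) for debate in debaters]
        objections = [c for c in critiques if not c.accept]
        if not objections:
            return candidate, round_idx  # all perspectives agree
        # Refine: regenerate with the objections as guidance.
        feedback = [f"{c.perspective}: {c.feedback}" for c in objections]
        candidate = generate(instruction, feedback)
    return candidate, max_rounds  # round budget exhausted


# Toy stand-ins so the loop can be run end to end (assumptions, not real models).
def toy_generator(instruction: str, feedback: List[str]) -> str:
    if any("carboxylic" in item for item in feedback):
        return "CC(=O)O"  # acetic acid
    return "CCO"          # ethanol, a plausible first guess


def semantic_critic(instruction: str, candidate: str) -> Critique:
    ok = "C(=O)O" in candidate
    return Critique("semantic", ok, "" if ok else "missing carboxylic acid group")


def syntax_critic(instruction: str, candidate: str) -> Critique:
    ok = candidate.count("(") == candidate.count(")")
    return Critique("syntax", ok, "" if ok else "unbalanced parentheses")
```

Calling `generate_debate_refine("a two-carbon carboxylic acid", toy_generator, [semantic_critic, syntax_critic])` lets the semantic critic reject the first proposal and steer the second one. The round budget is a design choice in this sketch: it bounds cost when debaters never reach agreement.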