AI Navigate

Protein Design with Agent Rosetta: A Case Study for Specialized Scientific Agents

arXiv cs.AI / 3/18/2026

📰 News · Tools & Practical Usage · Models & Research

Key Points

  • Agent Rosetta combines an LLM agent with a Rosetta-based environment to enable protein design involving canonical and non-canonical residues.
  • The system iteratively refines designs to meet user-defined objectives by combining LLM reasoning with Rosetta's physics-based modeling.
  • Evaluation shows Agent Rosetta matches specialized models and expert baselines for canonical amino acids and extends to non-canonical residues where ML approaches fail.
  • The study highlights that prompt engineering alone often fails to generate Rosetta actions, underscoring the importance of environment design for integrating LLMs with specialized scientific software.

Abstract

Large language models (LLMs) are capable of emulating reasoning and using tools, creating opportunities for autonomous agents that execute complex scientific tasks. Protein design provides a natural testbed: although machine learning (ML) methods achieve strong results, these are largely restricted to canonical amino acids and narrow objectives, leaving an unfilled need for a generalist tool for broad design pipelines. We introduce Agent Rosetta, an LLM agent paired with a structured environment for operating Rosetta, the leading physics-based heteropolymer design software, capable of modeling non-canonical building blocks and geometries. Agent Rosetta iteratively refines designs to achieve user-defined objectives, combining LLM reasoning with Rosetta's generality. We evaluate Agent Rosetta on design with canonical amino acids, matching specialized models and expert baselines, and with non-canonical residues -- where ML approaches fail -- achieving comparable performance. Critically, prompt engineering alone often fails to generate Rosetta actions, demonstrating that environment design is essential for integrating LLM agents with specialized software. Our results show that properly designed environments enable LLM agents to make scientific software accessible while matching specialized tools and human experts.
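
The abstract describes an agent that proposes actions, an environment that executes them, and an iterative loop that refines a design toward an objective. The sketch below is a toy illustration of that pattern only, not the paper's implementation: it makes no Rosetta calls, and every name in it (`Mutate`, `ToyDesignEnv`, `toy_score`, `stub_agent`, `refine`) is hypothetical. A random proposer stands in for the LLM, a mismatch count stands in for a physics-based score, and only the 20 canonical residues are modeled. The point it tries to show is why a structured environment helps: malformed actions are rejected with feedback instead of being executed.

```python
from __future__ import annotations

import random
from dataclasses import dataclass

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical amino acids (one-letter codes)


@dataclass(frozen=True)
class Mutate:
    """Hypothetical structured action: set one position to one residue."""
    position: int  # 0-based index into the sequence
    residue: str   # one-letter code, expected to be in ALPHABET


def toy_score(sequence: str, target: str) -> int:
    """Stand-in for a physics-based score: number of mismatches (lower is better)."""
    return sum(a != b for a, b in zip(sequence, target))


class ToyDesignEnv:
    """Validates each action before applying it, so malformed proposals never run."""

    def __init__(self, sequence: str, target: str):
        self.sequence = sequence
        self.target = target

    def step(self, action: Mutate) -> tuple[bool, str]:
        # Invalid actions return (False, feedback) and leave the design unchanged.
        if not 0 <= action.position < len(self.sequence):
            return False, f"position {action.position} out of range"
        if action.residue not in ALPHABET:
            return False, f"unknown residue {action.residue!r}"
        candidate = (self.sequence[:action.position] + action.residue
                     + self.sequence[action.position + 1:])
        # Keep the mutation only if it does not worsen the score.
        if toy_score(candidate, self.target) <= toy_score(self.sequence, self.target):
            self.sequence = candidate
            return True, "accepted"
        return True, "rejected: score worsened"


def stub_agent(env: ToyDesignEnv, rng: random.Random) -> Mutate:
    """Placeholder for the LLM: proposes a random single-residue mutation."""
    return Mutate(rng.randrange(len(env.sequence)), rng.choice(ALPHABET))


def refine(env: ToyDesignEnv, max_steps: int = 5000, seed: int = 0) -> int:
    """Propose and apply actions until the objective is met or the budget runs out."""
    rng = random.Random(seed)
    for step in range(max_steps):
        if toy_score(env.sequence, env.target) == 0:
            return step  # objective reached after this many steps
        env.step(stub_agent(env, rng))
    return max_steps
```

In a real system the feedback string returned by `step` would be passed back to the agent so it can correct its next proposal; the stub here ignores it. Because the environment only keeps mutations that do not worsen the score, the design never gets worse under this toy objective, whatever the proposer suggests.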