El Agente Forjador: Task-Driven Agent Generation for Quantum Simulation

arXiv cs.AI / April 17, 2026


Key Points

  • The paper introduces El Agente Forjador, a multi-agent framework that autonomously analyzes, generates, validates, and reuses computational tools to speed up scientific workflows.
  • It uses a four-stage loop (tool analysis → tool generation → task execution → iterative solution evaluation) to adapt tool creation to new domains and changing libraries.
  • Experiments on 24 quantum-chemistry and quantum-dynamics tasks across five coding-agent setups compare three modes: task-specific zero-shot tool generation, curriculum-based tool reuse, and baseline direct solving.
  • The authors report that tool generation and reuse improve accuracy versus the baseline, and that reusing toolsets produced by stronger agents can lower API cost while significantly improving weaker agents’ solution quality.
  • Case studies show that tools created for different domains can be combined to address hybrid quantum simulation tasks, supporting a shift toward task-defined agent capabilities.
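The four-stage loop described above can be sketched as a toy control flow. Everything here is illustrative: the function names, the representation of tasks as capability lists, and the stand-in tool synthesis are assumptions for exposition, not the paper's implementation (which delegates each stage to LLM coding agents).

```python
def analyze(required, toolset):
    # Stage 1: tool analysis — find capabilities the current toolset lacks.
    return [cap for cap in required if cap not in toolset]

def generate(missing):
    # Stage 2: tool generation — stand-in for LLM-driven code synthesis.
    return {cap: (lambda c=cap: f"{c}-done") for cap in missing}

def execute(required, toolset):
    # Stage 3: task execution — apply each required tool to the task.
    return [toolset[cap]() for cap in required]

def evaluate(required, results):
    # Stage 4: iterative solution evaluation — verify every capability ran.
    return len(results) == len(required)

def forge_and_solve(required, toolset, max_iters=3):
    """Toy four-stage loop: forge missing tools, solve, re-check, repeat."""
    results = []
    for _ in range(max_iters):
        toolset.update(generate(analyze(required, toolset)))
        results = execute(required, toolset)
        if evaluate(required, results):
            break
    return results, toolset
```

Note that `toolset` is returned alongside the solution: this mirrors the paper's reuse mode, where a toolset built on one curriculum (possibly by a stronger agent) seeds later tasks rather than being regenerated from scratch.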

Abstract

AI for science promises to accelerate the discovery process. The advent of large language models (LLMs) and agentic workflows makes it possible to expedite a growing range of scientific tasks. However, most of the current generation of agentic systems depend on static, hand-curated toolsets that hinder adaptation to new domains and evolving libraries. We present El Agente Forjador, a multi-agent framework in which universal coding agents autonomously forge, validate, and reuse computational tools through a four-stage workflow of tool analysis, tool generation, task execution, and iterative solution evaluation. Evaluated across 24 tasks spanning quantum chemistry and quantum dynamics on five coding agent setups, we compare three operating modes: zero-shot generation of tools per task, reuse of a curriculum-built toolset, and direct problem-solving with the coding agents as the baseline. We find that our tool generation and reuse framework consistently improves accuracy over the baseline. We also show that reusing a toolset built by a stronger coding agent can reduce API cost and substantially raise the solution quality of weaker coding agents. Case studies further demonstrate that tools forged for different domains can be combined to solve hybrid tasks. Taken together, these results show that LLM-based agents can use their scientific knowledge and coding capabilities to autonomously build reusable scientific tools, pointing toward a paradigm in which agent capabilities are defined by the tasks they are designed to solve rather than by explicitly engineered implementations.