
HyEvo: Self-Evolving Hybrid Agentic Workflows for Efficient Reasoning

arXiv cs.AI / 3/23/2026


Key Points

  • HyEvo targets the inefficiencies of existing agentic workflows that rely on predefined operator libraries and homogeneous LLM-only pipelines.
  • It introduces a hybrid framework that combines probabilistic LLM nodes for semantic reasoning with deterministic code nodes for rule-based execution to reduce inference cost and latency.
  • HyEvo uses an LLM-driven multi-island evolutionary strategy with a reflect-then-generate mechanism to iteratively refine workflow topology and node logic via execution feedback.
  • Comprehensive experiments show HyEvo outperforming existing methods across diverse reasoning and coding benchmarks, reducing inference cost by up to 19× and execution latency by up to 16× compared with the state-of-the-art open-source baseline.
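The core idea behind the hybrid framework can be illustrated with a minimal sketch. All names here (`LLMNode`, `CodeNode`, `run_workflow`, the toy task) are illustrative assumptions, not the paper's actual API: the point is simply that deterministic steps run as plain code while only semantic steps invoke an LLM.

```python
# Hypothetical sketch of a hybrid workflow: probabilistic LLM nodes for
# semantic reasoning, deterministic code nodes for rule-based execution.
# Names and structure are illustrative, not taken from HyEvo itself.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class LLMNode:
    prompt_template: str
    def run(self, state: dict, llm: Callable[[str], str]) -> dict:
        # Semantic reasoning: delegate to the (costly) LLM call.
        state["text"] = llm(self.prompt_template.format(**state))
        return state

@dataclass
class CodeNode:
    fn: Callable[[dict], dict]
    def run(self, state: dict, llm: Callable[[str], str]) -> dict:
        # Rule-based execution: pure Python, no inference cost or latency.
        return self.fn(state)

def run_workflow(nodes: List, state: dict, llm: Callable[[str], str]) -> dict:
    # Execute the node sequence, threading shared state through each node.
    for node in nodes:
        state = node.run(state, llm)
    return state

# Toy usage: extract numbers deterministically, then call the LLM only once.
extract = CodeNode(
    fn=lambda s: {**s, "nums": [int(t) for t in s["text"].split() if t.isdigit()]}
)
summarize = LLMNode(prompt_template="Sum these numbers and explain: {nums}")

fake_llm = lambda prompt: f"LLM answer to: {prompt}"  # stand-in for a real model
out = run_workflow([extract, summarize], {"text": "add 3 and 4"}, fake_llm)
```

In this toy pipeline, the number extraction would otherwise have been a second LLM call; offloading it to a code node is the kind of saving the paper attributes its cost and latency reductions to.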

Abstract

Although agentic workflows have demonstrated strong potential for solving complex tasks, existing automated generation methods remain inefficient and underperform, as they rely on predefined operator libraries and homogeneous LLM-only workflows in which all task-level computation is performed through probabilistic inference. To address these limitations, we propose HyEvo, an automated workflow-generation framework that leverages heterogeneous atomic synthesis. HyEvo integrates probabilistic LLM nodes for semantic reasoning with deterministic code nodes for rule-based execution, offloading predictable operations from LLM inference and reducing inference cost and execution latency. To efficiently navigate the hybrid search space, HyEvo employs an LLM-driven multi-island evolutionary strategy with a reflect-then-generate mechanism, iteratively refining both workflow topology and node logic via execution feedback. Comprehensive experiments show that HyEvo consistently outperforms existing methods across diverse reasoning and coding benchmarks, while reducing inference cost and execution latency by up to 19× and 16×, respectively, compared to the state-of-the-art open-source baseline.
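The search strategy described in the abstract can also be sketched in miniature. This is a generic multi-island evolutionary loop with a reflect-then-generate hook; every detail (population sizes, migration schedule, the integer "candidates") is an assumption for illustration only, since in HyEvo the reflection and generation steps are LLM-driven and the candidates are workflows.

```python
# Hedged sketch of multi-island evolution with reflect-then-generate.
# All parameters and the toy candidate representation are illustrative.
import random

def evolve(init_pop, fitness, reflect_then_generate,
           islands=3, gens=5, migrate_every=2, seed=0):
    rng = random.Random(seed)
    # Each island holds its own small population, evolved mostly in isolation.
    pops = [[rng.choice(init_pop) for _ in range(4)] for _ in range(islands)]
    for g in range(gens):
        for pop in pops:
            pop.sort(key=fitness, reverse=True)
            feedback = fitness(pop[0])  # execution feedback on the elite
            # Reflect on the elite's feedback, then generate a new candidate.
            child = reflect_then_generate(pop[0], feedback, rng)
            pop[-1] = child             # replace the worst candidate
        if (g + 1) % migrate_every == 0:
            # Periodic migration: each island receives a neighbor's elite.
            elites = [pop[0] for pop in pops]
            for i, pop in enumerate(pops):
                pop[-1] = elites[(i + 1) % islands]
    return max((c for pop in pops for c in pop), key=fitness)

# Toy usage: candidates are integers, fitness favors values near 42, and
# "reflect-then-generate" is a random perturbation of the elite.
best = evolve(
    init_pop=[0, 10, 20],
    fitness=lambda c: -abs(c - 42),
    reflect_then_generate=lambda parent, fb, rng: parent + rng.choice([-3, -1, 1, 3]),
)
```

The island structure preserves diversity (each island explores its own region of the search space), while migration lets successful candidates spread, a standard trade-off in evolutionary search that the paper applies to the hybrid workflow space.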