AgentGA: Evolving Code Solutions in Agent-Seed Space

arXiv cs.AI / 4/17/2026

Key Points

  • AgentGA is a new framework that improves autonomous code generation by optimizing an “agent seed,” defined as the task prompt plus optional parent archives that initialize a fresh workspace.
  • Instead of directly editing code, AgentGA uses an outer evolutionary loop that searches over reusable starting conditions and spawns fresh long-horizon autonomous runs from reset workspaces.
  • The approach combines a population-level genetic algorithm with long-horizon agents: selection uses deterministic 1:1 elite tournaments, and operator allocation is adapted online by a modified Hedge controller.
  • On the 16-competition Weco-Kaggle Lite benchmark for tabular AutoML, AgentGA averages 74.52% on the “Exceeds % of Human” metric over the 10 benchmark runs reported, versus 54.15% for AIDE.
  • Results from 1,135 parent-child comparisons show that descendants leveraging parent archives outperform scratch starts, suggesting inherited artifacts make later autonomous runs more effective.
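
The outer loop described in the points above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `AgentSeed` layout, the `run_agent` and `make_child` callables, and the reading of a "1:1 elite tournament" as a parent-versus-own-child comparison in which ties keep the parent are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class AgentSeed:
    """Starting condition for one fresh autonomous run (hypothetical layout)."""
    task_prompt: str
    # Paths to inherited artifacts; an empty tuple means a scratch start.
    parent_archives: Tuple[str, ...] = ()


Scored = Tuple[AgentSeed, float]


def evolve(
    seeds: List[AgentSeed],
    run_agent: Callable[[AgentSeed], float],
    make_child: Callable[[AgentSeed, int], AgentSeed],
    generations: int,
) -> List[Scored]:
    """Search over seeds rather than editing code directly.

    run_agent stands in for a full long-horizon run launched from a reset
    workspace and returns a score where higher is better (assumption).
    """
    # Score the initial population: one fresh run per seed.
    population: List[Scored] = [(seed, run_agent(seed)) for seed in seeds]
    for generation in range(generations):
        survivors: List[Scored] = []
        for parent, parent_score in population:
            child = make_child(parent, generation)
            child_score = run_agent(child)
            # Assumed 1:1 elite tournament: the child replaces its parent
            # only if strictly better, so the outcome is deterministic.
            if child_score > parent_score:
                survivors.append((child, child_score))
            else:
                survivors.append((parent, parent_score))
        population = survivors
    return population
```

In a real system, `run_agent` would be the expensive step (an entire autonomous run), which is why the sketch calls it exactly once per seed and carries the score alongside the seed.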

Abstract

We present AgentGA, a framework that evolves autonomous code-generation runs by optimizing the agent seed: the task prompt plus optional parent archives that initialize a fresh workspace. The outer loop searches over these reusable starting conditions rather than editing code directly. Each generation launches a fresh autonomous run from a reset workspace, while selected parent archives provide inherited artifacts that descendants can inspect and reuse. AgentGA couples a population-level genetic algorithm with long-horizon agents; selection uses deterministic 1:1 elite tournaments and operator allocation is adapted online with a modified Hedge controller. We instantiate the approach for tabular AutoML on the 16-competition Weco-Kaggle Lite benchmark. On the 10 benchmark runs reported here, AgentGA averages 74.52% Exceeds % of Human versus 54.15% for AIDE. Across 1135 parent-child comparisons, descendants given parent archives outperform runs started from scratch, indicating that inherited artifacts improve later autonomous runs. These findings support agent-seed optimization as a practical design point for autonomous code-search systems.
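
The abstract does not spell out how the Hedge controller is modified, so the sketch below shows only the standard Hedge (exponential-weights) rule that such a controller would build on. The class name `HedgeAllocator`, the learning rate `eta`, the reward convention (higher is better, roughly in [0, 1]), and the operator names in the usage note are hypothetical.

```python
import math
import random
from typing import Dict, Iterable, Optional


class HedgeAllocator:
    """Standard Hedge over a fixed set of operators (not the paper's variant)."""

    def __init__(self, operators: Iterable[str], eta: float = 0.5) -> None:
        self.eta = eta
        # Weights are kept in log space to avoid overflow over many updates.
        self.log_weights: Dict[str, float] = {op: 0.0 for op in operators}

    def probabilities(self) -> Dict[str, float]:
        """Normalize the exponential weights into a sampling distribution."""
        largest = max(self.log_weights.values())
        exps = {op: math.exp(lw - largest) for op, lw in self.log_weights.items()}
        total = sum(exps.values())
        return {op: value / total for op, value in exps.items()}

    def sample(self, rng: Optional[random.Random] = None) -> str:
        """Draw one operator in proportion to its current probability."""
        source = rng if rng is not None else random
        probs = self.probabilities()
        names = list(probs)
        return source.choices(names, weights=[probs[n] for n in names], k=1)[0]

    def update(self, rewards: Dict[str, float]) -> None:
        """Multiplicative update: w_i <- w_i * exp(eta * reward_i)."""
        for op, reward in rewards.items():
            self.log_weights[op] += self.eta * reward
```

A caller might construct `HedgeAllocator(["mutate_prompt", "crossover", "scratch"])`, call `sample()` to choose the operator for each child, and call `update()` after each generation with per-operator rewards such as tournament win rates. Operators that keep producing winning children then receive a growing share of the budget.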