Think it, Run it: Autonomous ML pipeline generation via self-healing multi-agent AI

arXiv cs.AI / 5/1/2026

Key Points

  • The paper proposes a unified five-agent architecture that automatically generates end-to-end ML pipelines from datasets and natural-language goals.
  • It combines code-grounded RAG to understand microservices, an explainable hybrid recommender for selecting components, and an execution engine that builds and runs DAG-based pipelines.
  • A self-healing mechanism uses LLM-based error interpretation plus adaptive learning from prior execution history to improve robustness when failures occur.
  • In experiments covering 150 ML tasks across varied scenarios, the system reportedly achieves an 84.7% end-to-end pipeline success rate, outperforming baseline approaches.
  • The authors argue that tightly integrated intelligent modules (RAG + explainable recommendation + self-healing + adaptive learning) can outperform designs that treat these components as separate, isolated solutions.
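
The DAG execution and self-healing ideas in the points above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `run_pipeline`, `rule_based_repair`, and the toy steps are hypothetical, and a hard-coded rule stands in for the LLM-based error interpreter. The sketch runs steps in dependency order and, when a step fails, asks a repair function for a replacement step before retrying.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Optional

Step = Callable[[dict], dict]  # a step reads the shared context and returns updates


def run_pipeline(
    steps: dict[str, Step],
    deps: dict[str, set[str]],
    repair: Callable[[str, Exception, list], Optional[Step]],
    max_retries: int = 2,
) -> tuple[dict, list]:
    """Execute steps in topological order, retrying failed steps after repair."""
    context: dict = {}
    history: list = []  # execution log: (step name, attempt number, status)
    for name in TopologicalSorter(deps).static_order():
        step = steps[name]
        for attempt in range(max_retries + 1):
            try:
                context.update(step(context))
                history.append((name, attempt, "ok"))
                break
            except Exception as exc:
                history.append((name, attempt, f"error: {type(exc).__name__}"))
                fixed = repair(name, exc, history)
                if fixed is None or attempt == max_retries:
                    raise  # no repair available, or retry budget exhausted
                step = fixed
    return context, history


# Toy steps (hypothetical): the strict mean fails on a missing value.
def load(ctx: dict) -> dict:
    return {"rows": [1.0, None, 3.0]}


def mean_strict(ctx: dict) -> dict:
    return {"mean": sum(ctx["rows"]) / len(ctx["rows"])}


def mean_skip_missing(ctx: dict) -> dict:
    vals = [v for v in ctx["rows"] if v is not None]
    return {"mean": sum(vals) / len(vals)}


def rule_based_repair(name: str, exc: Exception, history: list) -> Optional[Step]:
    """Stand-in for an LLM error interpreter: map a known failure to a fix."""
    if name == "mean" and isinstance(exc, TypeError):
        return mean_skip_missing
    return None


steps = {"load": load, "mean": mean_strict}
deps = {"load": set(), "mean": {"load"}}  # each node maps to its predecessors
context, history = run_pipeline(steps, deps, rule_based_repair)
print(context["mean"], history)
```

In this sketch the `history` list is also passed to the repair function, which is one simple way a repair strategy could take earlier failures into account.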

Abstract

The purpose of our paper is to develop a unified multi-agent architecture that automates end-to-end machine learning (ML) pipeline generation from datasets and natural-language (NL) goals, improving efficiency, robustness and explainability. A five-agent system is proposed to handle profiling, intent parsing, microservice recommendation, Directed Acyclic Graph (DAG) construction and execution. It integrates code-grounded Retrieval-Augmented Generation (RAG) for microservice understanding, an explainable hybrid recommender combining multiple criteria, a self-healing mechanism using Large Language Model (LLM)-based error interpretation, and adaptive learning from execution history. The approach is evaluated on 150 ML tasks across diverse scenarios, where the system achieves an 84.7% end-to-end pipeline success rate, outperforming baseline methods. It also demonstrates improved robustness through self-healing and reduces workflow development time compared to manual construction. The study introduces a novel integration of code-grounded RAG, explainable recommendation, self-healing execution and adaptive learning within a single architecture, showing that tightly coupled intelligent components can outperform isolated solutions.
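
To make the idea of an explainable hybrid recommender with adaptive learning more concrete, here is a small sketch under stated assumptions. The abstract does not name the criteria or the scoring rule, so the three criteria (`semantic_match`, `past_success`, `schema_fit`), the weights, and the moving-average update are all hypothetical choices for illustration, not the authors' method. The sketch scores each candidate microservice as a weighted sum, reports the per-criterion contributions as the explanation, and updates a success estimate from an execution outcome.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    semantic_match: float  # 0..1, e.g. similarity between goal and service description
    past_success: float    # 0..1, estimated from execution history
    schema_fit: float      # 0..1, compatibility with the dataset profile


# Hypothetical weights; a real system would tune or learn these.
WEIGHTS = {"semantic_match": 0.5, "past_success": 0.3, "schema_fit": 0.2}


def score(c: Candidate, weights: dict[str, float] = WEIGHTS) -> tuple[float, dict[str, float]]:
    """Return the total score and each criterion's contribution to it."""
    parts = {k: w * getattr(c, k) for k, w in weights.items()}
    return sum(parts.values()), parts


def recommend(candidates: list[Candidate]) -> tuple[Candidate, str]:
    """Pick the top-scoring candidate and explain which criterion mattered most."""
    best = max(candidates, key=lambda c: score(c)[0])
    total, parts = score(best)
    top = max(parts, key=parts.get)
    return best, f"{best.name}: score {total:.2f}, largest contribution from {top}"


def update_success(prior: float, succeeded: bool, rate: float = 0.2) -> float:
    """Exponential moving average of outcomes: one simple form of adaptive learning."""
    return (1 - rate) * prior + rate * (1.0 if succeeded else 0.0)


candidates = [
    Candidate("scaler_a", semantic_match=0.9, past_success=0.4, schema_fit=0.5),
    Candidate("scaler_b", semantic_match=0.7, past_success=0.9, schema_fit=0.8),
]
best, explanation = recommend(candidates)
print(explanation)

# After a failed run, the chosen candidate's success estimate drops,
# which can change the ranking on the next request.
best.past_success = update_success(best.past_success, succeeded=False)
```

Keeping the per-criterion contributions alongside the total is what makes the ranking explainable in this sketch: the same numbers that decide the winner are the ones shown to the user.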