Autonoma: A Hierarchical Multi-Agent Framework for End-to-End Workflow Automation

arXiv cs.LG / 3/23/2026

💬 Opinion · Developer Stack & Infrastructure · Models & Research

Key Points

  • Autonoma introduces a hierarchical, multi-agent framework that translates open-ended natural language prompts into end-to-end workflows.
  • The architecture includes a Coordinator to validate user intent, a Planner to generate structured workflows, and a Supervisor that orchestrates modular agents for tasks such as web browsing, coding, and file management.
  • Separating orchestration logic from specialized execution provides robustness through active monitoring and error handling, and extensibility through plug-and-play agents that require no changes to the core engine; running within a secure LAN environment addresses data privacy and reliability concerns.
  • It accepts multi-modal input (text, voice, image, files), supports both English and Arabic, and reports a 97% task completion rate and a 98% successful agent handoff rate in evaluations.
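
The Coordinator, Planner, and Supervisor tiers listed above can be pictured as a three-stage pipeline. The following is a minimal sketch under stated assumptions, not the paper's implementation: the `Step` type, the function names, the hard-coded plan, and the toy agents are all hypothetical and chosen only to illustrate the flow from prompt to executed workflow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    agent: str    # name of the agent that should handle this step (hypothetical)
    payload: str  # instruction passed to that agent


def coordinator(prompt: str) -> str:
    """Validate user intent. Assumption: validation only rejects empty prompts."""
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("empty prompt")
    return cleaned


def planner(intent: str) -> List[Step]:
    """Produce a structured workflow. Assumption: a fixed three-step toy plan."""
    return [
        Step(agent="browser", payload=f"search: {intent}"),
        Step(agent="coder", payload="summarize results"),
        Step(agent="files", payload="save summary.txt"),
    ]


def supervisor(plan: List[Step], agents: Dict[str, Callable[[str], str]]) -> List[str]:
    """Dispatch each step to its named agent in order and collect the outputs."""
    return [agents[step.agent](step.payload) for step in plan]


# Stand-in agents that only echo their task; real agents would do actual work.
AGENTS: Dict[str, Callable[[str], str]] = {
    "browser": lambda p: f"[browser] {p}",
    "coder": lambda p: f"[coder] {p}",
    "files": lambda p: f"[files] {p}",
}


def run(prompt: str) -> List[str]:
    """Run a prompt through Coordinator, then Planner, then Supervisor."""
    return supervisor(planner(coordinator(prompt)), AGENTS)
```

Calling `run("llm agents")` passes the prompt through each tier in turn. Keeping the plan as plain data (`List[Step]`) is one way to let the Supervisor execute a workflow without knowing how it was generated.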

Abstract

The increasing complexity of user demands necessitates automation frameworks that can reliably translate open-ended instructions into robust, multi-step workflows. Current monolithic agent architectures often struggle with the challenges of scalability, error propagation, and maintaining focus across diverse tasks. This paper introduces Autonoma, a structured, hierarchical multi-agent framework designed for end-to-end workflow automation from natural language prompts. Autonoma employs a principled, multi-tiered architecture where a high-level Coordinator validates user intent, a Planner generates structured workflows, and a Supervisor dynamically manages the execution by orchestrating a suite of modular, specialized agents (e.g., for web browsing, coding, file management). This clear separation between orchestration logic and specialized execution ensures robustness through active monitoring and error handling, while enabling extensibility by allowing new capabilities to be integrated as plug-and-play agents without modifying the core engine. Implemented as a fully functional system operating within a secure LAN environment, Autonoma addresses critical data privacy and reliability concerns. The system is further engineered for inclusivity, accepting multi-modal input (text, voice, image, files) and supporting both English and Arabic. Autonoma achieved a 97% task completion rate and a 98% successful agent handoff rate, confirming its operational reliability and efficient collaboration.
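
The abstract's two design claims, plug-and-play extensibility and robustness through active monitoring and error handling, can be illustrated with a small registry-based supervisor. This is a sketch under assumptions, not Autonoma's actual code: the `Supervisor` class, its `register` and `execute` methods, the retry policy, and the "translator" agent are hypothetical names introduced here for illustration.

```python
from typing import Callable, Dict, List, Tuple

Agent = Callable[[str], str]  # assumption: an agent maps a task string to a result


class Supervisor:
    """Hypothetical supervisor: agents plug in by name, failures are retried."""

    def __init__(self, max_retries: int = 1) -> None:
        self._agents: Dict[str, Agent] = {}
        self.max_retries = max_retries
        self.log: List[Tuple[str, str]] = []  # (agent name, outcome) per attempt

    def register(self, name: str, agent: Agent) -> None:
        # New capabilities are added here; execute() itself never changes.
        self._agents[name] = agent

    def execute(self, name: str, task: str) -> str:
        if name not in self._agents:
            self.log.append((name, "unknown-agent"))
            raise KeyError(name)
        for _ in range(self.max_retries + 1):
            try:
                result = self._agents[name](task)
            except Exception:
                # Monitoring: record the failed attempt, then retry if allowed.
                self.log.append((name, "error"))
                continue
            self.log.append((name, "ok"))
            return result
        raise RuntimeError(
            f"agent {name!r} failed after {self.max_retries + 1} attempts"
        )


def make_flaky_agent() -> Agent:
    """Build a toy agent that fails on its first call and succeeds afterwards."""
    calls = {"n": 0}

    def agent(task: str) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            raise TimeoutError("simulated failure")
        return f"done: {task}"

    return agent


sup = Supervisor(max_retries=1)
sup.register("translator", make_flaky_agent())  # plugged in without touching Supervisor
```

With this setup, `sup.execute("translator", "hello")` fails once, is retried, and then succeeds, while `sup.log` keeps a record of both attempts. A single retry is only one possible error-handling policy; the paper does not specify which one Autonoma uses.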