AgentSPEX: An Agent SPecification and EXecution Language

arXiv cs.CL / 4/16/2026

Key Points

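AgentSPEX is a proposal for specifying and executing LLM-agent workflows in a language with explicit control flow and state management, rather than through reactive prompting.
Typed steps, branching and loops, parallel execution, reusable submodules, and modular explicit state address the implicit control flow of reactive prompting and the maintainability problems of existing orchestration frameworks.
AgentSPEX runs workflows on an agent harness (tool access, a sandboxed environment, checkpointing, verification, and logging), with an emphasis on operational observability.
It provides a visual editor with synchronized graph and workflow views to support authoring and inspection, ships ready-made agents for deep research and scientific research, is evaluated on 7 benchmarks, and reports a user study indicating better interpretability and accessibility.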

Abstract

Language-model agent systems commonly rely on reactive prompting, in which a single instruction guides the model through an open-ended sequence of reasoning and tool-use steps, leaving control flow and intermediate state implicit and making agent behavior potentially difficult to control. Orchestration frameworks such as LangGraph, DSPy, and CrewAI impose greater structure through explicit workflow definitions, but tightly couple workflow logic with Python, making agents difficult to maintain and modify. In this paper, we introduce AgentSPEX, an Agent SPecification and EXecution Language for specifying LLM-agent workflows with explicit control flow and modular structure, along with a customizable agent harness. AgentSPEX supports typed steps, branching and loops, parallel execution, reusable submodules, and explicit state management, and these workflows execute within an agent harness that provides tool access, a sandboxed virtual environment, and support for checkpointing, verification, and logging. Furthermore, we provide a visual editor with synchronized graph and workflow views for authoring and inspection. We include ready-to-use agents for deep research and scientific research, and we evaluate AgentSPEX on 7 benchmarks. Finally, we show through a user study that AgentSPEX provides a more interpretable and accessible workflow-authoring paradigm than a popular existing agent framework.
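
To make the contrast with reactive prompting concrete, the following is a minimal Python sketch of a workflow with explicit state, a parallel step, a branch, and a bounded loop. It is not AgentSPEX syntax, which the abstract does not show; the names `State`, `search`, and `run_workflow` are hypothetical, and the tool call is a placeholder string function rather than a real model or harness call.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class State:
    """Explicit workflow state: every intermediate value is stored here."""
    question: str
    notes: list[str] = field(default_factory=list)
    draft: str = ""
    log: list[str] = field(default_factory=list)  # stand-in for harness logging


def search(source: str, question: str) -> str:
    # Placeholder for a tool call (assumption: a real harness would call a tool or model).
    return f"{source}: finding about {question}"


def run_workflow(state: State, sources: list[str], max_revisions: int = 3) -> State:
    # Parallel step: query all sources concurrently; map() keeps input order.
    with ThreadPoolExecutor() as pool:
        state.notes = list(pool.map(lambda s: search(s, state.question), sources))
    state.log.append("search")

    # Branch: the next step is chosen from explicit state, not from free-form model output.
    if not state.notes:
        state.draft = "no evidence found"
        state.log.append("abort")
        return state

    # Loop: each revision folds in one more note, until all notes are covered
    # or the revision budget runs out.
    for i in range(max_revisions):
        covered = state.notes[: i + 1]
        state.draft = " | ".join(covered)
        state.log.append(f"revise-{i}")
        if len(covered) == len(state.notes):
            break
    return state
```

Because the control flow and state are explicit, the `log` list records exactly which steps ran, which is the kind of inspection that is hard to get when a single prompt drives an open-ended sequence of steps.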