ANX: Protocol-First Design for AI Agent Interaction with a Supporting 3EX Decoupled Architecture

arXiv cs.AI / 4/7/2026

Key Points

  • The paper introduces ANX, an open, extensible, verifiable protocol-first framework for AI agent interactions designed to address inefficiencies and security gaps in GUI automation and MCP-based skills.
  • ANX’s architecture combines agent-native components (ANX Config/Markup/CLI) to improve information density, reduce token usage, and prevent interaction inconsistencies.
  • It proposes a human–agent interaction model where Skill-like capabilities support both agent-executable instructions and human-readable UI, while also adding human-only confirmation to mitigate automated misuse.
  • The framework includes MCP-supported on-demand lightweight apps without pre-registration and uses ANX Markup to create unambiguous, machine-executable SOPs for reliable long-horizon tasks and multi-agent collaboration.
  • Preliminary form-filling experiments with Qwen3.5-plus and GPT-4o report token reductions of roughly 47–56% versus MCP-based skills and 57–66% versus GUI automation, along with execution times about 58% shorter than MCP-based skills.
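
To make the reported percentages concrete, the short Python sketch below converts each relative reduction into the share of baseline tokens that ANX still used. The percentages are the ones stated in the abstract; the function and variable names are illustrative and do not come from the paper.

```python
def remaining_fraction(reduction_pct: float) -> float:
    """Fraction of the baseline still consumed after a relative reduction."""
    return 1.0 - reduction_pct / 100.0


def baseline_multiple(reduction_pct: float) -> float:
    """How many times larger the baseline is than the reduced value."""
    return 1.0 / remaining_fraction(reduction_pct)


# Token reductions reported in the abstract, in percent.
# Keys are (model, baseline method).
REPORTED_TOKEN_REDUCTIONS = {
    ("Qwen3.5-plus", "MCP-based skills"): 47.3,
    ("GPT-4o", "MCP-based skills"): 55.6,
    ("Qwen3.5-plus", "GUI automation"): 57.1,
    ("GPT-4o", "GUI automation"): 66.3,
}

if __name__ == "__main__":
    for (model, baseline), pct in REPORTED_TOKEN_REDUCTIONS.items():
        print(
            f"{model} vs {baseline}: ANX used {remaining_fraction(pct):.1%} "
            f"of the baseline tokens (baseline is {baseline_multiple(pct):.2f}x larger)"
        )
```

Read this way, the 66.3% reduction against GUI automation means ANX used about a third (33.7%) of the tokens in that scenario, while the 47.3% reduction against MCP-based skills means it used a little over half (52.7%).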

Abstract

AI agents are autonomous digital actors and need agent-native protocols. Existing approaches, namely GUI automation and MCP-based skills, suffer from high token consumption, fragmented interaction, and inadequate security. These defects stem from the lack of a unified top-level framework and of key components, and each independent module has flaws of its own. To address these issues, we present ANX, an open, extensible, verifiable agent-native protocol and top-level framework that integrates CLI, Skill, and MCP and resolves these pain points through protocol innovation, architectural optimization, and tool supplementation. It has four core innovations: 1) agent-native design (ANX Config, Markup, CLI) with high information density, flexibility, and strong adaptability, which reduces tokens and eliminates inconsistencies; 2) human-agent interaction that draws on Skill's flexibility for dual rendering as agent-executable instructions and human-readable UI; 3) MCP-supported on-demand lightweight apps without pre-registration; 4) ANX Markup-enabled machine-executable SOPs that eliminate ambiguity for reliable long-horizon tasks and multi-agent collaboration. As the first paper in a series, this work focuses on ANX's design and presents its 3EX decoupled architecture with ANXHub, together with a preliminary feasibility analysis and experimental validation. ANX provides native security: LLM-bypassed UI-to-Core communication keeps sensitive data out of the agent context, and human-only confirmation prevents automated misuse. In form-filling experiments with Qwen3.5-plus and GPT-4o, ANX reduces tokens by 47.3% (Qwen3.5-plus) and 55.6% (GPT-4o) vs. MCP-based skills and by 57.1% (Qwen3.5-plus) and 66.3% (GPT-4o) vs. GUI automation, and shortens execution time by 58.1% and 57.7% vs. MCP-based skills.
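
The two security properties named in the abstract can be illustrated with a minimal Python sketch. It is not ANX's actual protocol, markup, or API, none of which this summary specifies. Every name here (`Core`, `FormField`, `set_from_ui`, `agent_view`, `request_confirmation`) is hypothetical. The sketch assumes only the general pattern: sensitive values travel from the UI to a trusted core without entering the agent's context, and submission requires a one-time confirmation that only the human-facing UI receives.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FormField:
    name: str
    sensitive: bool = False


@dataclass
class Core:
    """Hypothetical trusted backend: stores values and gates submission."""
    schema: List[FormField]
    values: Dict[str, str] = field(default_factory=dict)
    _pending_token: Optional[str] = None

    def _field(self, name: str) -> FormField:
        for f in self.schema:
            if f.name == name:
                return f
        raise KeyError(name)

    def set_from_agent(self, name: str, value: str) -> None:
        # Agent path: may only write non-sensitive fields.
        if self._field(name).sensitive:
            raise PermissionError(f"{name!r} must be entered through the UI")
        self.values[name] = value

    def set_from_ui(self, name: str, value: str) -> None:
        # UI path: goes straight to the core, bypassing the LLM.
        self._field(name)  # raises KeyError for unknown fields
        self.values[name] = value

    def agent_view(self) -> Dict[str, str]:
        # What would be placed in the agent's context: sensitive values masked.
        return {
            f.name: "<hidden>" if f.sensitive else self.values.get(f.name, "")
            for f in self.schema
        }

    def request_confirmation(self) -> str:
        # One-time token meant for the human-facing UI only, never the agent.
        self._pending_token = secrets.token_hex(16)
        return self._pending_token

    def submit(self, confirmation_token: str) -> Dict[str, str]:
        # Reject submission unless it carries the pending one-time token.
        if self._pending_token is None or not secrets.compare_digest(
            confirmation_token, self._pending_token
        ):
            raise PermissionError("submission requires human confirmation")
        self._pending_token = None  # the token is single-use
        return dict(self.values)


if __name__ == "__main__":
    core = Core(schema=[FormField("email"), FormField("card_number", sensitive=True)])
    core.set_from_agent("email", "user@example.com")
    core.set_from_ui("card_number", "4111 1111 1111 1111")
    print(core.agent_view())             # the card number appears as "<hidden>"
    token = core.request_confirmation()  # would be surfaced to the human via the UI
    print(sorted(core.submit(token)))    # field names accepted by the core
```

In a real deployment, the separation between the agent path and the UI path would have to be enforced by process or network boundaries rather than by method names. The sketch only shows which party is allowed to see and do what.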