From Admission to Invariants: Measuring Deviation in Delegated Agent Systems

arXiv cs.AI / 4/21/2026


Key Points

  • The paper argues that enforcement-only governance in delegated autonomous agent systems can fail to detect behavioral drift because the enforcement signal is structurally “below” the measurement layer for deviation.
  • It presents a Non-Identifiability Theorem showing that the admissible behavior space A0 set at admission time cannot be determined from the enforcement signal g under a Local Observability Assumption.
  • The core reason for the impossibility is a mismatch between local, point-wise action checks (what g does) and global, trajectory-level properties (what A0 encodes).
  • To address this, the authors define an Invariant Measurement Layer (IML) that retains access to the generative model of A0, enabling detection of admission-time drift with provably finite detection delay.
  • Experiments across multiple drift scenarios, an n8n webhook pipeline, and a LangGraph StateGraph agent show enforcement triggers zero violations while IML detects drift within 9–258 steps.
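The mismatch between point-wise checks and trajectory-level properties can be illustrated with a toy sketch. This is not the paper's IML or its experimental setup: the action names, the `BASELINE` frequencies standing in for a model of A0, the sliding window, and the total-variation threshold are all hypothetical choices made for illustration.

```python
from collections import Counter, deque

# Hypothetical point-wise rule set: what the enforcement check g sees.
ALLOWED = {"read", "write"}
# Hypothetical admission-time action frequencies, a toy stand-in for a model of A0.
BASELINE = {"read": 0.75, "write": 0.25}


def enforce(action):
    """Local check: True if this single action is permitted."""
    return action in ALLOWED


def tv_distance(window, baseline):
    """Total variation distance between window frequencies and the baseline."""
    counts = Counter(window)
    n = len(window)
    return 0.5 * sum(abs(counts[a] / n - p) for a, p in baseline.items())


def agent_action(step, drift_start):
    """Before drift: three reads, then one write. After drift: only writes."""
    if step >= drift_start:
        return "write"
    return "write" if step % 4 == 3 else "read"


def run(steps=300, drift_start=100, window_size=20, threshold=0.32):
    window = deque(maxlen=window_size)
    violations = 0
    detected_at = None
    for step in range(steps):
        action = agent_action(step, drift_start)
        if not enforce(action):          # point-wise check on one action
            violations += 1
        window.append(action)
        # Trajectory-level check: compare recent behavior to the baseline.
        if detected_at is None and len(window) == window_size:
            if tv_distance(window, BASELINE) > threshold:
                detected_at = step
    return violations, detected_at


violations, detected_at = run()
print(violations, detected_at)  # prints: 0 108
```

Every individual action stays inside the allowed set, so the local check never fires, while the windowed comparison flags the change 8 steps after drift begins at step 100. The delay here depends entirely on the chosen window size and threshold and says nothing about the 9–258 step range reported in the paper.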

Abstract

Autonomous agent systems are governed by enforcement mechanisms that flag hard constraint violations at runtime. The Agent Control Protocol identifies a structural limit of such systems: a correctly functioning enforcement engine can enter a regime in which behavioral drift is invisible to it, because the enforcement signal operates below the layer where deviation is measurable. We show that enforcement-based governance is structurally unable to determine whether an agent's behavior remains within the admissible behavior space A0 established at admission time. Our central result, the Non-Identifiability Theorem, proves that A0 is not in the sigma-algebra generated by the enforcement signal g under the Local Observability Assumption, which every practical enforcement system satisfies. The impossibility arises from a fundamental mismatch: g evaluates actions locally against a point-wise rule set, while A0 encodes global, trajectory-level behavioral properties set at admission time. We define the Invariant Measurement Layer (IML), which bypasses this limitation by retaining direct access to the generative model of A0. We prove an information-theoretic impossibility for enforcement-based monitoring; separately, we show IML detects admission-time drift with provably finite detection delay, operating in the region where enforcement is structurally blind. In validation across four settings (three drift scenarios of 300 and 1000 steps, a live n8n webhook pipeline, and a LangGraph StateGraph agent), enforcement triggers zero violations while IML detects each drift type within 9–258 steps. This is Paper 2 of a 4-paper Agent Governance Series: atomic boundaries (P0, 10.5281/zenodo.19642166), ACP enforcement (P1, arXiv:2603.18829), fair allocation (P3, 10.5281/zenodo.19643928), irreducibility (P4, 10.5281/zenodo.19643950).
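The measurability claim in the abstract can be written out as follows. This is only a reading of the abstract's wording, not the paper's formal statement: the trajectory space $\Omega$, the treatment of $g$ as a measurable map on $\Omega$, and the helper function $h$ are notation assumed here.

```latex
% Assumed notation: \Omega is the space of agent trajectories,
% g : \Omega \to S is the enforcement signal, A_0 \subseteq \Omega.
\[
  A_0 \notin \sigma(g),
  \qquad
  \sigma(g) = \{\, g^{-1}(B) : B \subseteq S \text{ measurable} \,\}.
\]
% By the Doob-Dynkin lemma, this is equivalent to: no measurable h satisfies
\[
  \mathbf{1}_{A_0} = h \circ g .
\]
% A sufficient witness is a pair of trajectories with identical signals
% but different admissibility:
\[
  \exists\, \omega, \omega' \in \Omega :\quad
  g(\omega) = g(\omega'), \quad \omega \in A_0, \quad \omega' \notin A_0 .
\]
```

Read this way, the claim says that no decision rule built only from the enforcement signal can decide membership in A0, which is consistent with the abstract's point that g evaluates actions locally while A0 is a trajectory-level property.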