LLMs Don't Fail — Execution Does: Why Agentic AI Needs a Control Layer

Dev.to / 4/22/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Ideas & Deep Analysis

Key Points

  • The article argues that agent failures typically come from uncontrolled execution rather than incorrect LLM reasoning, citing examples like unintended database updates, unchecked retries firing webhooks repeatedly, and accidental financial transfers triggered by an email.
  • It proposes an Agent Execution Control Layer (AECL) that sits between the LLM’s planning and real-world tool/API/OS execution to answer “Should this actually happen?” before any action runs.
  • AECL is described as a runtime enforcement system (not a prompt guard or system prompt) that intercepts, validates, sandboxes, logs, and gates every tool invocation made by an agent.
  • The need for such a control layer is framed as urgent because agents now increasingly have write access to systems such as databases, workflow triggers, and infrastructure configuration.
  • The article outlines a system architecture where the execution control layer includes components such as an Agent IAM policy engine and a pre-execution validator to control and verify actions before they are performed.

The Problem Nobody Talks About

You've got the agent reasoning correctly. Tool calls look right in dev. Then you ship it — and it:

  • Updates a record it wasn't supposed to touch
  • Fires a webhook three times because retry logic ran unchecked
  • Executes a financial transfer because a vendor email said "process immediately"

The model wasn't wrong. The execution was uncontrolled.

This is the gap the Agent Execution Control Layer (AECL) closes. And if you're building agents that write to anything — databases, APIs, filesystems, external services — you need this layer before you ship.

What Is the AECL, Exactly?

A dedicated system layer that sits between LLM reasoning and real-world execution.
It answers one question before every action runs: "Should this actually happen?"

It sits between:

  • 🧠 LLM reasoning (planning)
  • 🔧 Tool / API / OS execution
User Request
     ↓
Planner / LLM
     ↓
[ Execution Control Layer ]  ← the layer most teams skip
     ↓
Tool Execution (APIs / DBs / Shell / External Systems)

Not a prompt guard. Not a system prompt. A runtime enforcement layer — code that intercepts, validates, sandboxes, logs, and gates every tool invocation your agent makes.
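
A minimal sketch of what "intercepts every tool invocation" means in code. The names (`check_policy`, `AUDIT_LOG`, the hardcoded allowlist) are illustrative stand-ins, not any specific framework's API:

```python
from functools import wraps

AUDIT_LOG: list[dict] = []

def check_policy(action: dict) -> str:
    # Illustrative policy: approve only actions on a hardcoded allowlist
    return "approve" if action["name"] in {"read_invoice"} else "block"

def controlled(tool_fn):
    """Wrap a tool so every invocation passes through the control layer first."""
    @wraps(tool_fn)
    def wrapper(action: dict) -> dict:
        decision = check_policy(action)
        AUDIT_LOG.append({"action": action["name"], "outcome": decision})
        if decision != "approve":
            return {"status": "blocked"}  # blocked calls never reach the tool
        return {"status": "ok", "result": tool_fn(action)}
    return wrapper

@controlled
def read_invoice(action: dict) -> str:
    return f"invoice:{action['invoice_id']}"
```

The point is structural: the tool function itself never decides whether it runs; the wrapper does, and every decision is logged either way.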

Why This Suddenly Became Critical

Recent developments forced this layer into focus:

  1. Agents now have write access
    • Updating databases
    • Triggering workflows
    • Modifying infra configs

👉 Earlier risk = wrong answer
👉 Now risk = wrong action

Full System Architecture

┌──────────────────────────────────────────────────────────────┐
│                        USER REQUEST                          │
└────────────────────────────┬─────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────────┐
│                    PLANNER AGENT (LLM)                       │
│              Produces: Ordered Execution Plan                │
└────────────────────────────┬─────────────────────────────────┘
                             ↓
╔══════════════════════════════════════════════════════════════╗
║                   EXECUTION CONTROL LAYER                    ║
║                                                              ║
║  ┌──────────────────────┐   ┌──────────────────────────┐    ║
║  │  1. Policy Engine    │   │  2. Pre-Execution        │    ║
║  │     (Agent IAM)      │   │     Validator            │    ║
║  └──────────────────────┘   └──────────────────────────┘    ║
║                                                              ║
║  ┌──────────────────────┐   ┌──────────────────────────┐    ║
║  │  3. Execution        │   │  4. Observability        │    ║
║  │     Sandbox          │   │     Logger               │    ║
║  └──────────────────────┘   └──────────────────────────┘    ║
║                                                              ║
║  ┌──────────────────────┐   ┌──────────────────────────┐    ║
║  │  5. HITL Gate        │   │  6. Rollback Engine      │    ║
║  │     (Approval Queue) │   │     (Compensating Txns)  │    ║
║  └──────────────────────┘   └──────────────────────────┘    ║
╚══════════════════════════════════════════════════════════════╝
                             ↓
┌──────────────────────────────────────────────────────────────┐
│                     TOOL EXECUTION                           │
│         REST APIs / Databases / Shell / MCP Servers          │
└──────────────────────────────────────────────────────────────┘

The model decides what to do. The control layer decides whether it should happen. Tools handle how. Role separation is everything.

Component 1 — Policy Engine (Agent IAM)

Define every agent's permission boundary at deploy time, not runtime. This is your agent's identity manifest.

{
  "agent_id": "finance_agent_v2",
  "allowed_actions": [
    "read_invoice",
    "read_vendor_profile",
    "generate_payment_report"
  ],
  "blocked_actions": [
    "transfer_funds",
    "delete_record",
    "modify_iam_policy"
  ],
  "context_rules": {
    "transfer_funds": "requires_human_approval",
    "bulk_export":    "requires_data_owner_consent"
  },
  "credential_scope": "read_only_finance_ns",
  "token_ttl_seconds": 900,
  "max_tool_calls_per_run": 50
}

Key rules:

  • Short-lived tokens only. No persistent credentials inside agent scope.
  • Agents never inherit human-equivalent privileges — Zero Trust applies.
  • Credential scope lives outside the sandbox. The execution environment where model-generated code runs gets zero access to auth tokens.
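
One way to satisfy all three rules at once, as a hedged sketch: a broker process outside the sandbox mints short-lived, narrowly scoped tokens, and only scoped connectors ever present them. `TokenBroker` and its method names are hypothetical, not from any particular secrets manager:

```python
import secrets
import time

class TokenBroker:
    """Mints short-lived, scoped tokens; lives outside the sandbox boundary."""

    def __init__(self):
        self._issued: dict[str, dict] = {}

    def mint(self, agent_id: str, scope: str, ttl_seconds: int = 900) -> str:
        token = secrets.token_urlsafe(16)
        self._issued[token] = {
            "agent_id": agent_id,
            "scope": scope,
            "expires_at": time.time() + ttl_seconds,  # matches token_ttl_seconds
        }
        return token

    def validate(self, token: str, required_scope: str) -> bool:
        meta = self._issued.get(token)
        if meta is None or time.time() > meta["expires_at"]:
            return False  # unknown or expired token
        return meta["scope"] == required_scope  # scope must match exactly
```

Because the broker and its store never enter the sandbox, model-generated code can at most hold a token that expires in minutes and unlocks one namespace.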

Component 2 — Pre-Execution Validator

Intercepts every tool call. Three sequential checks before anything executes:

from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    APPROVE   = "approve"
    BLOCK     = "block"
    ESCALATE  = "escalate"

@dataclass
class ValidationResult:
    decision: Decision
    reason: str
    risk_score: float = 0.0

class PreExecutionValidator:

    def validate(
        self,
        action: dict,
        context: dict,
        policy: dict
    ) -> ValidationResult:

        # Gate 1 — Policy check: explicit blocklist first, then allowlist
        if action["name"] in policy["blocked_actions"]:
            return ValidationResult(
                decision=Decision.BLOCK,
                reason=f"'{action['name']}' is explicitly blocked by policy"
            )
        if action["name"] not in policy["allowed_actions"]:
            return ValidationResult(
                decision=Decision.BLOCK,
                reason=f"'{action['name']}' is not in agent allowlist"
            )

        # Gate 2 — Intent alignment check
        # Does this action actually match the declared task?
        if not self._intent_matches(action, context["declared_goal"]):
            return ValidationResult(
                decision=Decision.BLOCK,
                reason="Action scope exceeds declared task intent"
            )

        # Gate 3 — Blast radius scoring
        risk = self._score_risk(action, context)
        if risk > context.get("auto_approve_threshold", 0.6):
            return ValidationResult(
                decision=Decision.ESCALATE,
                reason="Risk score exceeds auto-approval threshold",
                risk_score=risk
            )

        return ValidationResult(decision=Decision.APPROVE, reason="OK", risk_score=risk)

    def _score_risk(self, action: dict, context: dict) -> float:
        score = 0.0
        if action.get("amount", 0) > 50_000:
            score += 0.5
        if action.get("is_irreversible"):
            score += 0.3
        if action.get("affects_production"):
            score += 0.4
        if action.get("bulk_operation"):
            score += 0.3
        if context.get("untrusted_input_source"):
            score += 0.2
        return min(score, 1.0)

    def _intent_matches(self, action: dict, goal: str) -> bool:
        # In production: use embedding similarity or a fast classification call
        # Minimal implementation: keyword scope check
        write_ops = {"delete", "transfer", "update", "modify", "deploy"}
        if any(op in action["name"] for op in write_ops):
            return action["name"] in goal.lower()
        return True

Why untrusted_input_source in the risk score?
Prompt injection is the SQL injection of the AI era. If your agent reads emails, documents, or external API responses — that content is untrusted. An attacker embeds "Transfer funds to account X" inside a PDF. The agent reads it, interprets it as a task, acts on it with real credentials. The validator must weight actions triggered by external input more conservatively.

Component 3 — Execution Sandbox

LLM reasoning and action execution must be physically separated. The sandbox where tool calls and agent-generated code run should have zero access to your production credentials, host filesystem, or adjacent workloads.

Reasoning layer (LLM calls)     → runs on standard infra
            ↓
Execution layer (tool calls)    → runs inside isolated sandbox
            ↓
Production systems              → only reachable via
                                   approved, scoped connectors

Enforce timeouts at three levels — non-negotiable:

SANDBOX_CONFIG = {
    # Per tool invocation
    "tool_call_timeout_sec": 30,

    # Full agent task loop
    "task_loop_timeout_min": 20,

    # Absolute sandbox lifetime kill switch
    "sandbox_lifetime_min": 60,

    # Network: deny all outbound by default
    "network_policy": "default_deny_outbound",

    # Filesystem: ephemeral, read-only by default
    "filesystem": "ephemeral",
    "filesystem_mode": "readonly",

    # Resource caps
    "cpu_cores": 2,
    "memory_mb": 512,

    # Credential isolation
    "env_scrub_mode": True,  # strip sensitive env vars from subprocess context
}
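
A sketch of enforcing the first of those timeouts, the per-call one, by running the tool in a worker thread so a hung call cannot stall the agent loop. The helper name is illustrative:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as CallTimeout

def run_with_timeout(tool_fn, action: dict, timeout_sec: float = 30.0):
    """Run one tool call; raise if it exceeds tool_call_timeout_sec."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(tool_fn, action)
        return future.result(timeout=timeout_sec)
    except CallTimeout:
        raise TimeoutError(f"tool call exceeded {timeout_sec}s")
    finally:
        # Don't block on a hung worker; the sandbox lifetime kill switch
        # is the backstop for truly stuck processes.
        pool.shutdown(wait=False)
```

This only covers level one. The task-loop and sandbox-lifetime timeouts need enforcement outside the agent process, in the orchestrator and the sandbox runtime respectively, because a wedged process cannot time itself out.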

Sandbox isolation technologies — pick by threat level:

| Isolation Level | Technology                  | Cold Start | When to Use                          |
| --------------- | --------------------------- | ---------- | ------------------------------------ |
| Low             | Docker container            | ~100ms     | Internal trusted agents only         |
| Medium          | gVisor (user-space kernel)  | ~300ms     | Semi-trusted, standard workflows     |
| High            | Firecracker / Kata microVMs | 150ms–2s   | Untrusted code, user-submitted input |

Default to microVMs for any agent that processes untrusted content. A compromised Docker container can reach adjacent workloads on the same host. A compromised microVM cannot — it has a dedicated kernel.
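
The table maps naturally onto a small dispatch function. The threat labels are the article's; the function itself is a sketch:

```python
def select_isolation(processes_untrusted_input: bool, internal_only: bool) -> str:
    """Pick a sandbox backend per the threat-level table above."""
    if processes_untrusted_input:
        return "microvm"   # Firecracker / Kata: dedicated kernel per sandbox
    if internal_only:
        return "docker"    # trusted internal agents only
    return "gvisor"        # semi-trusted, standard workflows
```

Note the ordering: untrusted input wins over every other attribute, which encodes the "default to microVMs" rule directly.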

From Claude Code v2.1.98 (shipped April 2026): PID namespace isolation now prevents agent subprocesses from inspecting or signaling sibling processes on Linux. Add CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1 to strip credentials from subprocess environments automatically.

Component 4 — Immutable Observability Logger

Every tool invocation gets a structured, signed, append-only log entry. This is your debugging surface, your audit trail, and your incident reconstruction capability — all the same object.

from dataclasses import dataclass, field
from uuid import uuid4
from datetime import datetime, timezone

@dataclass
class ActionLog:
    event_id:           str   = field(default_factory=lambda: str(uuid4()))
    timestamp:          str   = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    agent_id:           str   = ""
    task_id:            str   = ""
    action_name:        str   = ""
    input_payload:      dict  = field(default_factory=dict)
    policy_applied:     str   = ""
    risk_score:         float = 0.0
    outcome:            str   = ""   # approved | blocked | escalated | failed
    latency_ms:         int   = 0
    sandbox_id:         str   = ""
    hitl_approval_id:   str | None = None
    rollback_token:     str | None = None  # set if action is reversible

def emit_log(log: ActionLog):
    # Write to append-only store — never update, never delete
    # e.g., write to S3 with object lock, or append to immutable DB partition
    audit_store.append(log.__dict__)
    metrics.increment(f"agent.action.{log.outcome}", tags={"agent": log.agent_id})

What this unlocks:

  • Full replay of any agent session — every decision, every tool call, every outcome
  • Debugging: task_id links all steps of a single run
  • Alerting: outcome=failed or outcome=blocked spike → something changed in agent behavior
  • Compliance: policy_applied field proves governance was active for every action
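
Given the append-only store, session replay is just a filter and a sort over `task_id`. The list-of-dicts store shape here is an assumption for illustration:

```python
def replay_session(audit_entries: list[dict], task_id: str) -> list[dict]:
    """Reconstruct one agent run, in order, from the immutable log."""
    steps = [e for e in audit_entries if e["task_id"] == task_id]
    # ISO-8601 UTC timestamps sort correctly as strings
    return sorted(steps, key=lambda e: e["timestamp"])
```

Because entries are never updated or deleted, this replay is trustworthy by construction: the sequence you reconstruct is the sequence that happened.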

Component 5 — HITL Gate (Human-in-the-Loop)

HITL gates must be driven by your policy engine — not scattered if statements across your codebase. Centralise the logic, route by risk, expire stale approvals.

from datetime import datetime, timedelta, timezone
from uuid import uuid4

def utcnow() -> datetime:
    return datetime.now(timezone.utc)

# HITLDecision (AUTO_APPROVE / PENDING) is assumed defined alongside the
# validator's Decision enum.

HITL_POLICY = {
    # Action name → always requires approval
    "transfer_funds":       {"always": True,  "expires_hours": 1},
    "bulk_data_export":     {"always": True,  "expires_hours": 4},
    "deploy_to_production": {"always": True,  "expires_hours": 2},
    "modify_iam_policy":    {"always": True,  "expires_hours": 1},
}

class HITLGate:

    def evaluate(self, action: dict, risk_score: float) -> HITLDecision:

        policy = HITL_POLICY.get(action["name"])

        # Always-require check
        if policy and policy["always"]:
            return self._create_request(action, "Mandatory approval: critical action type",
                                        policy["expires_hours"])

        # Risk threshold check
        if risk_score > 0.7:
            return self._create_request(action, f"Risk score {risk_score:.2f} exceeds threshold",
                                        expires_hours=1)

        return HITLDecision.AUTO_APPROVE

    def _create_request(self, action, reason, expires_hours) -> HITLDecision:
        request = {
            "id":         str(uuid4()),
            "action":     action,
            "reason":     reason,
            "created_at": utcnow().isoformat(),
            "expires_at": (utcnow() + timedelta(hours=expires_hours)).isoformat(),
            "status":     "pending",
        }
        self.notify_approver(request)
        return HITLDecision.PENDING(request_id=request["id"])

Exception routing rule: When an agent hits a situation outside its defined parameters, it must escalate with full context — not fail silently, not execute anyway. Silent failure is worse than a visible one. The approval notification should include: action attempted, risk score, triggering input, agent task history, and a one-click approve/reject.

Component 6 — Rollback Engine

The most under-built component in production agent systems. For every write action you add to your agent, ask at design time: "What's the compensating transaction?" If you can't answer, that action is mandatory HITL.

from typing import Callable

class RollbackEngine:

    # Register compensating transaction per action type at startup
    _registry: dict[str, Callable] = {}

    @classmethod
    def register(cls, action_name: str, compensate_fn: Callable):
        cls._registry[action_name] = compensate_fn

    def rollback(self, log: ActionLog) -> RollbackResult:

        if log.rollback_token is None:
            # No rollback token = was never marked reversible at execution time
            return RollbackResult(
                status="irreversible",
                action="escalate_to_human",
                context=log
            )

        compensate = self._registry.get(log.action_name)

        if not compensate:
            return RollbackResult(status="no_compensating_action", context=log)

        try:
            compensate(log.rollback_token, log.input_payload)
            return RollbackResult(status="success")
        except Exception as e:
            return RollbackResult(status="rollback_failed", error=str(e), context=log)


# Register compensating transactions at app startup
RollbackEngine.register(
    "create_record",
    lambda token, payload: db.delete(payload["record_id"])
)
RollbackEngine.register(
    "update_record",
    lambda token, payload: db.restore(payload["record_id"], token)  # token = snapshot ref
)
RollbackEngine.register(
    "transfer_funds",
    lambda token, payload: payment_service.reverse(token)
)
RollbackEngine.register(
    "deploy_config",
    lambda token, payload: config_service.restore_snapshot(token)
)

send_email has no compensating transaction → register it as mandatory HITL in your policy engine. Some actions are irreversible by nature. The rollback engine surfaces that truth early — at design time — rather than at incident time.
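
That design-time rule can be automated rather than remembered: at startup, diff your write actions against the rollback registry, and force anything without a compensator into the HITL policy. This wiring is hypothetical; the names mirror the sketches above:

```python
def derive_mandatory_hitl(write_actions: set[str], rollback_registry: dict) -> dict:
    """Any write action without a compensating transaction becomes mandatory HITL."""
    return {
        name: {"always": True, "expires_hours": 1}
        for name in write_actions
        if name not in rollback_registry
    }
```

Merging this derived dict into `HITL_POLICY` at startup means adding a new irreversible tool without a compensator fails safe: it ships gated, not ungoverned.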

Putting It Together — The Execution Flow

class AgentExecutionController:

    def __init__(self, policy: dict, config: dict):
        self.policy    = policy
        self.validator = PreExecutionValidator()
        self.hitl      = HITLGate()
        self.sandbox   = Sandbox(config=SANDBOX_CONFIG)
        self.rollback  = RollbackEngine()

    def execute(self, action: dict, context: dict) -> ExecutionResult:

        # Step 1: Validate
        result = self.validator.validate(action, context, self.policy)

        if result.decision == Decision.BLOCK:
            emit_log(ActionLog(action_name=action["name"], outcome="blocked",
                               risk_score=result.risk_score))
            return ExecutionResult.blocked(result.reason)

        # Step 2: HITL gate
        if result.decision == Decision.ESCALATE:
            hitl_decision = self.hitl.evaluate(action, result.risk_score)
            emit_log(ActionLog(action_name=action["name"], outcome="escalated",
                               risk_score=result.risk_score))
            return ExecutionResult.pending(hitl_decision)

        # Step 3: Execute inside sandbox
        try:
            output = self.sandbox.run(action)
            emit_log(ActionLog(action_name=action["name"], outcome="approved",
                               risk_score=result.risk_score,
                               rollback_token=output.rollback_token))
            return ExecutionResult.success(output)

        except Exception as e:
            emit_log(ActionLog(action_name=action["name"], outcome="failed"))
            return ExecutionResult.failed(str(e))

Every action goes through validate → gate → sandbox → log. Nothing hits your infrastructure without passing this sequence.
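
Compressed into a toy decision function, the sequence reduces to a three-way routing table (the policy and threshold values are stand-ins, not the full classes above):

```python
POLICY = {"blocked_actions": ["transfer_funds"]}

def route(action: dict) -> str:
    """Toy version of validate → gate → sandbox: returns the action's fate."""
    if action["name"] in POLICY["blocked_actions"]:
        return "blocked"     # never executes, logged as blocked
    if action.get("risk_score", 0.0) > 0.6:
        return "escalated"   # parked in the HITL approval queue
    return "approved"        # runs inside the sandbox, logged with rollback token
```

Three outcomes, and only one of them touches your infrastructure.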

Before vs. After — Same Agent, Different Outcome

Without AECL

Agent reads email: "Pay the vendor immediately"
  → Calls: transfer_funds(vendor="V-221", amount=100000)
  → ₹1,00,000 transferred. No log. No approval. No rollback. ❌

With AECL

Agent reads email: "Pay the vendor immediately"
  → Validator: untrusted_input_source=True → risk_score=0.85
  → HITL Gate: transfer_funds → mandatory approval + high risk
  → Approval request sent with full context
  → Human reviews: email + invoice + vendor history
  → Approves → Sandbox executes → rollback_token registered
  → Full audit log written ✅

Same model. Same prompt. The AECL changed the outcome.

Implementation Checklist

Before you ship any agent that has write access to anything:

□ Agent policy manifest defined — allowlist, blocklist, token TTL
□ Zero Trust credentials — short-lived tokens, no persistent secrets in sandbox
□ Pre-execution validator with: allowlist check, intent check, risk scoring
□ Sandbox isolation selected and configured for your threat level
□ Timeouts enforced at three levels: tool call / task loop / sandbox lifetime
□ Network policy: default deny outbound from sandbox
□ Immutable structured logs for every tool invocation
□ HITL policy centralised — not scattered if statements
□ Exception routing: escalate with full context, never fail silently
□ Rollback / compensating transaction registered for every write action
□ Actions with no rollback path → mandatory HITL in policy engine

Sources

  • OpenAI Agents SDK update — Help Net Security (April 16, 2026)
  • Anthropic Claude Managed Agents launch (April 9, 2026)
  • Databricks Unity AI Gateway (April 15, 2026)
  • Northflank Sandbox & MicroVM Guide (March 2026)
  • 1H 2026 State of AI & API Security Report — Salt Security
  • ISACA Agentic AI Security blog (April 7, 2026)
  • Claude Code v2.1.98 changelog — Fazm.ai (April 2026)
  • Firecrawl AI Agent Sandbox Guide (March 2026)