We built a fully deterministic control layer for agents. Would love feedback. No pitch

Reddit r/artificial / 3/30/2026


Key Points

  • The article proposes a fully deterministic “control layer” placed in the execution path between AI agents and tools to allow, block, or require approval for each attempted action in real time.
  • It emphasizes security mechanisms beyond prompt/identity/output controls by using credential starvation, session-based risk escalation across an agent’s behavior, and HITL only when risk is high enough to require it.
  • The approach uses “autonomy zones” to vary how much freedom an agent has depending on the environment and the sensitivity of actions (e.g., read-only vs external writes vs sensitive systems).
  • It implements granular, per-tool/per-action enforcement (endpoints, parameters, frequency, and sequence) rather than blanket permissions, and supports a hash-chained, tamper-evident audit log including near-miss attempts.
  • A policy engine drives these decisions in a flexible way (not hardcoded rules) with fast setup (about 10 minutes), and the author is requesting feedback rather than pitching a company.

Most of the current “AI security” stack seems focused on:

• prompts
• identities
• outputs

After an agent deleted a prod database on me a year ago, I saw the gap and started building:

a control layer that sits directly in the execution path between agents and tools. We're going to market, but I don't want to spam y'all with our company, so I left the name out.

What that actually means

Every time an agent tries to take an action (API call, DB read, file access, etc.), we intercept it and decide in real time:

• allow
• block
• require approval

But the important part is how that decision is made.

A few things we’re doing differently

  1. Credential starvation (instead of trusting long-lived access)

Agents don’t get broad, persistent credentials.

They effectively operate with nothing by default, and access is granted per action based on policy + context.
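The idea can be sketched roughly like this (a minimal illustration, not our actual implementation — the function names, fields, and TTL are all made up):

```python
import secrets
import time

def mint_credential(action: dict, policy_allows, ttl_seconds: int = 60):
    """Issue a short-lived, single-action credential only if policy allows.

    Agents hold no standing credentials; everything is minted per action.
    `action`, `policy_allows`, and the returned fields are illustrative.
    """
    if not policy_allows(action):
        return None  # starvation: no default access, no fallback token
    return {
        "token": secrets.token_urlsafe(16),
        "scope": (action["tool"], action["operation"]),  # bound to this action only
        "expires_at": time.time() + ttl_seconds,         # dies quickly on its own
    }
```

The point is that a leaked or misused credential is scoped to one action and expires in seconds, instead of being a broad key the agent carries around.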

  2. Session-based risk escalation (not stateless checks)

We track behavior across the entire session.

Example:

• one DB read → fine
• 20 sequential reads + export → risk escalates
• tool chaining → risk escalates

So decisions aren’t per-call—they’re based on what the agent has been doing over time.
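A toy sketch of the accumulation idea, assuming invented weights and thresholds (the real scoring is obviously more involved):

```python
class SessionRisk:
    """Accumulate risk across a session instead of stateless per-call checks.

    Event names, weights, and the 20-read burst threshold are illustrative.
    """

    def __init__(self):
        self.score = 0.0
        self.reads = 0

    def observe(self, action: str) -> float:
        if action == "db_read":
            self.reads += 1
            # one read is fine; a sequential burst escalates sharply
            self.score += 0.01 if self.reads < 20 else 0.2
        elif action == "export":
            # an export after many reads starts to look like exfiltration
            self.score += 0.5
        elif action == "tool_chain":
            self.score += 0.3
        return min(self.score, 1.0)
```

Each individual call would pass a stateless check; only the running score captures that the sequence has become risky.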

  3. HITL only when it actually matters

We don’t want humans in the loop for everything.

Instead:

• low risk → auto allow
• medium risk → maybe constrained
• high risk → require approval

The idea is targeted interruption, not constant friction.
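The routing is conceptually just a tiered mapping from risk to enforcement. The post only gives low/medium/high qualitatively, so the cutoffs here are invented:

```python
def route(risk: float) -> str:
    """Map a session risk score (0.0-1.0) to an enforcement tier.

    Cutoff values are illustrative, not the real policy.
    """
    if risk < 0.3:
        return "auto_allow"          # low risk: zero human friction
    if risk < 0.7:
        return "allow_constrained"   # medium risk: allow, but with limits
    return "require_approval"        # high risk: targeted human interruption
```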

  4. Autonomy zones

Different environments/actions have different trust levels.

Example:

• read-only internal data → low autonomy constraints
• external API writes → tighter controls
• sensitive systems → very restricted

Agents can operate freely within a zone, but crossing boundaries triggers stricter enforcement.
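A rough way to picture zones is a trust table plus a boundary check; the zone names, fields, and ordering below are hypothetical:

```python
# Hypothetical zone table: trust varies by environment + action class.
ZONES = {
    "read_internal":  {"autonomy": "high"},    # read-only internal data
    "write_external": {"autonomy": "medium"},  # external API writes
    "sensitive":      {"autonomy": "low"},     # sensitive systems
}

def crossing_requires_escalation(current: str, target: str) -> bool:
    """Crossing into a lower-trust zone triggers stricter enforcement."""
    order = {"high": 2, "medium": 1, "low": 0}
    return order[ZONES[target]["autonomy"]] < order[ZONES[current]["autonomy"]]
```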

  5. Per-tool, per-action control (not blanket policies)

Not just "this agent can use tool X."

More like:

• what endpoints
• what parameters
• what frequency
• in what sequence

So risk is evaluated at a much more granular level.
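One way to express a per-tool rule covering endpoints, parameters, and frequency (sequence checks would layer on top). Field names and limits here are invented for illustration:

```python
import fnmatch
import time
from collections import deque

class ToolRule:
    """One per-tool rule: endpoint allowlist, parameter limit, rate cap."""

    def __init__(self, endpoints, max_rows, max_calls_per_min):
        self.endpoints = endpoints          # glob patterns of allowed endpoints
        self.max_rows = max_rows            # parameter-level constraint
        self.max_calls = max_calls_per_min  # frequency cap
        self.calls = deque()                # timestamps of recent allowed calls

    def permits(self, endpoint: str, params: dict, now=None) -> bool:
        now = now if now is not None else time.time()
        while self.calls and now - self.calls[0] > 60:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False  # too many calls in the last minute
        if not any(fnmatch.fnmatch(endpoint, p) for p in self.endpoints):
            return False  # endpoint not on the allowlist
        if params.get("limit", 0) > self.max_rows:
            return False  # parameter exceeds the per-call bound
        self.calls.append(now)
        return True
```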

  6. Hash-chained audit log (including near-misses)

Every action (allowed, blocked, escalated) is:

• logged
• chained
• tamper-evident

Including "almost bad" behavior, not just incidents.

This ended up being more useful than expected for understanding agent behavior.
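The hash-chaining mechanic itself is standard: each entry's hash covers the previous entry's hash, so editing any record breaks every hash after it. A minimal sketch (not our actual log format):

```python
import hashlib
import json

class AuditLog:
    """Tamper-evident append-only log via SHA-256 hash chaining."""

    def __init__(self):
        self.entries = []

    def record(self, event: dict):
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
        self.entries.append({
            "prev": prev,
            "event": event,
            "hash": hashlib.sha256(body.encode()).hexdigest(),
        })

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks it."""
        prev = "0" * 64
        for e in self.entries:
            body = json.dumps({"prev": prev, "event": e["event"]}, sort_keys=True)
            if e["prev"] != prev or e["hash"] != hashlib.sha256(body.encode()).hexdigest():
                return False
            prev = e["hash"]
        return True
```

Blocked and escalated attempts get recorded the same way as allowed ones, which is what makes the near-miss history useful later.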

  7. Policy engine (not hardcoded rules)

All of this runs through a policy layer (think flexible rules vs static checks), so behavior can adapt without rewriting code.

  8. Setup is fast (~10 min)

We tried to avoid the “months of integration” problem.

If it’s not easy to sit in the execution path, nobody will actually use it.

Why we think this matters

The failure mode we keep seeing:

agents don’t fail because of one bad prompt —

they fail because of a series of individually reasonable actions that become risky together

Most tooling doesn’t really account for that.

Would love feedback from people actually building agents

• Have you seen agents drift into risky behavior over time?
• How are you controlling tool usage today (if at all)?
• Does session-level risk make sense, or is that overkill?
• Is "credential starvation" realistic in your setups?

We're just two security guys who built a company, not some McKinsey bros with massive funding. We have our first big design partners starting this month and need all the feedback from the community we can get.

submitted by /u/EbbCommon9300