I'm 19, studying computer engineering in Brazil. A few weeks ago I was testing an AI agent with no restrictions. Just to see what it would do.
It was destructive.
Nothing permanent, I caught it. But it was the kind of moment where you sit back and think: what if I hadn't been watching? What if this was running in production? What if someone else's agent is doing this right now and nobody is watching?
That's when I realized the problem. Everyone is racing to give agents more tools, more autonomy, more access. But nobody is building the layer that controls what they can actually do with it. The assumption is that a good prompt is enough. It isn't.
The gap nobody is talking about
The AI agent space has exploded. Between LangChain, CrewAI, browser-use, and the OpenAI Agents SDK, the tooling for building agents has never been better. You can have an agent browsing the web, writing code, calling APIs, and moving files in an afternoon.
But here's what I couldn't find: a serious answer to "how do I control what my agent can actually do at runtime?"
The common answers I got:
- "Write a good system prompt"
- "Add some input validation"
- "Just don't give it dangerous tools"
These are not answers. These are hopes dressed up as engineering.
A good system prompt doesn't stop an agent from being manipulated through prompt injection. Input validation doesn't catch an agent that decides `rm -rf ./old_stuff` is a reasonable interpretation of "clean up." And "don't give it dangerous tools" directly contradicts the reason you're using agents in the first place.
What actually needs to exist
The thing missing is embarrassingly simple: a policy layer that sits between your agent and the world.
Not prompt engineering. Not vibes. An actual enforcement layer that says:
- This agent can read from `./workspace` but cannot delete anything
- This agent can call the OpenAI API but not your production database
- This command requires a human to approve it before it executes
- Everything gets logged, always
The goal isn't to babysit every action manually; that defeats the purpose of automation. The goal is to define the boundaries once, enforce them automatically, and only surface the genuinely ambiguous decisions to a human.
This is what firewalls do for networks. This is what WAFs do for web apps. Agents need the same thing, and almost nobody is building it.
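To make that concrete, here is a sketch of what such a policy could look like as YAML. The field names and structure are hypothetical, invented for illustration; they are not a real schema:

```yaml
# Hypothetical policy sketch -- keys and rule names are illustrative.
agent: research-agent
rules:
  - tool: filesystem
    allow: [read]
    paths: ["./workspace/**"]
    deny: [delete, write]
  - tool: http
    allow_hosts: ["api.openai.com"]
    deny_hosts: ["db.prod.internal"]
  - tool: shell
    match: ["rm -rf *", "DROP TABLE *"]
    action: require_approval
audit: always
```

The point is the shape of the thing: rules are declared once, per tool, and the enforcement layer applies them to every action without relying on the model's cooperation.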
So I built it
I built AgentGuard, an open source runtime firewall for AI agents.
It's a Go proxy that sits between your agent and its tools. You define policies in YAML, and the proxy enforces them in real time: blocking actions, holding them for approval, and logging everything. It has adapters for LangChain, CrewAI, browser-use, and MCP. There's a dashboard that shows you live what your agents are doing and lets you approve or deny actions with one click.
It's not finished. The SQLite audit backend isn't done. Some adapters are still rough. But the core works, and I think the core is the right idea.
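To show the core decision step a proxy like this performs, here is a minimal, self-contained Go sketch. The types, substring matching, and rule set are deliberate simplifications for illustration, not AgentGuard's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// Decision is what the policy layer returns for each proposed action.
type Decision string

const (
	Allow   Decision = "allow"
	Deny    Decision = "deny"
	Approve Decision = "needs_approval" // hold until a human approves
)

// Action is a tool call the agent wants to make, as seen by the proxy.
type Action struct {
	Tool    string // e.g. "shell", "http", "fs"
	Payload string // command, URL, or path
}

// Policy is a deliberately tiny rule set: payload substrings that
// trigger a deny or an approval hold. A real policy would be
// structured YAML; this is just enough to show the enforcement idea.
type Policy struct {
	DenySubstrings    []string
	ApproveSubstrings []string
}

// Evaluate checks an action against the policy before it is forwarded.
// Denies win over approval holds; anything unmatched is allowed.
func Evaluate(a Action, p Policy) Decision {
	for _, s := range p.DenySubstrings {
		if strings.Contains(a.Payload, s) {
			return Deny
		}
	}
	for _, s := range p.ApproveSubstrings {
		if strings.Contains(a.Payload, s) {
			return Approve
		}
	}
	return Allow
}

func main() {
	p := Policy{
		DenySubstrings:    []string{"rm -rf", "DROP TABLE"},
		ApproveSubstrings: []string{"api.internal"},
	}
	fmt.Println(Evaluate(Action{Tool: "shell", Payload: "rm -rf ./old_stuff"}, p))   // deny
	fmt.Println(Evaluate(Action{Tool: "http", Payload: "https://api.internal/x"}, p)) // needs_approval
	fmt.Println(Evaluate(Action{Tool: "fs", Payload: "read ./workspace/notes.md"}, p)) // allow
}
```

Everything interesting in the real system lives around this function: parsing YAML into structured rules, sitting in the request path as a proxy, and routing `needs_approval` actions to a human instead of just returning a string.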
Caua-ferraz/AgentGuard
AgentGuard is a firewall for AI agents: it keeps your agent's unwanted surprises from going unsupervised.
The firewall for AI agents.
Policy enforcement, real-time oversight, and full audit logging for autonomous AI systems
The Problem
Every trending AI project is giving agents more autonomy — running shell commands, browsing the web, calling APIs, moving money, even performing penetration tests. But nobody is building the guardrails.
Right now, most teams deploying AI agents are just... hoping they behave.
AgentGuard fixes that.
Why AgentGuard
| Without AgentGuard | With AgentGuard |
|---|---|
| Agent runs `rm -rf /` and you find out later | Policy blocks destructive commands before execution |
| Agent calls production API with no oversight | Action paused; you get a Slack/webhook notification to approve |
| No record of what the agent did or why | Full audit trail with timestamps, reasoning, and decisions |
| "It worked on my machine" debugging | Query any agent session from the audit log |
In 5 days it's been cloned by 165 unique developers with almost no active distribution. I think that says something about how real this problem is.
The thing I keep thinking about
Only 14.4% of organizations send AI agents to production with full security approval. 88% reported confirmed or suspected AI agent security incidents last year.
Everyone is moving fast. Nobody is building the guardrails.
I don't know if AgentGuard is the right answer. But I'm pretty confident "hope" isn't.