I'm 19, studying computer engineering in Brazil. A few weeks ago I was testing an AI agent with no restrictions. Just to see what it would do.
It was destructive.
Nothing permanent, I caught it. But it was the kind of moment where you sit back and think: what if I hadn't been watching? What if this was running in production? What if someone else's agent is doing this right now and nobody is watching?
That's when I realized the problem. Everyone is racing to give agents more tools, more autonomy, more access. But nobody is building the layer that controls what they can actually do with it. The assumption is that a good prompt is enough. It isn't.
The gap nobody is talking about
The AI agent space has exploded. Between LangChain, CrewAI, browser-use, and the OpenAI Agents SDK, the tooling for building agents has never been better. You can have an agent browsing the web, writing code, calling APIs, and moving files in an afternoon.
But here's what I couldn't find: a serious answer to "how do I control what my agent can actually do at runtime?"
The common answers I got:
- "Write a good system prompt"
- "Add some input validation"
- "Just don't give it dangerous tools"
These are not answers. These are hopes dressed up as engineering.
A good system prompt doesn't stop an agent from being manipulated through prompt injection. Input validation doesn't catch an agent that decides `rm -rf ./old_stuff` is a reasonable interpretation of "clean up." And "don't give it dangerous tools" directly contradicts the reason you're using agents in the first place.
What actually needs to exist
The thing missing is embarrassingly simple: a policy layer that sits between your agent and the world.
Not prompt engineering. Not vibes. An actual enforcement layer that says:
- This agent can read from `./workspace` but cannot delete anything
- This agent can call the OpenAI API but not your production database
- This command requires a human to approve it before it executes
- Everything gets logged, always
The goal isn't to babysit every action manually; that defeats the purpose of automation. The goal is to define the boundaries once, enforce them automatically, and only surface the genuinely ambiguous decisions to a human.
This is what firewalls do for networks. This is what WAFs do for web apps. Agents need the same thing, and almost nobody is building it.
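To make that concrete, here is a sketch of what such a policy could look like as YAML. The field names and structure are hypothetical, invented for illustration; they are not a real schema:

```yaml
# Hypothetical policy sketch -- keys and rule names are illustrative.
agent: research-agent
rules:
  - tool: filesystem
    allow: [read]
    paths: ["./workspace/**"]
    deny: [delete, write]
  - tool: http
    allow_hosts: ["api.openai.com"]
    deny_hosts: ["db.prod.internal"]
  - tool: shell
    match: ["rm -rf *", "DROP TABLE *"]
    action: require_approval
audit: always
```

The point is the shape of the thing: rules are declared once, per tool, and the enforcement layer applies them to every action without relying on the model's cooperation.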
So I built it
I built AgentGuard, an open source runtime firewall for AI agents.
It's a Go proxy that sits between your agent and its tools. You define policies in YAML, and the proxy enforces them in real time: blocking actions, holding them for approval, and logging everything. It has adapters for LangChain, CrewAI, browser-use, and MCP. There's a dashboard that shows you live what your agents are doing and lets you approve or deny actions with one click.
It's not finished. The SQLite audit backend isn't done. Some adapters are still rough. But the core works, and I think the core is the right idea.
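To show the core decision step a proxy like this performs, here is a minimal, self-contained Go sketch. The types, substring matching, and rule set are deliberate simplifications for illustration, not AgentGuard's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// Decision is what the policy layer returns for each proposed action.
type Decision string

const (
	Allow   Decision = "allow"
	Deny    Decision = "deny"
	Approve Decision = "needs_approval" // hold until a human approves
)

// Action is a tool call the agent wants to make, as seen by the proxy.
type Action struct {
	Tool    string // e.g. "shell", "http", "fs"
	Payload string // command, URL, or path
}

// Policy is a deliberately tiny rule set: payload substrings that
// trigger a deny or an approval hold. A real policy would be
// structured YAML; this is just enough to show the enforcement idea.
type Policy struct {
	DenySubstrings    []string
	ApproveSubstrings []string
}

// Evaluate checks an action against the policy before it is forwarded.
// Denies win over approval holds; anything unmatched is allowed.
func Evaluate(a Action, p Policy) Decision {
	for _, s := range p.DenySubstrings {
		if strings.Contains(a.Payload, s) {
			return Deny
		}
	}
	for _, s := range p.ApproveSubstrings {
		if strings.Contains(a.Payload, s) {
			return Approve
		}
	}
	return Allow
}

func main() {
	p := Policy{
		DenySubstrings:    []string{"rm -rf", "DROP TABLE"},
		ApproveSubstrings: []string{"api.internal"},
	}
	fmt.Println(Evaluate(Action{Tool: "shell", Payload: "rm -rf ./old_stuff"}, p))   // deny
	fmt.Println(Evaluate(Action{Tool: "http", Payload: "https://api.internal/x"}, p)) // needs_approval
	fmt.Println(Evaluate(Action{Tool: "fs", Payload: "read ./workspace/notes.md"}, p)) // allow
}
```

Everything interesting in the real system lives around this function: parsing YAML into structured rules, sitting in the request path as a proxy, and routing `needs_approval` actions to a human instead of just returning a string.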
Caua-ferraz/AgentGuard
AgentGuard is a firewall for AI agents: it keeps your agent's unwanted surprises from going unsupervised.
The firewall for AI agents.
Policy enforcement, real-time oversight, and full audit logging for autonomous AI systems
The Problem
Every trending AI project is giving agents more autonomy — running shell commands, browsing the web, calling APIs, moving money, even performing penetration tests. But nobody is building the guardrails.
Right now, most teams deploying AI agents are just... hoping they behave.
AgentGuard fixes that.
Why AgentGuard
| Without AgentGuard | With AgentGuard |
|---|---|
| Agent runs `rm -rf /` and you find out later | Policy blocks destructive commands before execution |
| Agent calls production API with no oversight | Action paused; you get a Slack/webhook notification to approve |
| No record of what the agent did or why | Full audit trail with timestamps, reasoning, and decisions |
| "It worked on my machine" debugging | Query any agent session from the audit log |
In 5 days it's been cloned by 165 unique developers with almost no active distribution. I think that says something about how real this problem is.
The thing I keep thinking about
Only 14.4% of organizations send AI agents to production with full security approval. 88% reported confirmed or suspected AI agent security incidents last year.
Everyone is moving fast. Nobody is building the guardrails.
I don't know if AgentGuard is the right answer. But I'm pretty confident "hope" isn't.