I have been following this and many other subs around LLMs and agents, and everything from the top posts to the most recent ones is about agents going off and doing something they are not supposed to do, drifting and ignoring the system prompt. Real examples:
I am 100% sure that if you have used agents in prod, this has happened to you (especially as your system prompts get larger and your context gets bigger). You can test this yourself and see it immediately. Prompt-based rules are suggestions, not constraints. Re-prompting fixes one case and breaks two. Post-hoc evals tell you what already went wrong. NeMo Guardrails and Guardrails AI help with content safety but don't cover business logic or your specification.

After tackling this from a few angles, I finally got something solid: a proxy system between your app and your LLM that reads rules from a plain markdown file and enforces them at runtime. It is provider-agnostic, needs only a base URL change, and works with LangGraph, CrewAI, or custom stacks. Without it, the agent offers 90% off and mentions your margin; with it, 15% and no margin talk.

Curious whether this would stop your LLMs from outputting incorrect stuff or your agents from going off track; it definitely did for my (specific) use cases. What's everyone doing for this in prod? Shadow evals? Re-prompt loops? Something I'm missing?
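The core mechanism the post describes, plain-markdown rules checked against model output at runtime, could be sketched roughly as below. This is a speculative reconstruction, not the author's actual tool: the rule-file format (`forbid:` and `max_discount:` bullets) and all function names are invented for illustration.

```python
import re

# Invented rule-file format: one "- forbid: <phrase>" or
# "- max_discount: <percent>" bullet per line of markdown.
RULES_MD = """
- forbid: margin
- forbid: cost basis
- max_discount: 15
"""

def parse_rules(md: str):
    """Extract forbidden phrases and a discount cap from the markdown rules."""
    forbidden, max_discount = [], None
    for line in md.splitlines():
        m = re.match(r"-\s*forbid:\s*(.+)", line.strip())
        if m:
            forbidden.append(m.group(1).lower())
            continue
        m = re.match(r"-\s*max_discount:\s*(\d+)", line.strip())
        if m:
            max_discount = int(m.group(1))
    return forbidden, max_discount

def check(text: str, forbidden, max_discount):
    """Return a list of rule violations found in a model response."""
    lowered = text.lower()
    violations = [f"forbidden phrase: {p}" for p in forbidden if p in lowered]
    # Flag any "<N>% off" offer that exceeds the configured cap.
    for pct in re.findall(r"(\d+)\s*%\s*off", lowered):
        if max_discount is not None and int(pct) > max_discount:
            violations.append(f"discount {pct}% exceeds cap {max_discount}%")
    return violations
```

With the example rules above, a response like "I can offer 90% off, our margin is huge" would trip both the forbidden-phrase and discount-cap checks, while "15% off today" would pass clean; a proxy could then block or rewrite the violating response instead of returning it to the user.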
ALL Agents deviate, fail and mess up because no enforcement is done at runtime. A method to fix it.
Reddit r/artificial / 4/26/2026
💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage
Key Points
- The post argues that LLM agents frequently drift or violate “system prompt” rules because prompt-based constraints are treated as suggestions rather than enforced guarantees.
- It highlights concrete failure scenarios (e.g., attempting forbidden actions like deleting data, leaking internal pricing/cost basis, skipping identity verification, and dropping earlier rules when more are added).
- The author proposes a provider-agnostic proxy layer between the application and the LLM that reads rules from a simple Markdown file and enforces them at runtime.
- The method is claimed to work with multiple agent frameworks (such as LangGraph/CrewAI/custom) using only a base URL change, reducing the need for unreliable re-prompting or post-hoc evaluation.
- Readers are invited to share what they do in production today (e.g., shadow evals or reprompt loops) to address agent noncompliance.
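The "one base URL change" integration the key points describe could look roughly like the skeleton below: a local stdlib HTTP proxy that an agent framework would be pointed at instead of the provider, with a rule check applied to each response. The host, the stand-in reply, and the `enforce` rule are all assumptions for illustration; actual forwarding to the upstream provider is elided.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM = "https://api.openai.com"  # assumed provider endpoint

def enforce(text: str) -> str:
    """Placeholder runtime rule: never let margin talk reach the caller."""
    if "margin" in text.lower():
        return "[response blocked by runtime rule: no margin talk]"
    return text

class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the chat request the agent framework sent to this proxy.
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # A real proxy would forward `body` to UPSTREAM (e.g. with
        # urllib.request) and run `enforce` on the model's reply.
        # Forwarding is elided here; a stand-in reply is used instead.
        reply = enforce("Sure, our margin on this item is 40%.")
        payload = json.dumps({"choices": [{"message": {"content": reply}}]})
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload.encode())

# To serve: HTTPServer(("localhost", 8080), ProxyHandler).serve_forever()
# then point the agent framework's base URL at http://localhost:8080.
```

The design point is that enforcement lives outside the model and the framework: neither LangGraph, CrewAI, nor a custom stack needs to know the rules exist, since they only see the already-filtered response.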
Related Articles
- Black Hat USA (AI Business)
- Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption. (Dev.to)
- How I tracked which AI bots actually crawl my site (Dev.to)
- Hijacking OpenClaw with Claude (Dev.to)
- How I Replaced WordPress, Shopify, and Mailchimp with Cloudflare Workers (Dev.to)