Arc Gate: LLM proxy that hits P=1.00 R=1.00 F1=1.00 on indirect/roleplay prompt injection (beats OpenAI Moderation and LlamaGuard)

Reddit r/artificial / 4/29/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage · Industry & Market Moves

Key Points

  • Arc Gate is an LLM proxy that detects indirect prompt injection and roleplay-style framings with benchmark results of P=1.00, R=1.00, and F1=1.00 on 40 out-of-distribution prompts.
  • In the reported tests, Arc Gate outperformed OpenAI Moderation (P=1.00, R=0.75, F1=0.86) and LlamaGuard 3 8B (P=1.00, R=0.55, F1=0.71) by achieving both zero false positives and zero misses.
  • The system claims to block malicious prompts before they reach your model, adding roughly 350ms of detection overhead on top of upstream latency, with an average block time of 329ms.
  • Arc Gate can be placed in front of any OpenAI-compatible endpoint, requires no GPU on the user’s side, and is configured via a single environment variable, with a GitHub repo and live dashboard provided.
  • The project positions itself as a practical, low-infrastructure mitigation layer for prompt-injection risks across existing LLM applications.

Benchmarked on 40 out-of-distribution prompts: indirect requests, roleplay framings, hypothetical scenarios, and technical phrasings. The stuff that slips past everything else.

Arc Gate: P=1.00, R=1.00, F1=1.00

OpenAI Moderation API: P=1.00, R=0.75, F1=0.86

LlamaGuard 3 8B: P=1.00, R=0.55, F1=0.71

Zero false positives. Zero misses. Blocked prompts average 329ms and never reach your model. Detection overhead is ~350ms on top of your normal upstream latency.
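For readers double-checking the numbers: F1 is just the harmonic mean of precision and recall, so the reported F1 values follow directly from the (P, R) pairs above. A minimal sketch:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (P, R) pairs as reported in the benchmark
scores = {
    "Arc Gate": (1.00, 1.00),
    "OpenAI Moderation": (1.00, 0.75),
    "LlamaGuard 3 8B": (1.00, 0.55),
}

for name, (p, r) in scores.items():
    print(f"{name}: F1 = {f1(p, r):.2f}")
# Arc Gate: F1 = 1.00
# OpenAI Moderation: F1 = 0.86
# LlamaGuard 3 8B: F1 = 0.71
```

Since all three systems report P=1.00 (no false positives), the F1 differences here are driven entirely by recall, i.e. how many malicious prompts each system missed.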

Sits in front of any OpenAI-compatible endpoint. No GPU on your side. One env var to configure.
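Since the post doesn't spell out the configuration, here is a sketch of what "one env var" typically means for an OpenAI-compatible proxy: the official OpenAI SDKs read `OPENAI_BASE_URL`, so pointing it at the proxy host reroutes all traffic without touching client code. The hostname below is a placeholder, and the exact variable Arc Gate expects may differ; check the repo's README.

```python
import os

# Hypothetical setup: route all OpenAI SDK traffic through the Arc Gate proxy.
# The official OpenAI Python/Node SDKs read OPENAI_BASE_URL at client creation;
# "your-arc-gate-host" is a placeholder, not a real endpoint from the post.
os.environ["OPENAI_BASE_URL"] = "https://your-arc-gate-host/v1"

# Client code stays unchanged -- e.g. with the openai package:
#   from openai import OpenAI
#   client = OpenAI()  # picks up OPENAI_BASE_URL from the environment
# The proxy screens each prompt and forwards clean requests upstream.
print(os.environ["OPENAI_BASE_URL"])
```

The appeal of this pattern is that detection happens at the network layer: existing applications keep their client libraries and API keys, and only the base URL changes.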

GitHub: https://github.com/9hannahnine-jpg/arc-gate

Live dashboard: https://web-production-6e47f.up.railway.app/dashboard

Happy to answer questions.

submitted by /u/Turbulent-Tap6723