Built a prompt injection proxy that beats OpenAI Moderation and LlamaGuard — see it block attacks live

Reddit r/artificial / 4/30/2026

💬 OpinionDeveloper Stack & InfrastructureSignals & Early TrendsTools & Practical UsageModels & Research

Key Points

  • Arc Gate is a proxy layer that sits in front of any OpenAI-compatible endpoint to block prompt-injection attempts before they reach the model.
  • The system uses a multi-layer detection approach, including a behavioral SVM built on sentence-transformer embeddings to catch semantic intent beyond simple phrase pattern matching.
  • In benchmarking on 40 hard out-of-distribution prompts, Arc Gate achieved higher recall and F1 scores than OpenAI Moderation and LlamaGuard 3 8B.
  • The project reports zero false positives on benign prompts (including security discussions and safe roleplay) with an average block latency of 329ms, and provides an integration option via a single URL change.
  • Users can try the service instantly via a public URL and integrate it into their own projects using the provided base_url parameter, with the code hosted on GitHub.

Built Arc Gate — sits in front of any OpenAI-compatible endpoint and blocks prompt injection before it reaches your model.

Try it here — no signup, no code, no setup:

https://web-production-6e47f.up.railway.app/try

Type any prompt and see if it gets blocked or passes. The examples on the page show the difference.

The main detection layer is a behavioral SVM on sentence-transformer embeddings — catches semantic intent, not just pattern matches. Phrase matching is just the fast first pass. Four layers total.

Benchmarked on 40 OOD prompts (indirect, roleplay, hypothetical framings — the hard stuff):

• Arc Gate: Recall 0.90, F1 0.947 • OpenAI Moderation: Recall 0.75, F1 0.86 • LlamaGuard 3 8B: Recall 0.55, F1 0.71 

Zero false positives on benign prompts including security discussions and safe roleplay. Block latency 329ms.

One URL change to integrate into your own project:

base_url=“https://web-production-6e47f.up.railway.app/v1”

GitHub: github.com/9hannahnine-jpg/arc-gate — star if useful.

submitted by /u/Turbulent-Tap6723
[link] [comments]