A Low-Latency Fraud Detection Layer for Detecting Adversarial Interaction Patterns in LLM-Powered Agents

arXiv cs.AI / 5/5/2026

💬 OpinionDeveloper Stack & InfrastructureIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper introduces a low-latency fraud detection layer to identify adversarial interaction patterns that can manipulate LLM-powered agents over multi-turn sessions, beyond single-prompt filtering.
Instead of classifying individual prompts as malicious, the method models risk across interaction trajectories using structured runtime features from prompt traits, session dynamics, tool usage, execution context, and fraud-inspired signals.
The authors propose implementing the detector with lightweight models for real-time deployment, aiming to complement (not replace) existing prompt-level defenses and rule-based guardrails.
Evaluation uses a synthetic dataset of 12,000 multi-turn agent interactions and a 42-feature setup with an XGBoost classifier, achieving over 9× faster detection than LLM-based detectors.
The study concludes that interaction-level (trajectory) behavioral detection should be a core component of deployment-time security for LLM agents.

Abstract

Large Language Model (LLM)-powered agents demonstrate strong capabilities in autonomous task execution, tool use, and multi-step reasoning. However, their increasing autonomy also introduces a new attack surface: adversarial interactions can manipulate agent behavior through direct prompt injection, indirect content attacks, and multi-turn escalation strategies. Existing defense strategies focus on prompt-level filtering and rule-based guardrails, which are often insufficient when risk emerges gradually across interaction sequences. In this work, we propose a complementary defense mechanism: a low-latency fraud detection layer for detecting adversarial interaction patterns in LLM-powered agents. Instead of determining whether a single prompt is malicious, our approach models risk over interaction trajectories using structured runtime features derived from prompt characteristics, session dynamics, tool usage, execution context, and fraud-inspired signals. The detection layer can be implemented using lightweight models leading to low-latency real-time deployments. To evaluate the framework, we construct a synthetic corpus of 12,000 multi-turn agent interactions generated from parameterized templates that simulate realistic agentic workflows. Using 42 structured features and an XGBoost classifier, our detector achieves over 9 times faster than LLM-based detectors. Through the experiment and ablation studies, our work suggests that interaction-level behavioral detection should become a core component of deployment-time defense for LLM-powered agents.

When Claims Freeze Because a Provider Record Drifted: The Case for Enrollment Repair Agents

Dev.to

The Cash Is Already Earned: Why Construction Pay Application Exceptions Fit an Agent Better Than SaaS

Dev.to

Why Ship-and-Debit Claim Recovery Is a Better Agent Wedge Than Another “AI Back Office” Tool

Dev.to

AI is getting better at doing things, but still bad at deciding what to do?

Reddit r/artificial

I Built an AI-Powered Chinese BaZi (八字) Fortune Teller — Here's What DeepSeek Revealed About Destiny

Dev.to

A Low-Latency Fraud Detection Layer for Detecting Adversarial Interaction Patterns in LLM-Powered Agents

Key Points

Abstract

Related Articles

When Claims Freeze Because a Provider Record Drifted: The Case for Enrollment Repair Agents

The Cash Is Already Earned: Why Construction Pay Application Exceptions Fit an Agent Better Than SaaS

Why Ship-and-Debit Claim Recovery Is a Better Agent Wedge Than Another “AI Back Office” Tool

AI is getting better at doing things, but still bad at deciding what to do?

I Built an AI-Powered Chinese BaZi (八字) Fortune Teller — Here's What DeepSeek Revealed About Destiny

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer