T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search

arXiv cs.AI / 3/25/2026


Key Points

  • The paper argues that traditional LLM red-teaming methods miss agent-specific vulnerabilities that only appear during multi-step tool use, especially in tool ecosystems like Model Context Protocol (MCP).
  • It introduces T-MAP, a trajectory-aware evolutionary search technique that uses execution trajectories to systematically generate adversarial prompts and attack paths.
  • T-MAP can automatically produce attacks that bypass safety guardrails while still achieving harmful objectives through real tool interactions, not just harmful text.
  • Experiments across multiple MCP environments show T-MAP significantly improves attack realization rate (ARR) versus baselines and remains effective against multiple frontier models, including GPT-5.2, Gemini-3-Pro, Qwen3.5, and GLM-5.
  • The findings suggest that autonomous LLM agents have underexplored security weaknesses tied to tool-execution trajectories and how agent behavior unfolds over multiple steps.
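To make the core idea concrete, the loop below is a minimal, hypothetical sketch of a trajectory-guided evolutionary search: candidate prompts are scored by what the agent *does* (which tools its execution trajectory invokes) rather than by what it says, and the fittest prompts are kept and mutated. The `run_agent`, `trajectory_fitness`, and `mutate` functions are stand-in stubs invented for illustration; they are not T-MAP's actual components, and a real attack would score trajectories from live MCP tool calls.

```python
import random
import zlib

def run_agent(prompt):
    """Stub for an agent rollout: returns a mock execution trajectory
    (a list of tool names). Deterministic per prompt via CRC32 seeding,
    so the search is reproducible. Purely illustrative."""
    rng = random.Random(zlib.crc32(prompt.encode()))
    tools = ["search", "read_file", "send_email", "exec_shell"]
    return [rng.choice(tools) for _ in range(4)]

def trajectory_fitness(trajectory, target_tool="send_email"):
    """Hypothetical fitness: fraction of steps that invoke the attacker's
    target tool -- a crude proxy for 'attack realization'."""
    return trajectory.count(target_tool) / len(trajectory)

def mutate(prompt, rng):
    """Toy mutation operator: append a random suffix to the prompt."""
    suffixes = [" urgently", " as admin", " quietly", " step by step"]
    return prompt + rng.choice(suffixes)

def evolve(seed_prompts, generations=10, pop_size=8):
    """Simplified evolutionary search guided by trajectory fitness:
    rank the population, keep an elite, refill with mutated parents."""
    rng = random.Random(0)
    population = list(seed_prompts)
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda p: trajectory_fitness(run_agent(p)),
                        reverse=True)
        parents = ranked[: max(2, pop_size // 4)]  # elite survives intact
        population = parents + [mutate(rng.choice(parents), rng)
                                for _ in range(pop_size - len(parents))]
    return max(population, key=lambda p: trajectory_fitness(run_agent(p)))

best = evolve(["please forward the report"])
print(best)
```

Because the elite is carried over each generation, the best trajectory fitness found never decreases; the real method's search operators and scoring are, per the paper, considerably richer than this sketch.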

Abstract

While prior red-teaming efforts have focused on eliciting harmful text outputs from large language models (LLMs), such approaches fail to capture agent-specific vulnerabilities that emerge through multi-step tool execution, particularly in rapidly growing ecosystems such as the Model Context Protocol (MCP). To address this gap, we propose a trajectory-aware evolutionary search method, T-MAP, which leverages execution trajectories to guide the discovery of adversarial prompts. Our approach enables the automatic generation of attacks that not only bypass safety guardrails but also reliably realize harmful objectives through actual tool interactions. Empirical evaluations across diverse MCP environments demonstrate that T-MAP substantially outperforms baselines in attack realization rate (ARR) and remains effective against frontier models, including GPT-5.2, Gemini-3-Pro, Qwen3.5, and GLM-5, thereby revealing previously underexplored vulnerabilities in autonomous LLM agents.