Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios
arXiv cs.AI / 3/13/2026
Key Points
- The paper evaluates the autonomous cyber-attack capabilities of frontier AI models on two purpose-built cyber ranges, a 32-step corporate network attack and a 7-step industrial control system attack, both requiring long chains of multi-step capabilities.
- It documents two trends. First, performance scales log-linearly with inference-time compute: scaling from 10M to 100M tokens yields gains of up to 59%, with no plateau observed and no operator-specific sophistication required.
- Second, newer model generations outperform their predecessors at fixed token budgets: corporate-range performance rose from 1.7 steps (GPT-4o, Aug 2024) to 9.8 steps (Opus 4.6, Feb 2026), with the best single run completing 22 of 32 steps in roughly 6 of the allotted 14 hours.
- On the industrial control system range, progress is more limited, but the latest models reliably complete some steps, averaging 1.2 to 1.4 of 7 steps, with a maximum of 3.
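The log-linear scaling trend above can be sketched as a simple fit of steps completed against the logarithm of token budget. The coefficients below are purely illustrative, not taken from the paper:

```python
import math

def steps_completed(tokens: float, a: float = 2.5, b: float = -15.0) -> float:
    """Hypothetical log-linear scaling curve: steps = a * log10(tokens) + b.

    The slope `a` and intercept `b` are made-up illustrative values;
    the paper reports the trend, not this specific parameterization.
    """
    return a * math.log10(tokens) + b

# Under a log-linear fit, each 10x increase in token budget adds a
# constant number of steps (here, `a` steps per decade of compute):
gain = steps_completed(100_000_000) - steps_completed(10_000_000)
print(gain)
```

This is why the paper's observation of "no plateau" matters: on a log-linear curve, every further 10x of inference compute buys the same absolute improvement, so capability keeps climbing as budgets grow.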