Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios

arXiv cs.AI / 3/13/2026


Key Points

The paper evaluates the autonomous cyber-attack capabilities of frontier AI models on two purpose-built cyber ranges: a 32-step corporate network attack and a 7-step industrial control system attack. Both require chaining heterogeneous capabilities across long action sequences.
It documents two trends. First, performance scales log-linearly with inference-time compute: going from 10M to 100M tokens yields gains of up to 59%, with no observed plateau and no specific technical sophistication required from the operator.
Second, newer model generations outperform their predecessors at fixed token budgets: on the corporate range, average steps completed at 10M tokens rose from 1.7 (GPT-4o, Aug 2024) to 9.8 (Opus 4.6, Feb 2026). The best single run completed 22 of 32 steps, corresponding to roughly 6 of the estimated 14 hours a human expert would need.
On the industrial control system range, performance remains limited, but the most recent models are the first to reliably complete steps, averaging 1.2-1.4 of 7 steps (max 3).
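
To put the figures above on a common scale, the following minimal sketch converts them into ratios. Every input is a number quoted in the key points; the variable names are illustrative and nothing here is additional data from the paper.

```python
# Figures quoted in the key points above; variable names are illustrative only.
CORPORATE_TOTAL_STEPS = 32
ICS_TOTAL_STEPS = 7

gpt4o_avg_steps = 1.7    # GPT-4o, Aug 2024, 10M tokens
opus46_avg_steps = 9.8   # Opus 4.6, Feb 2026, 10M tokens

# How many times more steps the newest model completes on average.
generation_ratio = opus46_avg_steps / gpt4o_avg_steps

# Best single run: share of steps completed vs. share of estimated human time.
best_run_step_share = 22 / CORPORATE_TOTAL_STEPS
best_run_hour_share = 6 / 14

# Industrial control system range: average share of steps completed.
ics_share_low = 1.2 / ICS_TOTAL_STEPS
ics_share_high = 1.4 / ICS_TOTAL_STEPS

print(f"Generation-over-generation ratio: {generation_ratio:.2f}x")
print(f"Best run: {best_run_step_share:.1%} of steps, "
      f"{best_run_hour_share:.1%} of estimated expert hours")
print(f"ICS range: {ics_share_low:.1%} to {ics_share_high:.1%} of steps")
```

The best run covers about 69% of the steps but only about 43% of the estimated expert time, which suggests the remaining 10 steps account for most of the estimated human effort (roughly 8 of 14 hours).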

Abstract

We evaluate the autonomous cyber-attack capabilities of frontier AI models on two purpose-built cyber ranges (a 32-step corporate network attack and a 7-step industrial control system attack) that require chaining heterogeneous capabilities across extended action sequences. By comparing seven models released over an eighteen-month period (August 2024 to February 2026) at varying inference-time compute budgets, we observe two capability trends. First, model performance scales log-linearly with inference-time compute, with no observed plateau: increasing from 10M to 100M tokens yields gains of up to 59%, requiring no specific technical sophistication from the operator. Second, each successive model generation outperforms its predecessor at fixed token budgets: on the corporate network range, average steps completed at 10M tokens rose from 1.7 (GPT-4o, August 2024) to 9.8 (Opus 4.6, February 2026). The best single run completed 22 of 32 steps, corresponding to roughly 6 of the estimated 14 hours a human expert would need. On the industrial control system range, performance remains limited, though the most recent models are the first to reliably complete steps, averaging 1.2-1.4 of 7 (max 3).
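
As a rough illustration of what "scales log-linearly" means, the sketch below assumes a model of the form steps = a + b * log10(tokens), so each tenfold increase in tokens adds a constant b steps. The baseline of 10.0 steps is a hypothetical round number, not a result from the paper, and the abstract does not say which model the "up to 59%" gain applies to. The function names are likewise illustrative.

```python
import math


def log_linear_slope(tokens_low, steps_low, tokens_high, steps_high):
    """Steps gained per tenfold increase in tokens, assuming
    steps = a + b * log10(tokens)."""
    return (steps_high - steps_low) / (
        math.log10(tokens_high) - math.log10(tokens_low)
    )


def predict_steps(tokens, tokens_ref, steps_ref, slope):
    """Predicted steps at `tokens`, anchored at a reference point."""
    return steps_ref + slope * math.log10(tokens / tokens_ref)


# HYPOTHETICAL baseline chosen for round numbers; not a figure from the paper.
baseline_steps = 10.0
# "Up to 59%" is quoted in the abstract; applying it to this baseline is an assumption.
steps_at_100m = baseline_steps * 1.59

slope = log_linear_slope(10e6, baseline_steps, 100e6, steps_at_100m)

# Halfway between 10M and 100M on a log scale is 10M * sqrt(10), about 31.6M tokens.
midpoint_tokens = 10e6 * math.sqrt(10)
midpoint_steps = predict_steps(midpoint_tokens, 10e6, baseline_steps, slope)

print(f"Slope: {slope:.2f} steps per tenfold increase in tokens")
print(f"Predicted steps at {midpoint_tokens / 1e6:.1f}M tokens: {midpoint_steps:.2f}")
```

Under this assumed model, the first 21.6M extra tokens buy as many additional steps as the next 68.4M, which is why a log-linear trend with no plateau still implies steeply rising cost per extra step.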

