Steerable Adversarial Scenario Generation through Test-Time Preference Alignment
arXiv cs.RO / 5/6/2026
Key Points
- The paper reframes adversarial scenario generation for autonomous driving safety as a multi-objective preference alignment problem, addressing the limitation of existing methods that rely on a single fixed trade-off between adversariality and realism.
- It introduces SAGE (Steerable Adversarial scenario GEnerator), which allows fine-grained control of the adversariality–realism balance at test time without any retraining.
- SAGE uses hierarchical group-based offline preference optimization to learn balanced behavior, separating hard feasibility constraints from soft preferences to improve data efficiency (see the first sketch after this list).
- Rather than committing to a single fixed model, SAGE fine-tunes two expert models with opposing preferences and spans a continuous range of policies at inference time via linear interpolation of their weights, as sketched below.
- Experiments, together with a theoretical analysis based on linear mode connectivity, show that SAGE generates better-balanced scenarios and supports more effective closed-loop training of driving policies.
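The paper's hierarchical group-based preference optimization is not spelled out in this summary, so the following is only a minimal, hypothetical sketch of the general idea: hard feasibility constraints act as a gate on the data before soft preferences rank what remains. Every name here (`build_preference_pairs`, `is_feasible`, `preference_score`) is an illustrative assumption, not the authors' API.

```python
def build_preference_pairs(scenarios, is_feasible, preference_score):
    """Hypothetical sketch: hard constraints gate the data first,
    then soft preferences rank only the feasible scenarios.

    `is_feasible` encodes hard constraints (e.g. physical plausibility);
    `preference_score` encodes a soft objective such as adversariality.
    """
    # Hard feasibility is a filter, not a score: infeasible scenarios
    # never enter preference learning, which saves comparisons and
    # keeps the learned trade-off confined to valid behavior.
    feasible = [s for s in scenarios if is_feasible(s)]
    ranked = sorted(feasible, key=preference_score, reverse=True)
    # Pair the highest- and lowest-scoring feasible scenarios as
    # (chosen, rejected) examples for offline preference optimization.
    return [
        (ranked[i], ranked[-(i + 1)])
        for i in range(len(ranked) // 2)
    ]
```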
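The test-time steering step itself is simple weight arithmetic. Below is a minimal PyTorch-style sketch, assuming both experts share one architecture so their state dicts have identical keys; the function name and the `alpha` convention are assumptions for illustration, not the paper's code.

```python
def interpolate_experts(sd_real, sd_adv, alpha):
    """Blend two expert policies' weights at test time.

    alpha = 0.0 recovers the realism-oriented expert,
    alpha = 1.0 recovers the adversariality-oriented expert;
    intermediate values trace a continuous family of policies.
    Assumes all parameters are float tensors (non-float buffers
    would need to be copied from one expert in practice).
    """
    assert sd_real.keys() == sd_adv.keys(), "experts must share an architecture"
    return {
        name: (1.0 - alpha) * sd_real[name] + alpha * sd_adv[name]
        for name in sd_real
    }

# Usage: steer the adversariality-realism balance without retraining,
# e.g. policy.load_state_dict(interpolate_experts(sd_real, sd_adv, alpha=0.7))
```

Linear mode connectivity is what makes this plausible: if the two fine-tuned experts lie in a connected low-loss region, every interpolated point along the line between them is itself a usable policy.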