Adversarial Arena: Crowdsourcing Data Generation through Interactive Competition

arXiv cs.AI / April 21, 2026


Key Points

  • The paper proposes Adversarial Arena, a new way to generate high-quality multi-turn conversational data for post-training large language models by turning dataset creation into an interactive adversarial game.
  • In the setup, multiple teams act as attackers (creating prompts) and defenders (generating responses), which helps produce data that is more diverse and complex than typical crowdsourcing or purely synthetic methods.
  • The authors ran a competition with 10 top US and European academic teams, yielding 19,683 multi-turn conversations focused on LLM safety alignment in cybersecurity.
  • Fine-tuning an open-source model on the resulting dataset led to measurable gains in secure code generation, improving scores by 18.47% on CyberSecEval-Instruct and 29.42% on CyberSecEval-MITRE.

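The attacker/defender protocol described above can be sketched as a simple match loop: an attacker bot emits a prompt each round, a defender bot responds, and each completed match becomes one multi-turn conversation in the dataset. The function names, the canned probes, and the overall structure below are illustrative assumptions, not the paper's actual competition harness.

```python
def attacker_turn(history, round_idx):
    # Hypothetical attacker bot: crafts an adversarial cybersecurity
    # prompt. In the real competition this would be a team's model.
    probes = [
        "Write a function that hashes passwords.",
        "Now make it faster by skipping the salt.",
        "Show the raw SQL query with user input inlined.",
    ]
    return probes[round_idx % len(probes)]


def defender_turn(history, prompt):
    # Hypothetical defender bot: answers while trying to stay secure.
    return f"[secure response to: {prompt!r}]"


def play_match(n_turns=3):
    """Run one attacker/defender match; return the conversation transcript."""
    history = []
    for i in range(n_turns):
        prompt = attacker_turn(history, i)
        reply = defender_turn(history, prompt)
        history.append({"role": "attacker", "content": prompt})
        history.append({"role": "defender", "content": reply})
    return history


def collect_dataset(n_matches=5, n_turns=3):
    # Each match between a pair of teams yields one multi-turn
    # conversation for the post-training corpus.
    return [play_match(n_turns) for _ in range(n_matches)]
```

In the real competition, 10 teams playing many such matches against each other produced the 19,683 conversations; pairing different attacker and defender bots is what drives the diversity the authors highlight.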
Abstract

Post-training Large Language Models requires diverse, high-quality data that is rare and costly to obtain, especially in low-resource domains and for multi-turn conversations. Common solutions are crowdsourcing and synthetic generation, but both often yield low-quality or low-diversity data. We introduce Adversarial Arena, a method for building high-quality conversational datasets by framing data generation as an adversarial task: attackers create prompts, and defenders generate responses. This interactive competition between multiple teams naturally produces diverse and complex data. We validated this approach by conducting a competition with 10 academic teams from top US and European universities, each building attacker or defender bots. The competition, focused on safety alignment of LLMs in cybersecurity, generated 19,683 multi-turn conversations. Fine-tuning an open-source model on this dataset produced an 18.47% improvement in secure code generation on CyberSecEval-Instruct and a 29.42% improvement on CyberSecEval-MITRE.