Multi-Stream Perturbation Attack: Breaking Safety Alignment of Thinking LLMs Through Concurrent Task Interference
arXiv cs.AI / 3/12/2026
Key Points
- The paper identifies vulnerabilities in the thinking mode of LLMs when they process multiple interleaved tasks, exposing new safety risks.
- It introduces the multi-stream perturbation attack, which interleaves multiple task streams within a single prompt to induce interference, together with three perturbation strategies: multi-stream interleaving, inversion perturbation, and shape transformation.
- Experiments on JailbreakBench, AdvBench, and HarmBench show high attack success rates across models including the Qwen3 series, DeepSeek, Qwen3-Max, and Gemini 2.5 Flash, with thinking-collapse rates up to 17% and response-repetition rates up to 60%.
- The results indicate that thinking-mode safety mechanisms can be bypassed and that concurrent task interference degrades model reasoning, underscoring safety implications for current and future LLM deployments.
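To make the core idea concrete, the sketch below shows one plausible way to weave several task streams into a single prompt, as the multi-stream interleaving strategy describes. The function name, stream tags, and word-chunk segmentation are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical illustration of multi-stream interleaving: each task stream is
# split into small word chunks, and chunks are woven together round-robin so
# the model must track several concurrent tasks within one prompt.
# (Segmentation scheme and "[S<n>]" tags are assumptions for this sketch.)

def interleave_streams(streams, chunk_size=3):
    """Split each stream into chunks of `chunk_size` words, interleave round-robin."""
    chunked = [
        [" ".join(words[i:i + chunk_size])
         for i in range(0, len(words), chunk_size)]
        for words in (s.split() for s in streams)
    ]
    segments = []
    for i in range(max(len(c) for c in chunked)):
        for tag, chunks in enumerate(chunked, start=1):
            if i < len(chunks):
                # Tag each fragment with its stream ID so the tasks stay
                # nominally separable while still interfering in context.
                segments.append(f"[S{tag}] {chunks[i]}")
    return "\n".join(segments)

prompt = interleave_streams([
    "Summarize the plot of a short story about a lighthouse keeper",
    "Translate the phrase good morning into French and Spanish",
])
print(prompt)
```

A benign pair of tasks is used here; the paper's point is that this kind of concurrent-task structure can interfere with the model's thinking process itself.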