AttentionBender: Manipulating Cross-Attention in Video Diffusion Transformers as a Creative Probe

arXiv cs.CV / 4/24/2026


Key Points

  • The paper introduces AttentionBender, a tool that manipulates cross-attention in Video Diffusion Transformers to let artists explore how black-box video generation actually works.
  • Because prompt-only control is limited, the authors take an autobiographical "research-through-design" approach, building on Network Bending to apply 2D transforms to cross-attention maps and thereby modulate what the model generates.
  • Experiments visualize 4,500+ video generations while varying prompts, attention-map operations, and target layers to evaluate controllability.
  • The findings indicate that cross-attention is strongly entangled: targeted edits rarely stay localized, producing distributed distortions and glitch-like aesthetics rather than clean, direct changes.
  • AttentionBender is positioned both as an Explainable AI-style probe of transformer attention mechanisms and as a creative method to generate aesthetics outside the model’s default learned representational space.

Abstract

We present AttentionBender, a tool that manipulates cross-attention in Video Diffusion Transformers to help artists probe the internal mechanics of black-box video generation. While generative outputs are increasingly realistic, prompt-only control limits artists' ability to build intuition for the model's material process or to work beyond its default tendencies. Using an autobiographical research-through-design approach, we built on Network Bending to design AttentionBender, which applies 2D transforms (rotation, scaling, translation, etc.) to cross-attention maps to modulate generation. We assess AttentionBender by visualizing 4,500+ video generations across prompts, operations, and layer targets. Our results suggest that cross-attention is highly entangled: targeted manipulations often resist clean, localized control, producing distributed distortions and glitch aesthetics rather than linear edits. AttentionBender contributes a tool that functions both as an Explainable AI-style probe of transformer attention mechanisms, and as a creative technique for producing novel aesthetics beyond the model's learned representational space.
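The core operation described above — applying a 2D transform to a cross-attention map and renormalizing before the model consumes it — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names (`rotate_map`, `bend_attention`), the `(n_tokens, H, W)` layout, and the nearest-neighbour resampling are all assumptions made here for illustration; a real Video Diffusion Transformer would apply this inside an attention hook during denoising.

```python
import numpy as np

def rotate_map(m, degrees):
    """Nearest-neighbour rotation of a 2D map about its centre
    (illustrative stand-in for the paper's 2D transforms)."""
    h, w = m.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    th = np.deg2rad(degrees)
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find the source pixel.
    sy = cy + np.cos(th) * (ys - cy) + np.sin(th) * (xs - cx)
    sx = cx - np.sin(th) * (ys - cy) + np.cos(th) * (xs - cx)
    sy = np.clip(np.rint(sy).astype(int), 0, h - 1)
    sx = np.clip(np.rint(sx).astype(int), 0, w - 1)
    return m[sy, sx]

def bend_attention(attn, degrees=30.0):
    """attn: (n_tokens, H, W) cross-attention weights, one spatial
    map per text token. Rotate each token's map, then renormalize
    so the token weights at every location still sum to 1."""
    bent = np.stack([rotate_map(attn[t], degrees)
                     for t in range(attn.shape[0])])
    bent = np.clip(bent, 1e-8, None)  # avoid division by zero
    return bent / bent.sum(axis=0, keepdims=True)
```

Translation and scaling fit the same pattern with a different coordinate mapping. The renormalization step matters: the transformed maps must remain a valid attention distribution, and it is plausibly one reason edits propagate globally rather than staying localized, as the paper's entanglement findings describe.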