Test-Time Attention Purification for Backdoored Large Vision Language Models
arXiv cs.CV / 3/16/2026
📰 News · Ideas & Deep Analysis · Models & Research
Key Points
- The paper analyzes backdoor attacks on large vision-language models and finds that triggers influence predictions by redistributing cross-modal attention, a phenomenon the authors call "attention stealing".
- It introduces CleanSight, a training-free, plug-and-play defense that operates at test time: it detects poisoned inputs via the relative visual-text attention ratio in cross-modal fusion layers, then purifies them by pruning high-attention visual tokens (see the sketch after this list).
- CleanSight preserves model utility on both clean and poisoned data while outperforming existing pixel-based purification defenses.
- The work provides extensive experiments across diverse datasets and backdoor attack types, demonstrating the method’s robustness and practical effectiveness.
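
The summary above describes a two-step detect-then-prune pipeline, so a minimal sketch may help make it concrete. The paper's exact statistic, thresholds, and layer choices are not reproduced here; `ratio_threshold`, `prune_k`, the single-layer attention layout, and all function names below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a test-time attention-purification defense, assuming one
# cross-modal fusion layer's attention map with visual tokens occupying the
# first `num_visual` key/query positions. All constants are illustrative.
import torch

def visual_text_attention_ratio(attn: torch.Tensor, num_visual: int) -> float:
    """attn: (heads, seq_len, seq_len) attention weights. Returns the average
    attention mass that text queries place on visual tokens relative to the
    mass they place on text tokens. A trigger that 'steals' attention should
    inflate this ratio."""
    text_queries = attn[:, num_visual:, :]                      # text rows over all keys
    to_visual = text_queries[:, :, :num_visual].sum(-1).mean()  # mass onto visual keys
    to_text = text_queries[:, :, num_visual:].sum(-1).mean()    # mass onto text keys
    return (to_visual / (to_text + 1e-8)).item()

def prune_high_attention_tokens(visual_tokens: torch.Tensor,
                                attn: torch.Tensor,
                                num_visual: int,
                                prune_k: int = 4) -> torch.Tensor:
    """Drop the `prune_k` visual tokens receiving the most attention from
    text queries, on the assumption that the trigger concentrates there."""
    # Attention each visual token receives, averaged over heads and text queries.
    received = attn[:, num_visual:, :num_visual].mean(dim=(0, 1))  # (num_visual,)
    drop = received.topk(prune_k).indices
    keep = torch.ones(num_visual, dtype=torch.bool)
    keep[drop] = False
    return visual_tokens[keep]

# Toy example: 8 heads, 16 visual + 12 text tokens, 64-dim visual features.
heads, nv, nt = 8, 16, 12
attn = torch.softmax(torch.randn(heads, nv + nt, nv + nt), dim=-1)
visual_tokens = torch.randn(nv, 64)

ratio = visual_text_attention_ratio(attn, nv)
ratio_threshold = 1.5  # hypothetical cutoff flagging "attention stealing"
if ratio > ratio_threshold:
    visual_tokens = prune_high_attention_tokens(visual_tokens, attn, nv)
```

In a real deployment the detection statistic would be read from the model's fusion layers on the actual input, and pruning would happen before the purified visual tokens are re-fed to the model; the sketch only shows the shape of the computation.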
Related Articles

Astral to Join OpenAI
Dev.to

PearlOS. We gave swarm intelligence a local desktop environment and code control to self-evolve. Has been pretty incredible to see so far. Open source and free if you want your own.
Reddit r/LocalLLaMA

Why Data is Important for LLM
Dev.to

The Inference Market Is Consolidating. Agent Payments Are Still Nobody's Problem.
Dev.to

YouTube's Deepfake Shield for Politicians Changes Evidence Forever
Dev.to