QVAD: A Question-Centric Agentic Framework for Efficient and Training-Free Video Anomaly Detection
arXiv cs.CV / 4/6/2026
Key Points
- The paper introduces QVAD, a question-centric agentic framework for training-free video anomaly detection that replaces static prompts with an iterative dialogue between an LLM and a VLM.
- QVAD uses "prompt-updating" conditioned on visual context, so that smaller VLMs can produce high-fidelity captions and support more precise semantic reasoning without any update to model parameters.
- The approach is reported to reach state-of-the-art performance on multiple benchmarks (UCF-Crime, XD-Violence, and UBNormal) while using a fraction of the parameters compared with competing methods.
- QVAD is also claimed to generalize well to the single-scene ComplexVAD dataset, indicating robustness beyond the benchmarks it was primarily evaluated on.
- The framework is presented as fast at inference with low memory usage, targeting deployment on resource-constrained edge devices.
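The core mechanism described above, an LLM that iteratively refines the questions it poses to a VLM based on what the VLM reports seeing, can be sketched in a few lines. The following is a minimal illustrative toy, not the paper's implementation: all function names, the stub "models", and the scoring rule are hypothetical stand-ins.

```python
# Toy sketch of a question-centric LLM–VLM dialogue loop (training-free).
# Every component here is an illustrative stand-in, not QVAD's actual code.

def vlm_caption(frame: str, prompt: str) -> str:
    """Stub VLM: answers the prompt about a frame with a caption."""
    knowledge = {
        "frame_fight": "two people exchanging punches near a parked car",
        "frame_walk": "a pedestrian walking calmly on a sidewalk",
    }
    return knowledge.get(frame, "an empty street")

def llm_next_prompt(history: list) -> str:
    """Stub LLM: refines the next question based on captions seen so far
    (this is the 'prompt-updating' step)."""
    if not history:
        return "Describe the scene."
    if "punches" in history[-1]:
        return "Is anyone being harmed or acting violently?"
    return "Is anything unusual happening?"

def llm_score(history: list) -> float:
    """Stub LLM judgment: anomaly score from the accumulated dialogue."""
    return 1.0 if any("punches" in c for c in history) else 0.0

def qvad_loop(frame: str, rounds: int = 3) -> float:
    """Iterative dialogue: LLM asks, VLM answers, the prompt is updated,
    and the final transcript is scored -- no parameter updates anywhere."""
    captions = []
    for _ in range(rounds):
        prompt = llm_next_prompt(captions)
        captions.append(vlm_caption(frame, prompt))
    return llm_score(captions)

print(qvad_loop("frame_fight"))  # anomalous clip -> 1.0
print(qvad_loop("frame_walk"))   # normal clip -> 0.0
```

The point of the sketch is the control flow: because the prompt fed to the VLM depends on the previous answers, even a small VLM can be steered toward the details that matter for the anomaly decision, which is the efficiency argument the paper makes.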