VQ-Jarvis: Retrieval-Augmented Video Restoration Agent with Sharp Vision and Fast Thought

arXiv cs.CV / 3/25/2026


Key Points

  • The paper introduces VQ-Jarvis, a retrieval-augmented video restoration agent aimed at handling heterogeneous real-world degradations better than fixed pipelines.
  • It proposes “sharp vision” via VSR-Compare, a large-scale paired video enhancement dataset (20K comparison pairs) spanning 7 degradation types and 11 enhancement operators.
  • VQ-Jarvis uses trained judge and degradation-perception models to distinguish subtle quality differences among candidate restorations and to guide agent decisions.
  • For “fast thought,” it combines one-step retrieval for easier videos with hierarchical step-by-step greedy search for more difficult cases to balance efficiency and accuracy.
  • Experiments reported in the paper indicate that VQ-Jarvis outperforms existing video restoration approaches on complex degraded videos.
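The judge-guided decision in the third point can be illustrated with a minimal sketch. The `judge` function below is a stand-in stub; in the paper it would be the trained multi-operator judge model that compares paired restoration results, and all names and fields here are illustrative assumptions.

```python
# Hypothetical sketch: using a pairwise judge to pick the best of several
# candidate restorations produced by different enhancement operators.

def judge(candidate_a, candidate_b):
    """Return True if candidate_a is judged higher quality than candidate_b.
    Stub: compares a placeholder 'quality' field; the real judge model would
    compare the two restored videos directly."""
    return candidate_a["quality"] > candidate_b["quality"]

def select_best(candidates):
    """Tournament-style selection: keep the winner of successive pairwise
    comparisons, as an agent might after trying several operators."""
    best = candidates[0]
    for cand in candidates[1:]:
        if judge(cand, best):
            best = cand
    return best

candidates = [
    {"op": "denoise", "quality": 0.71},
    {"op": "deblur",  "quality": 0.83},
    {"op": "sr",      "quality": 0.78},
]
print(select_best(candidates)["op"])  # deblur
```

Pairwise comparison sidesteps the need for an absolute quality scale, which is what makes a judge trained on comparison pairs (as in VSR-Compare) a natural fit for ranking candidates.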

Abstract

Video restoration in real-world scenarios is challenged by heterogeneous degradations, where static architectures and fixed inference pipelines often fail to generalize. Recent agent-based approaches offer dynamic decision making, yet existing video restoration agents remain limited by insufficient quality perception and inefficient search strategies. We propose VQ-Jarvis, a retrieval-augmented, all-in-one intelligent video restoration agent with sharper vision and faster thought. VQ-Jarvis is designed to accurately perceive degradations and subtle differences among paired restoration results, while efficiently discovering optimal restoration trajectories. To enable sharp vision, we construct VSR-Compare, the first large-scale video paired enhancement dataset with 20K comparison pairs covering 7 degradation types, 11 enhancement operators, and diverse content domains. Based on this dataset, we train a multi-operator judge model and a degradation perception model to guide agent decisions. To achieve fast thought, we introduce a hierarchical operator scheduling strategy that adapts to video difficulty: for easy cases, optimal restoration trajectories are retrieved in a one-step manner from a retrieval-augmented generation (RAG) library; for harder cases, a step-by-step greedy search is performed to balance efficiency and accuracy. Extensive experiments demonstrate that VQ-Jarvis consistently outperforms existing methods on complex degraded videos.
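The hierarchical scheduling described in the abstract can be sketched as follows. This is a toy illustration under stated assumptions: the RAG library, operator set, difficulty labels, and scoring stub are all hypothetical, and the real system would use the trained judge and degradation-perception models rather than the placeholder `score`.

```python
# Hypothetical sketch of difficulty-adaptive operator scheduling:
# easy videos get a one-step trajectory lookup from a RAG-style library;
# hard videos fall back to step-by-step greedy search.

RAG_LIBRARY = {
    # degradation signature -> known-good operator trajectory
    ("noise",): ["denoise"],
    ("blur", "noise"): ["denoise", "deblur"],
}

OPERATORS = ["denoise", "deblur", "super_resolve"]

def score(state):
    """Stub quality score; stands in for the judge/perception models.
    Here: fewer remaining degradations = better."""
    return -len(state["remaining"])

def apply_op(state, op):
    """Apply an operator: remove the degradation it targets, if present."""
    target = {"denoise": "noise", "deblur": "blur", "super_resolve": "low_res"}[op]
    return {"remaining": [d for d in state["remaining"] if d != target]}

def greedy_search(degradations, max_steps=3):
    """Step-by-step greedy search: at each step pick the operator whose
    result the scoring model ranks highest."""
    state = {"remaining": list(degradations)}
    trajectory = []
    for _ in range(max_steps):
        if not state["remaining"]:
            break
        best_op = max(OPERATORS, key=lambda op: score(apply_op(state, op)))
        state = apply_op(state, best_op)
        trajectory.append(best_op)
    return trajectory

def schedule(degradations, difficulty):
    sig = tuple(sorted(degradations))
    if difficulty == "easy" and sig in RAG_LIBRARY:
        return RAG_LIBRARY[sig]          # one-step retrieval
    return greedy_search(degradations)   # harder cases: greedy search

print(schedule(["noise"], "easy"))                    # ['denoise']
print(schedule(["blur", "noise", "low_res"], "hard"))
```

The design trade-off mirrors the paper's "fast thought": retrieval amortizes planning cost over previously solved cases, while greedy search pays a per-step evaluation cost only when the degradation mix is unfamiliar or severe.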