CurEvo: Curriculum-Guided Self-Evolution for Video Understanding

arXiv cs.CV / 4/30/2026



Key Points

The paper proposes CurEvo, a curriculum-guided self-evolution framework aimed at improving autonomous video understanding without human annotations.
It targets two weaknesses of prior self-evolution approaches, weak optimization control and unstructured difficulty progression, by dynamically regulating task difficulty, evaluation criteria, and data diversity according to model competence.
CurEvo implements a multi-dimensional adaptive QA system that jointly evolves question generation and answer evaluation across perception, recognition, and understanding dimensions to keep curriculum progression coherent and measurable.
Experiments across seven model backbones show consistent gains in benchmark accuracy and evaluator-based semantic scores on four VideoQA benchmarks.
Overall, the work reframes self-evolution as a feedback loop that aligns learning complexity with the model’s current capability, making improvement more reliable and structured.
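
The feedback loop described in the points above can be illustrated with a minimal sketch. This is not the paper's implementation: the `CurriculumState` class, the `update_level` function, the 1-to-5 difficulty scale, and the 0.8 / 0.4 thresholds are all hypothetical choices made here to show how a competence signal (a pass rate on self-generated questions) might drive difficulty up or down.

```python
from dataclasses import dataclass


@dataclass
class CurriculumState:
    # Hypothetical difficulty scale; the paper does not specify one.
    level: int = 1
    max_level: int = 5


def update_level(state: CurriculumState, pass_rate: float,
                 promote_at: float = 0.8, demote_at: float = 0.4) -> CurriculumState:
    """Raise difficulty when the model is competent, lower it when it struggles.

    pass_rate is the fraction of questions at the current level that the
    model answered acceptably in the last round (assumed competence signal).
    """
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be in [0, 1]")
    level = state.level
    if pass_rate >= promote_at:
        # Competent at this level: move up, capped at the hardest level.
        level = min(state.max_level, level + 1)
    elif pass_rate < demote_at:
        # Struggling: step back down, never below the easiest level.
        level = max(1, level - 1)
    # Between the two thresholds the level is held steady.
    return CurriculumState(level=level, max_level=state.max_level)
```

Holding the level steady between the two thresholds is a design choice in this sketch: it avoids oscillating between levels when competence hovers near a single cutoff.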

Abstract

Recent advances in self-evolution video understanding frameworks have demonstrated the potential of autonomous learning without human annotations. However, existing methods often suffer from weakly controlled optimization and uncontrolled difficulty progression, as they lack structured guidance throughout the iterative learning process. To address these limitations, we propose CurEvo, a curriculum-guided self-evolution framework that introduces curriculum learning into self-evolution to achieve more structured and progressive model improvement. CurEvo dynamically regulates task difficulty, refines evaluation criteria, and balances data diversity according to model competence, forming a curriculum-guided feedback loop that aligns learning complexity with model capability. Built upon this principle, we develop a multi-dimensional adaptive QA framework that jointly evolves question generation and answer evaluation across perception, recognition, and understanding dimensions, ensuring coherent and measurable curriculum progression. Through this integration, CurEvo transforms weakly controlled self-evolution into a more structured learning process for autonomous video understanding. Across seven backbones, CurEvo consistently improves both benchmark accuracy and evaluator-based semantic score on four VideoQA benchmarks, validating the effectiveness of curriculum-guided self-evolution for video understanding.
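
As a rough illustration of balancing data across the perception, recognition, and understanding dimensions according to model competence, the sketch below allocates a larger share of the next round's questions to weaker dimensions. The `dimension_weights` function, the `floor` parameter, and the "one minus competence" weighting rule are assumptions made for this example; the abstract does not state how CurEvo performs the balancing.

```python
from typing import Dict


def dimension_weights(competence: Dict[str, float], floor: float = 0.1) -> Dict[str, float]:
    """Return sampling weights that favor dimensions with lower competence.

    competence maps a dimension name to a score in [0, 1] (assumed signal).
    floor keeps a minimum share so a mastered dimension is still sampled.
    """
    if not competence:
        raise ValueError("competence must contain at least one dimension")
    if floor <= 0:
        raise ValueError("floor must be positive")
    for name, score in competence.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"competence for {name!r} must be in [0, 1]")
    # Gap to full competence, floored so no dimension gets zero weight.
    gaps = {name: max(floor, 1.0 - score) for name, score in competence.items()}
    total = sum(gaps.values())
    # Normalize so the weights sum to 1 and can be used as sampling probabilities.
    return {name: gap / total for name, gap in gaps.items()}
```

For example, calling `dimension_weights({"perception": 0.9, "recognition": 0.7, "understanding": 0.4})` gives the understanding dimension the largest share and perception the smallest.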


