NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results

arXiv cs.CV / 4/17/2026


Key Points

  • The NTIRE 2026 Challenge on Video Saliency Prediction focused on building automatic methods to predict saliency maps for provided video sequences.
  • Organizers released a new open-licensed dataset of 2,000 diverse videos with fixation-derived saliency supervision collected via crowdsourced mouse tracking from 5,000+ assessors.
  • Evaluation was conducted on 800 test videos using widely accepted quality metrics for saliency prediction.
  • The challenge drew 20+ participating teams, with 7 teams reaching the final stage after code review.
  • All data from the challenge is publicly available, enabling further research and benchmarking: https://github.com/msu-video-group/NTIRE26_Saliency_Prediction.

Abstract

This paper presents an overview of the NTIRE 2026 Challenge on Video Saliency Prediction. Participants were tasked with developing automatic saliency-map prediction methods for the provided video sequences. A novel dataset of 2,000 diverse videos with an open license was prepared for this challenge. The fixations and corresponding saliency maps were collected via crowdsourced mouse tracking and contain viewing data from over 5,000 assessors. Evaluation was performed on a subset of 800 test videos using generally accepted quality metrics. The challenge attracted over 20 teams making submissions, and 7 teams passed the final phase with code review. All data used in this challenge is made publicly available at https://github.com/msu-video-group/NTIRE26_Saliency_Prediction.
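The abstract does not enumerate the specific quality metrics used, but saliency-prediction benchmarks commonly rely on measures such as Pearson's linear correlation coefficient (CC) and the similarity metric (SIM) between predicted and ground-truth saliency maps. The sketch below is an illustrative implementation of these two widely used metrics, not the challenge's official evaluation code:

```python
import numpy as np

def normalize_map(s: np.ndarray) -> np.ndarray:
    """Shift a saliency map to be non-negative and scale it to sum to 1."""
    s = s.astype(np.float64) - s.min()
    total = s.sum()
    return s / total if total > 0 else np.full_like(s, 1.0 / s.size)

def cc(pred: np.ndarray, gt: np.ndarray) -> float:
    """Pearson linear correlation coefficient between two saliency maps."""
    p = pred.astype(np.float64).ravel()
    g = gt.astype(np.float64).ravel()
    p = (p - p.mean()) / (p.std() + 1e-8)   # standardize; epsilon avoids /0
    g = (g - g.mean()) / (g.std() + 1e-8)
    return float((p * g).mean())

def sim(pred: np.ndarray, gt: np.ndarray) -> float:
    """SIM: sum of element-wise minima of the two normalized distributions."""
    return float(np.minimum(normalize_map(pred), normalize_map(gt)).sum())
```

Both metrics range up to 1.0 for a perfect match; for video, they are typically computed per frame and averaged over each sequence.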