Subjective Portrait Region Cropping in Landscape Videos with Temporal Annotation Smoothing
arXiv cs.CV / 4/29/2026
Key Points
- The paper addresses the challenge of adapting landscape videos to different aspect ratios on mobile devices, arguing that static cropping/padding and warping can reduce visual quality or distort the intended meaning.
- It proposes temporally coordinated cropping that focuses on important regions while minimizing distortion and preserving essential content across frames.
- To enable research on subjective portrait-region cropping, the authors introduce the LIVE-YT VC dataset (1,800 videos annotated by 90 human subjects), sourced from YouTube-UGC and LSVQ, described as the largest publicly available subjective database for this task.
- They also release a post-processed variant of the dataset (LIVE-YT VC++), which uses a new intra-frame temporal filter to smooth the subjective annotations, and validate its usefulness with SmartVidCrop and fine-tuned state-of-the-art video grounding models.
- The work also compares its labels against video saliency annotations and predictions, and the authors plan to open-source the project to support benchmarking in future research.
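
To make the idea of temporally smoothed crop annotations concrete, here is a minimal sketch in Python. It is not the authors' filter: it assumes each frame carries a single annotated horizontal crop center, swaps in a plain centered moving average for the smoothing step, and then derives a full-height 9:16 crop window. The function names `smooth_centers` and `portrait_crop`, the `radius` parameter, and the sample values are all hypothetical.

```python
def smooth_centers(centers, radius=2):
    """Smooth per-frame crop centers with a centered moving average.

    Assumption: a simple moving average stands in for the paper's filter.
    Near the start and end of the clip the window shrinks to the frames
    that actually exist, so no padding values are invented.
    """
    n = len(centers)
    smoothed = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        window = centers[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def portrait_crop(center_x, frame_w, frame_h, aspect_w=9, aspect_h=16):
    """Return (left, right) pixel bounds of a full-height portrait crop.

    The crop keeps the full frame height, takes its width from the target
    aspect ratio, and is shifted so it never leaves the frame.
    """
    crop_w = min(frame_w, frame_h * aspect_w // aspect_h)
    left = int(center_x - crop_w / 2)
    left = max(0, min(left, frame_w - crop_w))
    return left, left + crop_w


if __name__ == "__main__":
    # Hypothetical annotations: one frame jumps far from its neighbors.
    raw = [900, 910, 1400, 905, 915]
    for cx in smooth_centers(raw, radius=1):
        print(portrait_crop(cx, frame_w=1920, frame_h=1080))
```

A wider `radius` gives a steadier crop path but reacts more slowly to genuine subject motion, so the value would need tuning against the annotation frame rate.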