Graph2Video: Leveraging Video Models to Model Dynamic Graph Evolution

arXiv cs.CV / 3/17/2026

Key Points

  • Graph2Video introduces a video-inspired framework that treats the temporal neighborhood of a target link as a sequence of graph frames, forming a graph video for link prediction.
  • By borrowing inductive biases from video foundation models, it aims to capture both fine-grained local variations and long-range temporal dynamics in dynamic graphs.
  • The method generates a link-level embedding that serves as a lightweight, plug-and-play link-centric memory unit that can integrate into existing dynamic graph encoders.
  • Experiments on benchmark datasets show Graph2Video outperforms state-of-the-art baselines on link prediction in most cases, underscoring the potential of applying video modeling techniques to dynamic graph learning.
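The core construction described above, treating the temporal neighborhood of a target link as a stack of temporally ordered "graph frames", can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name, frame count, and adjacency-matrix frame representation are all assumptions for clarity.

```python
import numpy as np

def build_graph_video(edges, u, v, num_frames=4, num_nodes=8):
    """Hypothetical sketch: split the timestamped neighborhood of the
    target link (u, v) into temporally ordered adjacency 'frames' and
    stack them into a (num_frames, num_nodes, num_nodes) 'graph video'."""
    # Keep only edges that touch either endpoint of the target link,
    # then order them by timestamp (each edge is (src, dst, time)).
    nbr = sorted((e for e in edges if u in e[:2] or v in e[:2]),
                 key=lambda e: e[2])
    per = max(1, -(-len(nbr) // num_frames))  # ceil(len / num_frames)
    frames = []
    for i in range(num_frames):
        A = np.zeros((num_nodes, num_nodes), dtype=np.float32)
        for s, d, _ in nbr[i * per:(i + 1) * per]:
            A[s, d] = 1.0  # one subgraph frame per temporal bin
        frames.append(A)
    return np.stack(frames)
```

The stacked tensor mirrors a video's frame axis, which is what lets video-style spatio-temporal encoders be applied to the dynamic graph.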

Abstract

Dynamic graphs are common in real-world systems such as social media, recommender systems, and traffic networks. Existing dynamic graph models for link prediction often fall short in capturing the complexity of temporal evolution. They tend to overlook fine-grained variations in temporal interaction order, struggle with dependencies that span long time horizons, and offer limited capability to model pair-specific relational dynamics. To address these challenges, we propose Graph2Video, a video-inspired framework that views the temporal neighborhood of a target link as a sequence of "graph frames". By stacking temporally ordered subgraph frames into a "graph video", Graph2Video leverages the inductive biases of video foundation models to capture both fine-grained local variations and long-range temporal dynamics. It generates a link-level embedding that serves as a lightweight and plug-and-play link-centric memory unit. This embedding integrates seamlessly into existing dynamic graph encoders, effectively addressing the limitations of prior approaches. Extensive experiments on benchmark datasets show that Graph2Video outperforms state-of-the-art baselines on the link prediction task in most cases. The results highlight the potential of borrowing spatio-temporal modeling techniques from computer vision as a promising and effective approach for advancing dynamic graph learning.
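The "plug-and-play" integration of the link-level embedding can be pictured as below: a link-centric memory vector is fused with the encoder's node embeddings before scoring the candidate link. This is a hedged sketch under assumed shapes and names; the actual fusion and scoring functions in Graph2Video may differ.

```python
import numpy as np

def score_link(h_u, h_v, link_memory, W):
    """Illustrative scoring head: concatenate the two node embeddings
    from an existing dynamic graph encoder with the link-centric
    memory vector, then map to a link probability via a sigmoid."""
    z = np.concatenate([h_u, h_v, link_memory])  # fuse the three signals
    return 1.0 / (1.0 + np.exp(-W @ z))          # probability in (0, 1)

# Toy usage with random stand-ins; in practice h_u and h_v would come
# from a dynamic graph encoder and link_memory from the graph-video model.
rng = np.random.default_rng(0)
h_u, h_v = rng.normal(size=4), rng.normal(size=4)
mem = rng.normal(size=4)
W = rng.normal(size=12)
p = score_link(h_u, h_v, mem, W)
```

Because the memory vector enters only at the fusion step, the encoder itself is untouched, which is what makes the unit lightweight and compatible with existing architectures.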