TimeMM: Time-as-Operator Spectral Filtering for Dynamic Multimodal Recommendation

arXiv cs.AI / 4/30/2026


Key Points

  • The paper introduces TimeMM, a time-conditioned spectral filtering framework designed for multimodal recommendation under non-stationary user preferences.
  • It realizes “Time-as-Operator” by mapping interaction recency to a family of parametric temporal kernels that reweight edges on the user–item graph, yielding component-specific representations without explicit eigendecomposition.
  • To handle differing temporal dynamics, TimeMM adds Adaptive Spectral Filtering that mixes a bank of operators based on temporal context to produce prediction-specific spectral responses.
  • It further proposes Spectral-Aware Modality Routing, which calibrates visual and textual contributions using the same temporal context, plus a ranking-space Spectral Diversity Regularization that encourages complementary expert behaviors and prevents filter-bank collapse.
  • Experiments on real-world benchmarks reportedly show consistent improvements over state-of-the-art multimodal recommenders while keeping linear-time scalability.
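
To make the first two ideas above concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the summary does not state the kernel form, so the exponential decay `exp(-decay * age)`, the weighted-mean normalization, the softmax gate, and every function and variable name below are assumptions chosen for illustration. What it does show is how a bank of recency kernels can reweight the same edge list and be mixed per user in time proportional to the number of edges, with no eigendecomposition.

```python
import math

def kernel_weights(edges, now, decay):
    """Assumed kernel: exponential recency decay, one weight per edge."""
    return [math.exp(-decay * (now - t)) for (_, _, t) in edges]

def propagate_to_users(edges, weights, item_emb, n_users):
    """One hop of weighted-mean item-to-user aggregation, O(|E| * dim)."""
    dim = len(item_emb[0])
    out = [[0.0] * dim for _ in range(n_users)]
    degree = [0.0] * n_users
    for (u, i, _), w in zip(edges, weights):
        degree[u] += w
        for k in range(dim):
            out[u][k] += w * item_emb[i][k]
    for u in range(n_users):
        if degree[u] > 0:
            out[u] = [x / degree[u] for x in out[u]]
    return out

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mix_bank(bank, gate_logits):
    """Per-user convex combination of the kernel-specific representations."""
    n_users, dim = len(bank[0]), len(bank[0][0])
    mixed = []
    for u in range(n_users):
        gates = softmax(gate_logits[u])
        mixed.append([sum(g * rep[u][k] for g, rep in zip(gates, bank))
                      for k in range(dim)])
    return mixed

# Toy data (hypothetical): (user, item, timestamp) interactions.
edges = [(0, 0, 1.0), (0, 1, 9.0), (1, 1, 5.0)]
item_emb = [[1.0, 0.0], [0.0, 1.0]]
decays = [0.0, 1.0]  # decay 0.0 ignores time; decay 1.0 favors recent edges

bank = [propagate_to_users(edges, kernel_weights(edges, 10.0, d), item_emb, 2)
        for d in decays]
# In a trained model the gate logits would come from temporal context;
# here they are fixed by hand.
gate_logits = [[0.0, 0.0], [0.0, 2.0]]
mixed = mix_bank(bank, gate_logits)
```

With `decay = 0.0` user 0 is the plain average of both items, while with `decay = 1.0` the representation is dominated by the item interacted with most recently. The gate then interpolates between those two views.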

Abstract

Multimodal recommendation improves user modeling by integrating collaborative signals with heterogeneous item content. In real applications, user interests evolve over time and exhibit non-stationary dynamics, where different preference factors change at different rates. This challenge is amplified in multimodal settings because visual and textual cues can dominate decisions under different temporal regimes. Despite strong progress, most multimodal recommenders still rely on static interaction graphs or coarse temporal heuristics, which limits their ability to model continuous preference evolution with fine-grained temporal adaptation. To address these limitations, we propose TimeMM, a time-conditioned spectral filtering framework for dynamic multimodal recommendation. TimeMM instantiates Time-as-Operator by mapping interaction recency to a family of parametric temporal kernels that reweight edges on the user–item graph, producing component-specific representations without explicit eigendecomposition. To capture non-stationary interests, we introduce Adaptive Spectral Filtering that mixes the operator bank according to temporal context, yielding prediction-specific effective spectral responses. To account for modality-specific temporal sensitivity, we further propose Spectral-Aware Modality Routing that calibrates visual and textual contributions conditioned on the same temporal context. Finally, a ranking-space Spectral Diversity Regularization encourages complementary expert behaviors and prevents filter-bank collapse. Extensive experiments on real-world benchmarks demonstrate that TimeMM consistently outperforms state-of-the-art multimodal recommenders while maintaining linear-time scalability.
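
The modality routing and diversity regularization described in the abstract can be illustrated with a similarly hedged sketch. The abstract gives no formulas, so everything here is an assumption: routing is modeled as a softmax over two context-dependent logits, and the diversity term as the mean pairwise cosine similarity between each expert's score vector over the same candidate items (one plausible reading of "ranking-space"). All names are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_modalities(visual, textual, route_logits):
    """Blend visual and textual item vectors with context-dependent weights."""
    w_visual, w_textual = softmax(route_logits)
    return [w_visual * v + w_textual * t for v, t in zip(visual, textual)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0

def diversity_penalty(expert_scores):
    """Mean pairwise cosine similarity between expert score vectors.

    A value near 1 means the experts rank candidates almost identically
    (collapse); adding this term to the loss would push them apart.
    """
    k = len(expert_scores)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    if not pairs:
        return 0.0
    return sum(cosine(expert_scores[i], expert_scores[j])
               for i, j in pairs) / len(pairs)

# Toy item features (hypothetical).
visual, textual = [1.0, 0.0], [0.0, 1.0]
balanced = route_modalities(visual, textual, [0.0, 0.0])
visual_heavy = route_modalities(visual, textual, [2.0, 0.0])

# Two experts scoring the same three candidate items.
collapsed = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]  # identical ranking
diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # orthogonal scores
penalty_collapsed = diversity_penalty(collapsed)
penalty_diverse = diversity_penalty(diverse)
```

In this toy setup the collapsed pair of experts receives a penalty close to 1 and the orthogonal pair a penalty of 0, which is the behavior a regularizer against filter-bank collapse would need.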