Mixture of Sequence: Theme-Aware Mixture-of-Experts for Long-Sequence Recommendation

arXiv cs.AI / 4/25/2026


Key Points

  • Sequential recommendation has improved CTR prediction, but long sequences are difficult because users’ interests can shift across time and introduce irrelevant or misleading signals.
  • The paper analyzes long-session behavior and identifies a pattern called “session hopping,” where interests are stable within sessions but can change drastically across sessions and sometimes reappear later.
  • It proposes Mixture of Sequence (MoS), a model-agnostic mixture-of-experts framework that uses theme-aware routing to segment user history into theme-consistent subsequences and filter out misleading information.
  • MoS also adds a multi-scale fusion mechanism with three expert types to capture global trends, short-term behaviors, and theme-specific semantic patterns, improving accuracy while reducing computational cost.
  • Experiments on recommendation tasks show MoS reaches state-of-the-art performance with fewer FLOPs than other MoE methods, and the code is released on GitHub.
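The theme-aware routing idea above can be sketched in a few lines. This is a hypothetical toy, not the paper's implementation: the paper learns the router end to end, whereas here fixed theme prototypes and a softmax over dot products stand in for it, and sessions are grouped into theme-consistent subsequences by hard assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

def route_sessions(session_embs, theme_protos):
    """Assign each session embedding to its closest latent theme.

    Hypothetical stand-in for the learned router: softmax over
    dot products with fixed theme prototypes, then argmax.
    """
    logits = session_embs @ theme_protos.T                  # (n_sessions, n_themes)
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1)                             # hard theme assignment

def build_subsequences(sessions, theme_ids, n_themes):
    """Group sessions into theme-consistent subsequences, preserving order."""
    subseqs = [[] for _ in range(n_themes)]
    for sess, t in zip(sessions, theme_ids):
        subseqs[t].append(sess)
    return subseqs

# Toy example: 6 sessions with 8-dim embeddings, 3 latent themes.
sessions = [f"session_{i}" for i in range(6)]
session_embs = rng.normal(size=(6, 8))
theme_protos = rng.normal(size=(3, 8))

theme_ids = route_sessions(session_embs, theme_protos)
subseqs = build_subsequences(sessions, theme_ids, n_themes=3)
print(theme_ids)
print([len(s) for s in subseqs])
```

Each resulting subsequence contains only sessions aligned with one theme, which is how "session hopping" noise gets filtered before the downstream predictor sees the history.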

Abstract

Sequential recommendation has rapidly advanced click-through rate prediction due to its ability to model dynamic user interests. A key challenge, however, lies in modeling long sequences: users often exhibit significant interest shifts, introducing substantial irrelevant or misleading information. Our empirical analysis corroborates this challenge and uncovers a recurring behavioral pattern in long sequences (*session hopping*): user interests remain stable within short temporal spans (*sessions*) but shift drastically across sessions and may reappear after multiple sessions. To address this challenge, we propose the Mixture of Sequence (MoS) framework, a model-agnostic MoE approach that achieves accurate predictions by extracting theme-specific and multi-scale subsequences from noisy raw user sequences. First, MoS employs a theme-aware routing mechanism to adaptively learn the latent themes of user sequences and organizes these sequences into multiple coherent subsequences. Each subsequence contains only sessions aligned with a specific theme, thereby effectively filtering out irrelevant or even misleading information introduced by user interest shifts in session hopping. In addition, to alleviate potential information loss, we introduce a multi-scale fusion mechanism, which leverages three types of experts to capture global sequence characteristics, short-term user behaviors, and theme-specific semantic patterns. Together, these two mechanisms endow MoS with the ability to deliver accurate recommendations from multi-faceted and multi-scale perspectives. Experimental results demonstrate that MoS consistently achieves SOTA performance while introducing fewer FLOPs compared with other MoE counterparts, providing strong evidence of its excellent balance between utility and efficiency. The code is available at https://github.com/xiaolin-cs/MoS.
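The multi-scale fusion mechanism described in the abstract can be illustrated with a minimal sketch. Note the assumptions: the paper learns its three experts and gating weights end to end; here mean pooling stands in for each expert (global view, last-k short-term view, theme-subsequence view) and a uniform gate stands in for the learned mixing weights.

```python
import numpy as np

rng = np.random.default_rng(1)

def fuse_multi_scale(seq_embs, theme_embs, k=3, gate=None):
    """Hypothetical sketch of three-expert multi-scale fusion.

    - global expert: mean over the whole sequence (global trends)
    - short-term expert: mean over the last k items (recent behavior)
    - theme expert: mean over a theme-consistent subsequence
    Mean pooling and a uniform gate are stand-ins for the learned
    experts and router of the actual model.
    """
    experts = np.stack([
        seq_embs.mean(axis=0),        # global view
        seq_embs[-k:].mean(axis=0),   # short-term view
        theme_embs.mean(axis=0),      # theme-specific view
    ])                                # (3, dim)
    if gate is None:
        gate = np.full(3, 1.0 / 3.0)  # uniform mixing weights
    return gate @ experts             # fused representation, (dim,)

seq_embs = rng.normal(size=(10, 8))   # full user history (10 items, 8-dim)
theme_embs = rng.normal(size=(4, 8))  # one theme-consistent subsequence
fused = fuse_multi_scale(seq_embs, theme_embs)
print(fused.shape)
```

Because each expert only pools a slice of the history, the fused output keeps global, recent, and theme-level signals even after the routing step has discarded off-theme sessions.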