AI Navigate

Accurate and Efficient Multi-Channel Time Series Forecasting via Sparse Attention Mechanism

arXiv cs.AI / 3/20/2026


Key Points

  • Li-Net (Linear-Network) is a novel architecture for multi-channel time series forecasting that captures both linear and non-linear dependencies among channels.
  • It dynamically compresses representations across sequence and channel dimensions and passes them through a configurable non-linear module before reconstructing forecasts.
  • The approach integrates a sparse Top-K Softmax attention mechanism within a multi-scale projection framework to focus on the most informative time steps and features, enabling efficient computation.
  • It supports fusion of multi-modal embeddings to guide the sparse attention and enhance cross-channel information integration.
  • Experimental results on real-world benchmarks show Li-Net achieves competitive accuracy while using significantly less memory and delivering faster inference than state-of-the-art baselines, with ablation studies validating each component.
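The sparse Top-K Softmax attention described above can be illustrated with a minimal sketch: for each query, only the k largest attention scores survive the softmax, so computation and memory concentrate on the most informative time steps. This is a generic Top-K attention sketch, not the paper's actual implementation; the function name and shapes are illustrative assumptions.

```python
import numpy as np

def topk_softmax_attention(Q, K, V, k):
    """Sparse attention: keep only the top-k scores per query row,
    mask the rest to -inf before the softmax (illustrative sketch)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                         # (n_q, n_k) scaled scores
    # k-th largest score in each row, kept as a column vector for broadcasting
    kth = np.partition(scores, -k, axis=-1)[:, -k:-k+1]
    masked = np.where(scores >= kth, scores, -np.inf)     # drop everything below top-k
    # Numerically stable softmax over the surviving entries only
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)
    return weights @ V                                    # (n_q, d_v) attended output
```

With k equal to the number of keys this reduces to ordinary dense softmax attention; smaller k yields sparser attention rows at lower cost.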

Abstract

The task of multi-channel time series forecasting is ubiquitous in fields such as finance, supply chain management, and energy planning, and accurate predictions require effectively capturing the complex dynamic dependencies within and between channels. However, traditional methods pay little attention to learning the interactions among channels. This paper proposes Linear-Network (Li-Net), a novel architecture for multi-channel time series forecasting that captures both linear and non-linear dependencies among channels. Li-Net dynamically compresses representations across the sequence and channel dimensions, processes the information through a configurable non-linear module, and subsequently reconstructs the forecasts. Moreover, Li-Net integrates a sparse Top-K Softmax attention mechanism within a multi-scale projection framework to address these challenges. A core innovation is its ability to seamlessly incorporate and fuse multi-modal embeddings, guiding the sparse attention process to focus on the most informative time steps and feature channels. Experimental results on multiple real-world benchmark datasets demonstrate that Li-Net achieves competitive performance compared to state-of-the-art baseline methods. Furthermore, Li-Net provides a superior balance between prediction accuracy and computational burden, exhibiting significantly lower memory usage and faster inference times. Detailed ablation studies and parameter sensitivity analyses validate the effectiveness of each key component of the proposed architecture.

Keywords: Multivariate Time Series Forecasting, Sparse Attention Mechanism, Multimodal Information Fusion, Non-linear Relationships
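The compress-then-reconstruct pipeline in the abstract can be sketched as a pair of linear projections that shrink the input window along both the sequence and channel dimensions, a non-linear module applied to the compressed code, and linear projections that expand the result back out to the forecast horizon. The paper does not specify the module internals, so the weight shapes, the tanh non-linearity, and the function name below are illustrative assumptions, not Li-Net's actual design.

```python
import numpy as np

def compress_forecast_sketch(X, r_seq, r_ch, horizon, rng):
    """Hypothetical compress -> non-linear -> reconstruct pipeline.

    X: (seq_len, n_channels) input window.
    Returns a (horizon, n_channels) forecast. Weights are random here;
    in a real model they would be learned.
    """
    L, C = X.shape
    # Linear compression along the sequence and channel dimensions
    W_seq = rng.standard_normal((r_seq, L)) / np.sqrt(L)
    W_ch = rng.standard_normal((C, r_ch)) / np.sqrt(C)
    Z = W_seq @ X @ W_ch                       # (r_seq, r_ch) compressed code
    # Configurable non-linear module (a single tanh layer as a stand-in)
    H = np.tanh(Z)
    # Reconstruct forecasts over the horizon from the compressed code
    W_out_seq = rng.standard_normal((horizon, r_seq)) / np.sqrt(r_seq)
    W_out_ch = rng.standard_normal((r_ch, C)) / np.sqrt(r_ch)
    return W_out_seq @ H @ W_out_ch            # (horizon, n_channels)
```

Because the code mixes information across channels through `W_ch`, even this toy version models cross-channel interactions that channel-independent linear forecasters would miss.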