ITS-Mina: A Harris Hawks Optimization-Based All-MLP Framework with Iterative Refinement and External Attention for Multivariate Time Series Forecasting

arXiv cs.LG / 5/1/2026


Key Points

  • The paper introduces ITS-Mina, an all-MLP framework aimed at improving multivariate time series forecasting performance while keeping computational costs low compared with Transformer-based approaches.
  • ITS-Mina uses an iterative refinement strategy that repeatedly applies a shared-parameter residual mixer stack to deepen temporal representations without increasing the number of distinct parameters.
  • It replaces conventional self-attention with an external attention module that leverages learnable memory units to model cross-sample global dependencies with linear computational complexity.
  • The framework includes a Harris Hawks Optimization (HHO) method to automatically tune dropout rates for adaptive, dataset-specific regularization.
  • Experiments on six benchmark datasets show ITS-Mina achieves state-of-the-art or highly competitive results versus eleven baseline models across multiple forecasting horizons.
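The shared-parameter refinement idea in the second bullet can be illustrated with a toy NumPy sketch. This is not the paper's implementation: the class name `SharedResidualMixer`, the ReLU MLP body, and all dimensions are assumptions; the point is only that reapplying one block adds depth without adding parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

class SharedResidualMixer:
    """A single residual MLP block whose weights are reused at every refinement step."""
    def __init__(self, d, hidden):
        self.W1 = rng.standard_normal((d, hidden)) * 0.1
        self.W2 = rng.standard_normal((hidden, d)) * 0.1

    def __call__(self, x):
        h = np.maximum(x @ self.W1, 0.0)  # two-layer MLP with ReLU
        return x + h @ self.W2            # residual connection keeps iteration stable

def iterative_refine(x, block, steps=3):
    # Apply the SAME block repeatedly: effective depth grows with `steps`,
    # but the number of distinct parameters stays fixed.
    for _ in range(steps):
        x = block(x)
    return x

x = rng.standard_normal((4, 8))           # 4 time-series tokens, width 8
block = SharedResidualMixer(d=8, hidden=16)
y = iterative_refine(x, block, steps=3)   # same shape, progressively refined
```

With `steps=3` the model performs three passes through one set of weights, which is the sense in which the framework "deepens computational capacity without multiplying the number of distinct parameters."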

Abstract

Multivariate time series forecasting plays a pivotal role in numerous real-world applications, including financial analysis, energy management, and traffic planning. While Transformer-based architectures have gained popularity for this task, recent studies reveal that simpler MLP-based models can achieve competitive or superior performance with significantly reduced computational cost. In this paper, we propose ITS-Mina, a novel all-MLP framework for multivariate time series forecasting that integrates three key innovations: (1) an iterative refinement mechanism that progressively enhances temporal representations by repeatedly applying a shared-parameter residual mixer stack, effectively deepening the model's computational capacity without multiplying the number of distinct parameters; (2) an external attention module that replaces traditional self-attention with learnable memory units, capturing cross-sample global dependencies at linear computational complexity; and (3) a Harris Hawks Optimization (HHO) algorithm for automatic dropout rate tuning, enabling adaptive regularization tailored to each dataset. Extensive experiments on six widely used benchmark datasets demonstrate that ITS-Mina achieves state-of-the-art or highly competitive performance compared to eleven baseline models across multiple forecasting horizons.
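To see why external attention is linear in sequence length, here is a minimal NumPy sketch under stated assumptions: attention is computed against two small learnable memory matrices `Mk`/`Mv` (shapes illustrative), and a single softmax over memory slots is used for simplicity, whereas the external-attention literature typically applies a double normalization. This is a toy, not the paper's module.

```python
import numpy as np

def external_attention(x, Mk, Mv):
    """Simplified external attention.

    x  : (n, d) input tokens
    Mk : (S, d) learnable key memory,   shared across samples
    Mv : (S, d) learnable value memory, shared across samples

    Cost is O(n * S * d): linear in n, because the n x n token-token
    similarity matrix of self-attention is never formed.
    """
    scores = x @ Mk.T                                  # (n, S) token-to-memory similarity
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)      # softmax over the S memory slots
    return attn @ Mv                                   # (n, d) output

rng = np.random.default_rng(1)
x = rng.standard_normal((10, 8))   # 10 tokens, width 8
Mk = rng.standard_normal((4, 8))   # S = 4 memory slots, fixed regardless of n
Mv = rng.standard_normal((4, 8))
out = external_attention(x, Mk, Mv)  # shape (10, 8)
```

Because `Mk` and `Mv` are parameters shared across all samples rather than projections of the current sequence, the memory slots can also capture the cross-sample global dependencies mentioned in the abstract.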