The Multiverse of Time Series Machine Learning: an Archive for Multivariate Time Series Classification

arXiv cs.LG / 3/24/2026


Key Points

  • The paper presents a substantial expansion of the UEA multivariate time series classification archive, more than quadrupling it from 30 classification problems to 133.
  • It also releases preprocessed versions of the datasets that contain missing values or unequal-length series, bringing the total number of datasets to 147.
  • The archive is rebranded as the “Multiverse” archive to reflect its diversity of domains, and it consolidates other collections and standalone datasets into a single, unified repository.
  • Because running experiments across the full archive is computationally demanding, the authors recommend a smaller subset, “Multiverse-core (MV-core)”, for initial exploration, and they provide guidance plus a baseline evaluation of established and recent classification algorithms.
  • A dedicated repository provides a common aeon- and scikit-learn-compatible framework for reproducibility, an extensive record of published results, and an interactive interface for exploring those results.

Abstract

Time series machine learning (TSML) is a growing research field that spans a wide range of tasks. The popularity of established tasks such as classification, clustering, and extrinsic regression has, in part, been driven by the availability of benchmark datasets. An archive of 30 multivariate time series classification datasets, introduced in 2018 and commonly known as the UEA archive, has since become an essential resource cited in hundreds of publications. We present a substantial expansion of this archive that more than quadruples its size, from 30 to 133 classification problems. We also release preprocessed versions of datasets containing missing values or unequal length series, bringing the total number of datasets to 147. Reflecting the growth of the archive and the broader community, we rebrand it as the Multiverse archive to capture its diversity of domains. The Multiverse archive includes datasets from multiple sources, consolidating other collections and standalone datasets into a single, unified repository. Recognising that running experiments across the full archive is computationally demanding, we recommend a subset of the full archive called Multiverse-core (MV-core) for initial exploration. To support researchers in using the new archive, we provide detailed guidance and a baseline evaluation of established and recent classification algorithms, establishing performance benchmarks for future research. We have created a dedicated repository for the Multiverse archive that provides a common aeon and scikit-learn compatible framework for reproducibility, an extensive record of published results, and an interactive interface to explore the results.
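
The abstract mentions preprocessed versions of datasets with missing values or unequal-length series but does not say how they were produced. As a rough illustration only, the sketch below shows one common way to handle both issues: linear interpolation across missing values, then padding every channel to the longest length in the collection. The function names (`fill_missing`, `pad_to_length`, `preprocess`), the zero pad value, and the nested-list data layout are assumptions made for this example, not the authors' method or the archive's API.

```python
import math


def fill_missing(channel):
    """Linearly interpolate NaN gaps; edge gaps copy the nearest observed value."""
    known = [i for i, v in enumerate(channel) if not math.isnan(v)]
    if not known:
        raise ValueError("channel has no observed values")
    filled = list(channel)
    for i, v in enumerate(channel):
        if not math.isnan(v):
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = channel[right]
        elif right is None:
            filled[i] = channel[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = channel[left] + weight * (channel[right] - channel[left])
    return filled


def pad_to_length(channel, length, pad_value=0.0):
    """Append pad_value until the channel has `length` points (assumed choice)."""
    return list(channel) + [pad_value] * (length - len(channel))


def preprocess(cases):
    """cases: list of cases, each a list of channels (lists of floats)."""
    target = max(len(ch) for case in cases for ch in case)
    return [
        [pad_to_length(fill_missing(ch), target) for ch in case]
        for case in cases
    ]


nan = float("nan")
# Two hypothetical two-channel cases: one with a gap, one shorter than the other.
cases = [
    [[1.0, nan, 3.0], [0.0, 0.5, 1.0]],
    [[2.0, 4.0], [nan, 1.0]],
]
clean = preprocess(cases)
```

After this step every case has the same number of time points and no missing values, which is the shape most scikit-learn-style estimators expect. Other choices, such as truncating to the shortest series or padding with the last observed value, are equally plausible and may suit a given dataset better.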