EngineAD: A Real-World Vehicle Engine Anomaly Detection Dataset

arXiv cs.LG / 3/30/2026

Key Points

  • EngineAD is a newly introduced multivariate benchmark dataset for vehicle anomaly detection, built from high-resolution telemetry collected from 25 commercial vehicles over six months.
  • The dataset emphasizes real operational behavior rather than synthetic data, with normal conditions and early indicators of incipient engine faults labeled through expert annotations.
  • Data is preprocessed into 300-timestep segments represented by eight principal components, and the authors provide an initial benchmark using nine one-class anomaly detection models.
  • Results show substantial performance variation across vehicles, highlighting difficulties in cross-vehicle generalization for real-world deployments.
  • The study finds that simple classical one-class methods such as K-Means and One-Class SVM can be highly competitive with deep learning models, and sometimes outperform them, on this segment-based evaluation.
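
The preprocessing step in the key points (300-timestep segments of eight principal components) can be illustrated with a minimal sketch. It is not the authors' pipeline: the summary does not say whether windows overlap, how the PCA was fitted, or how many raw sensor channels exist. The non-overlapping windows, the fleet-wide PCA fit, the 20 synthetic channels, and the function names `fit_pca` and `to_segments` are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, n_components=8):
    """Fit PCA on X (timesteps x sensors); return the mean and component matrix."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def to_segments(X, mean, components, seg_len=300):
    """Project onto the components, then cut into non-overlapping segments."""
    Z = (X - mean) @ components.T            # (timesteps, n_components)
    n_seg = Z.shape[0] // seg_len            # the incomplete tail is dropped
    return Z[: n_seg * seg_len].reshape(n_seg, seg_len, -1)

# Synthetic stand-in for telemetry: 1000 timesteps, 20 hypothetical sensors.
rng = np.random.default_rng(0)
telemetry = rng.normal(size=(1000, 20))

mean, comps = fit_pca(telemetry)
segments = to_segments(telemetry, mean, comps)
print(segments.shape)  # (3, 300, 8)
```

Each resulting segment is a 300 x 8 array, which matches the shape the summary describes. Real telemetry would likely need per-channel scaling before PCA, which this sketch omits.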

Abstract

The progress of Anomaly Detection (AD) in safety-critical domains, such as transportation, is severely constrained by the lack of large-scale, real-world benchmarks. To address this, we introduce EngineAD, a novel, multivariate dataset comprising high-resolution sensor telemetry collected from a fleet of 25 commercial vehicles over a six-month period. Unlike synthetic datasets, EngineAD features authentic operational data labeled with expert annotations, distinguishing normal states from subtle indicators of incipient engine faults. We preprocess the data into 300-timestep segments of 8 principal components and establish an initial benchmark using nine diverse one-class anomaly detection models. Our experiments reveal significant performance variability across the vehicle fleet, underscoring the challenge of cross-vehicle generalization. Furthermore, our findings corroborate recent literature, showing that simple classical methods (e.g., K-Means and One-Class SVM) are often highly competitive with, or superior to, deep learning approaches in this segment-based evaluation. By publicly releasing EngineAD, we aim to provide a realistic, challenging resource for developing robust and field-deployable anomaly detection and anomaly prediction solutions for the automotive industry.
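
To make the one-class, segment-based evaluation concrete, the following sketch trains K-Means on normal segments only, scores test segments by distance to the nearest centroid, and reports AUROC per vehicle. It is an illustration under stated assumptions, not the paper's benchmark code: the data is synthetic, the vehicle names are hypothetical, a separate model per vehicle is assumed, and the choice of four clusters and of AUROC as the metric is not taken from the source.

```python
import numpy as np

def kmeans_fit(X, k=4, n_iter=20, seed=0):
    """Plain Lloyd's algorithm; returns the k centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        for j in range(k):
            if np.any(nearest == j):         # empty clusters keep their centroid
                centroids[j] = X[nearest == j].mean(axis=0)
    return centroids

def anomaly_score(X, centroids):
    """Distance to the nearest centroid; larger means more anomalous."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.min(axis=1)

def auroc(scores, labels):
    """Chance that a random anomaly outscores a random normal (ties count half)."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def evaluate_vehicle(fault_shift, seed):
    """Train on normal segments only, then score a mixed test set (synthetic)."""
    rng = np.random.default_rng(seed)
    dim = 300 * 8                            # one flattened 300 x 8 segment
    train = rng.normal(size=(100, dim))
    test_normal = rng.normal(size=(40, dim))
    test_fault = rng.normal(loc=fault_shift, size=(10, dim))
    X_test = np.vstack([test_normal, test_fault])
    y_test = np.array([0] * 40 + [1] * 10)
    centroids = kmeans_fit(train)
    return auroc(anomaly_score(X_test, centroids), y_test)

# Hypothetical vehicles: one with a pronounced fault signature, one with a subtle one.
results = {"vehicle_A": evaluate_vehicle(1.0, seed=1),
           "vehicle_B": evaluate_vehicle(0.02, seed=2)}
for name, value in results.items():
    print(name, round(float(value), 3))
```

The two synthetic vehicles differ only in how far the fault segments are shifted from normal behavior, which is one simple way a per-vehicle score can swing from near-perfect to near-chance. The variability reported in the abstract may have other causes that this toy setup does not model.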