CLAD: Efficient Log Anomaly Detection Directly on Compressed Representations

arXiv cs.LG · April 15, 2026


Key Points

  • The paper introduces CLAD, a deep learning framework for log anomaly detection that operates directly on compressed log byte streams instead of requiring full decompression and parsing.
  • It leverages the observation that normal logs produce regular byte patterns under compression, while anomalies introduce systematic multi-scale deviations.
  • CLAD uses a purpose-built architecture combining a dilated convolutional byte encoder, a hybrid Transformer–mLSTM module, and four-way aggregation pooling to model these deviations from “opaque” compressed bytes.
  • It employs a two-stage training approach—masked pre-training followed by focal-contrastive fine-tuning—to address severe class imbalance typical in anomaly detection.
  • Across five datasets, CLAD achieves a state-of-the-art average F1-score of 0.9909, outperforming the best baseline by 2.72 percentage points while eliminating decompression and parsing overheads in streaming settings.
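The core intuition behind the first two points can be demonstrated with any off-the-shelf byte compressor. The sketch below (not from the paper; `zlib` and the synthetic log lines are illustrative assumptions) shows that a stream of repetitive "normal" log lines compresses very tightly, while a single injected anomalous line disrupts the match structure the compressor was exploiting and leaves a visible footprint in the compressed bytes.

```python
# Illustrative sketch: anomalies perturb compressed byte streams.
# The log lines and use of zlib are hypothetical examples, not CLAD's pipeline.
import zlib

normal_line = b"INFO dfs.DataNode: Received block blk_001 of size 67108864\n"
anomaly_line = b"ERROR dfs.DataNode: java.io.IOException: Connection reset\n"

normal_stream = normal_line * 50
anomalous_stream = normal_line * 30 + anomaly_line + normal_line * 20

c_normal = zlib.compress(normal_stream)
c_anomalous = zlib.compress(anomalous_stream)

# The repetitive stream collapses to a few back-references; the novel
# anomalous line must be encoded as fresh literals, enlarging the output.
print(len(normal_stream), "->", len(c_normal))
print(len(anomalous_stream), "->", len(c_anomalous))
```

Running this shows the anomalous stream compresses measurably worse than the normal one, which is the multi-scale "deviation" signal CLAD's encoder is designed to read without ever decompressing the stream.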

Abstract

The explosive growth of system logs makes streaming compression essential, yet existing log anomaly detection (LAD) methods incur severe pre-processing overhead by requiring full decompression and parsing. We introduce CLAD, the first deep learning framework to perform LAD directly on compressed byte streams. CLAD bypasses these bottlenecks by exploiting a key insight: normal logs compress into regular byte patterns, while anomalies systematically disrupt them. To extract these multi-scale deviations from opaque bytes, we propose a purpose-built architecture integrating a dilated convolutional byte encoder, a hybrid Transformer–mLSTM, and four-way aggregation pooling. This is coupled with a two-stage training strategy of masked pre-training and focal-contrastive fine-tuning to effectively handle severe class imbalance. Evaluated across five datasets, CLAD achieves a state-of-the-art average F1-score of 0.9909 and outperforms the best baseline by 2.72 percentage points. It delivers superior accuracy while completely eliminating decompression and parsing overheads, offering a robust solution that generalizes to structured streaming compressors.