CLAD: Efficient Log Anomaly Detection Directly on Compressed Representations

arXiv cs.LG · April 15, 2026


Key Points

  • The paper introduces CLAD, a deep learning framework for log anomaly detection that operates directly on compressed log byte streams instead of requiring full decompression and parsing.
  • It leverages the observation that normal logs produce regular byte patterns under compression, while anomalies introduce systematic multi-scale deviations.
  • CLAD uses a purpose-built architecture combining a dilated convolutional byte encoder, a hybrid Transformer–mLSTM module, and four-way aggregation pooling to model these deviations from “opaque” compressed bytes.
  • It employs a two-stage training approach—masked pre-training followed by focal-contrastive fine-tuning—to address severe class imbalance typical in anomaly detection.
  • Across five datasets, CLAD achieves a state-of-the-art average F1-score of 0.9909, outperforming the best baseline by 2.72 percentage points while eliminating decompression and parsing overheads in streaming settings.
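The core intuition behind the first two points can be demonstrated with any off-the-shelf byte compressor. The sketch below (not from the paper; `zlib` and the synthetic log lines are illustrative assumptions) shows that a stream of repetitive "normal" log lines compresses very tightly, while a single injected anomalous line disrupts the match structure the compressor was exploiting and leaves a visible footprint in the compressed bytes.

```python
# Illustrative sketch: anomalies perturb compressed byte streams.
# The log lines and use of zlib are hypothetical examples, not CLAD's pipeline.
import zlib

normal_line = b"INFO dfs.DataNode: Received block blk_001 of size 67108864\n"
anomaly_line = b"ERROR dfs.DataNode: java.io.IOException: Connection reset\n"

normal_stream = normal_line * 50
anomalous_stream = normal_line * 30 + anomaly_line + normal_line * 20

c_normal = zlib.compress(normal_stream)
c_anomalous = zlib.compress(anomalous_stream)

# The repetitive stream collapses to a few back-references; the novel
# anomalous line must be encoded as fresh literals, enlarging the output.
print(len(normal_stream), "->", len(c_normal))
print(len(anomalous_stream), "->", len(c_anomalous))
```

Running this shows the anomalous stream compresses measurably worse than the normal one, which is the multi-scale "deviation" signal CLAD's encoder is designed to read without ever decompressing the stream.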

Abstract

The explosive growth of system logs makes streaming compression essential, yet existing log anomaly detection (LAD) methods incur severe pre-processing overhead by requiring full decompression and parsing. We introduce CLAD, the first deep learning framework to perform LAD directly on compressed byte streams. CLAD bypasses these bottlenecks by exploiting a key insight: normal logs compress into regular byte patterns, while anomalies systematically disrupt them. To extract these multi-scale deviations from opaque bytes, we propose a purpose-built architecture integrating a dilated convolutional byte encoder, a hybrid Transformer–mLSTM, and four-way aggregation pooling. This is coupled with a two-stage training strategy of masked pre-training and focal-contrastive fine-tuning to effectively handle severe class imbalance. Evaluated across five datasets, CLAD achieves a state-of-the-art average F1-score of 0.9909 and outperforms the best baseline by 2.72 percentage points. It delivers superior accuracy while completely eliminating decompression and parsing overheads, offering a robust solution that generalizes to structured streaming compressors.