Anomaly Detection Belongs in Your Database — built SIMD-accelerated isolation forests into Stratum's SQL engine [P]

Reddit r/MachineLearning / 5/6/2026

📰 NewsDeveloper Stack & InfrastructureTools & Practical UsageModels & Research

共有:

Key Points

Stratum, Stratum’s columnar analytics SQL engine for the JVM, now supports native anomaly detection by training and scoring isolation forest models directly from SQL.
Users can run queries like filtering transactions by an ANOMALY_SCORE threshold without using Python or building an external export/scoring pipeline.
The implementation claims SIMD acceleration via Java’s Vector API, achieving around 6 microseconds per transaction and integrating scoring into the query execution path.
The write-up explains why anomaly detection belongs in the database and provides benchmarks comparing the approach against PyOD and scikit-learn.
Stratum is open source under Apache 2.0, and the post highlights performance benefits from optimizations such as zone map pruning and chunked streaming during execution.

We added native anomaly detection in Stratum, our columnar analytics engine for the JVM. Train and score isolation forest models entirely from SQL — no Python, no export pipeline:

SELECT * FROM transactions WHERE ANOMALY_SCORE('fraud_model') > 0.7;

6 microseconds per transaction, SIMD-accelerated, runs inside the query engine. The full write-up covers why we built it, how isolation forests work, and benchmarks against PyOD/scikit-learn:

https://datahike.io/notes/anomaly-detection-in-your-database/

Stratum is open source (Apache 2.0): https://github.com/replikativ/stratum

Happy to answer questions about the implementation — the isolation forest is pure Java with Vector API SIMD, scoring is fused into the query execution pipeline so it benefits from zone map pruning and chunked streaming.

submitted by /u/flyingfruits
[link] [comments]

Black Hat USA

AI Business

Transform Your Blurry Photos into HD Masterpieces, Instantly!

Dev.to

6 New Moats for AI Agent Infrastructure — Trust Score, Deployment, SLA, Identity, Compliance-as-Code

Dev.to

Google Home’s Gemini AI can handle more complicated requests

The Verge

Exit Code 2: How Claude Hooks Turn Agentic Rules Into Runtime Barriers

Dev.to

Anomaly Detection Belongs in Your Database — built SIMD-accelerated isolation forests into Stratum's SQL engine [P]

Key Points

Related Articles

Black Hat USA

Transform Your Blurry Photos into HD Masterpieces, Instantly!

6 New Moats for AI Agent Infrastructure — Trust Score, Deployment, SLA, Identity, Compliance-as-Code

Google Home’s Gemini AI can handle more complicated requests

Exit Code 2: How Claude Hooks Turn Agentic Rules Into Runtime Barriers

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer

Key Points

Related Articles

Black Hat USA

Transform Your Blurry Photos into HD Masterpieces, Instantly!

6 New Moats for AI Agent Infrastructure — Trust Score, Deployment, SLA, Identity, Compliance-as-Code

Google Home&#8217;s Gemini AI can handle more complicated requests

Exit Code 2: How Claude Hooks Turn Agentic Rules Into Runtime Barriers

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer

Google Home’s Gemini AI can handle more complicated requests