Promoting Simple Agents: Ensemble Methods for Event-Log Prediction

arXiv cs.LG / 4/24/2026


Key Points

  • The paper compares simple automata-based n-gram models with neural sequence models (LSTM and Transformers) for next-activity prediction in streaming event logs.
  • Experiments on both synthetic patterns and five process-mining datasets find that well-configured n-grams can reach accuracy comparable to neural models while using far fewer computational resources.
  • It reports that windowed neural architectures can produce unstable performance, whereas n-grams deliver more stable and consistent accuracy.
  • Classical ensemble approaches (e.g., voting across many n-grams) improve n-gram accuracy but increase memory use and inference latency due to parallel agent execution.
  • The authors introduce a “promotion” ensemble algorithm that dynamically selects between two active models during inference, achieving similar-or-better accuracy than non-windowed neural models with reduced computational cost.
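To make the baseline concrete, here is a minimal sketch of a sliding-window n-gram next-activity predictor over a streaming event log. This is a hypothetical illustration of the general technique, not the paper's automata-based implementation; the class and method names are invented for this example.

```python
from collections import defaultdict, Counter

class NGramPredictor:
    """Sliding-window n-gram model for next-activity prediction.

    Illustrative sketch only; the paper's automata-based models
    may be represented and updated differently.
    """

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # context tuple -> next-activity counts
        self.context = []                   # last n-1 activities observed

    def update(self, activity):
        # Record the observed transition for the current context,
        # then slide the context window forward by one activity.
        if len(self.context) == self.n - 1:
            self.counts[tuple(self.context)][activity] += 1
        self.context = (self.context + [activity])[-(self.n - 1):]

    def predict(self):
        # Return the most frequent continuation of the current
        # context, or None if this context was never seen.
        dist = self.counts.get(tuple(self.context))
        return dist.most_common(1)[0][0] if dist else None

# Toy event stream: after seeing ("register", "check") once followed
# by "approve", the model predicts "approve" for that context again.
model = NGramPredictor(n=3)
for a in ["register", "check", "approve", "register", "check"]:
    model.update(a)
print(model.predict())  # → approve
```

The per-event cost is a dictionary update and a bounded-length list slice, which is why such models use far fewer resources than neural sequence models.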

Abstract

We compare lightweight automata-based models (n-grams) with neural architectures (LSTM, Transformer) for next-activity prediction in streaming event logs. Experiments on synthetic patterns and five real-world process mining datasets show that n-grams with appropriate context windows achieve comparable accuracy to neural models while requiring substantially fewer resources. Unlike windowed neural architectures, which show unstable performance patterns, n-grams provide stable and consistent accuracy. While we demonstrate that classical ensemble methods like voting improve n-gram performance, they require running many agents in parallel during inference, increasing memory consumption and latency. We propose an ensemble method, the promotion algorithm, that dynamically selects between two active models during inference, reducing overhead compared to classical voting schemes. On real-world datasets, these ensembles match or exceed the accuracy of non-windowed neural models with lower computational cost.
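The abstract's promotion idea can be sketched as follows: keep only two models active, serve predictions from an incumbent, score a challenger in the shadow, and promote the challenger when its recent accuracy overtakes the incumbent's. This is one plausible reading of "dynamically selects between two active models"; the paper's actual promotion algorithm (how candidates are chosen, scored, and swapped) may differ, and all names here are invented for illustration.

```python
from collections import deque

class PromotionEnsemble:
    """Two-model promotion sketch: incumbent serves predictions,
    challenger is scored in parallel and promoted when its windowed
    accuracy is higher. Hypothetical, not the paper's exact scheme."""

    def __init__(self, incumbent, challenger, window=50):
        self.incumbent, self.challenger = incumbent, challenger
        self.inc_hits = deque(maxlen=window)  # incumbent's recent hits
        self.ch_hits = deque(maxlen=window)   # challenger's recent hits

    def predict(self):
        # Only the incumbent's prediction is served at inference time.
        return self.incumbent.predict()

    def observe(self, activity):
        # Score both models against the ground-truth next activity,
        # then let both learn from it.
        self.inc_hits.append(self.incumbent.predict() == activity)
        self.ch_hits.append(self.challenger.predict() == activity)
        self.incumbent.update(activity)
        self.challenger.update(activity)
        # Promote the challenger if it leads on the recent window.
        if sum(self.ch_hits) > sum(self.inc_hits):
            self.incumbent, self.challenger = self.challenger, self.incumbent
            self.inc_hits, self.ch_hits = self.ch_hits, self.inc_hits

# Demo with trivial stub models that always predict one activity.
class Const:
    def __init__(self, a): self.a = a
    def predict(self): return self.a
    def update(self, activity): pass

ens = PromotionEnsemble(Const("a"), Const("b"))
for _ in range(5):
    ens.observe("b")        # stream of "b"s favors the "b" model
print(ens.predict())        # → b
```

Unlike voting over many agents, at most two models run per event, which is the source of the reduced memory and latency overhead the authors report.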