BiTA: Bidirectional Gated Recurrent Unit-Transformer Aggregator in a Temporal Graph Network Framework for Alert Prediction in Computer Networks

arXiv cs.LG / 4/28/2026


Key Points

  • The paper introduces BiTA, a new temporal graph learning framework for proactive alert prediction in computer networks, aimed at improving defenses against evolving cyber threats.
  • BiTA enhances temporal graph neural networks by redesigning the temporal aggregation to jointly capture bidirectional sequential dependencies and long-range contextual relations, without changing the original TGN memory/message-passing structure.
  • Experiments on real-world alert datasets show that BiTA achieves significant gains over state-of-the-art temporal graph models across multiple metrics including AUC, average precision, mean reciprocal rank, and per-category accuracy.
  • BiTA delivers stronger performance in both transductive and inductive evaluation settings, indicating improved robustness and generalization in dynamic network environments.
  • The authors position BiTA as a scalable and interpretable approach for real-time cyber threat anticipation, supporting more intelligent intrusion detection systems.
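
Two of the ranking metrics named above, mean reciprocal rank and average precision, can be sketched in plain Python. This is an illustration only: the function names, the one-positive-versus-sampled-negatives setup, and the pessimistic tie handling are assumptions, since the summary does not describe the paper's evaluation protocol.

```python
def mean_reciprocal_rank(pos_scores, neg_scores):
    """MRR where pos_scores[i] is the score of the i-th true event and
    neg_scores[i] is a list of scores for its sampled negatives (assumed setup)."""
    total = 0.0
    for pos, negs in zip(pos_scores, neg_scores):
        # Rank of the true event among its negatives; ties count against it.
        rank = 1 + sum(1 for s in negs if s >= pos)
        total += 1.0 / rank
    return total / len(pos_scores)


def average_precision(labels, scores):
    """Average precision for binary labels (1 = positive), ranked by score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for k, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / k  # precision at each position holding a positive
    return total / hits if hits else 0.0


# Toy scores, not data from the paper.
mrr = mean_reciprocal_rank([0.9, 0.2], [[0.1, 0.3], [0.5, 0.1]])
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

In the toy data the first true event outranks both of its negatives and the second is beaten by one, so the reciprocal ranks are 1 and 1/2.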

Abstract

Proactive alert prediction in computer networks is critical for mitigating evolving cyber threats and enabling timely defensive actions. Temporal Graph Neural Networks (TGNs) provide a principled framework for modeling time-evolving interactions; however, existing TGN-based methods predominantly rely on unidirectional or single-mechanism temporal aggregation, which limits their ability to capture recursive, multi-scale temporal patterns commonly observed in real-world attack behaviors. In this paper, we propose BiTA, a Bidirectional Gated Recurrent Unit-Transformer Aggregator for temporal graph learning. Rather than introducing a deeper or higher-capacity model, BiTA redesigns the temporal aggregation function within the TGN framework by jointly encoding bidirectional sequential dependencies and long-range contextual relations over each node's temporal neighborhood. This aggregation strategy enables complementary temporal reasoning at different scales while preserving the original TGN memory and message-passing structure. We evaluate BiTA on real-world alert datasets, demonstrating significant improvements in key performance metrics such as area under the curve, average precision, mean reciprocal rank, and per-category prediction accuracy when compared to state-of-the-art temporal graph models. BiTA outperforms baseline methods under both transductive and inductive settings, highlighting its robustness and generalization capabilities in dynamic network environments. BiTA is a scalable and interpretable framework for real-time cyber threat anticipation, paving the way toward more intelligent and adaptive intrusion detection systems.
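
The aggregation idea described in the abstract can be sketched in PyTorch as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `BiGRUAttentionAggregator`, the use of a single multi-head self-attention layer as the Transformer branch, the masked mean pooling, the concatenate-then-project fusion, and all dimensions are hypothetical choices, because the abstract does not specify them. It also assumes each node has at least one message and that sequences are time-ordered and right-padded.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence


class BiGRUAttentionAggregator(nn.Module):
    """Hypothetical aggregator: a bidirectional GRU branch plus a
    self-attention branch over each node's message sequence."""

    def __init__(self, msg_dim: int, hidden_dim: int, num_heads: int = 2):
        super().__init__()
        # Branch 1: reads the sequence forwards and backwards.
        self.bigru = nn.GRU(msg_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Branch 2: every message attends to every other message.
        self.attn = nn.MultiheadAttention(msg_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(msg_dim)
        # Fusion (assumed): concatenate both summaries, project back to msg_dim.
        self.fuse = nn.Linear(2 * hidden_dim + msg_dim, msg_dim)

    def forward(self, messages: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
        # messages: (num_nodes, seq_len, msg_dim)
        # pad_mask: (num_nodes, seq_len) bool, True marks padding
        lengths = (~pad_mask).sum(dim=1).cpu()
        packed = pack_padded_sequence(
            messages, lengths, batch_first=True, enforce_sorted=False
        )
        _, h_n = self.bigru(packed)  # h_n: (2, num_nodes, hidden_dim)
        seq_summary = torch.cat([h_n[0], h_n[1]], dim=-1)

        attended, _ = self.attn(
            messages, messages, messages, key_padding_mask=pad_mask
        )
        attended = self.norm(messages + attended)  # residual connection
        valid = (~pad_mask).unsqueeze(-1).to(messages.dtype)
        context = (attended * valid).sum(dim=1) / valid.sum(dim=1)  # masked mean

        # One aggregated vector per node: (num_nodes, msg_dim)
        return self.fuse(torch.cat([seq_summary, context], dim=-1))


# Toy input: 3 nodes, up to 5 messages each, 8 features per message.
torch.manual_seed(0)
agg = BiGRUAttentionAggregator(msg_dim=8, hidden_dim=16)
messages = torch.randn(3, 5, 8)
pad_mask = torch.tensor([
    [False, False, False, False, False],
    [False, False, False, True, True],
    [False, True, True, True, True],
])
out = agg(messages, pad_mask)
```

Because the output has the same width as a single message, a module of this shape could in principle stand in for a simpler aggregator (for example, keeping only the last message or averaging) without touching the downstream memory update, which is consistent with the abstract's statement that the TGN memory and message-passing structure is preserved.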