CSLE: A Reinforcement Learning Platform for Autonomous Security Management

arXiv cs.AI / 4/20/2026

Key Points

  • The paper presents CSLE, a reinforcement learning platform for autonomous and adaptive security management in networked systems that enables experimentation under realistic conditions.
  • CSLE pairs two systems: an emulation system that replicates key components of the target system in a virtualized environment, where measurements and logs are gathered to identify a system model (e.g., a Markov decision process), and a simulation system that learns security strategies efficiently through simulations of that model.
  • Learned strategies are then evaluated and refined in the emulation system to close the gap between theoretical and operational performance.
  • The authors demonstrate CSLE through four security-management use cases (flow control, replication control, segmentation control, and recovery control) and report near-optimal security management in an environment that approximates an operational system.
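
The model-identification step in the key points can be illustrated with a minimal sketch. It assumes the emulation logs have already been reduced to (state, action, next state) triples and estimates transition probabilities by counting. The state names, action names, and the function estimate_transition_model are hypothetical and are not taken from CSLE; the paper's actual identification method may differ.

```python
from collections import Counter, defaultdict


def estimate_transition_model(traces):
    """Maximum-likelihood estimate of P(s' | s, a) from logged (s, a, s') triples."""
    counts = defaultdict(Counter)
    for state, action, next_state in traces:
        counts[(state, action)][next_state] += 1

    model = {}
    for key, next_counts in counts.items():
        total = sum(next_counts.values())
        # Normalize the counts for each (state, action) pair into probabilities.
        model[key] = {s2: n / total for s2, n in next_counts.items()}
    return model


# Hypothetical traces standing in for measurements gathered in an emulation.
traces = [
    ("healthy", "wait", "healthy"),
    ("healthy", "wait", "healthy"),
    ("healthy", "wait", "compromised"),
    ("healthy", "wait", "healthy"),
    ("compromised", "recover", "healthy"),
    ("compromised", "wait", "compromised"),
]

model = estimate_transition_model(traces)
print(model[("healthy", "wait")])
```

With more traces the estimates converge to the emulated system's empirical behavior, which is what lets a simulator built on the model stand in for the slower emulation during learning.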

Abstract

Reinforcement learning is a promising approach to autonomous and adaptive security management in networked systems. However, current reinforcement learning solutions for security management are mostly limited to simulation environments and it is unclear how they generalize to operational systems. In this paper, we address this limitation by presenting CSLE: a reinforcement learning platform for autonomous security management that enables experimentation under realistic conditions. Conceptually, CSLE encompasses two systems. First, it includes an emulation system that replicates key components of the target system in a virtualized environment. We use this system to gather measurements and logs, based on which we identify a system model, such as a Markov decision process. Second, it includes a simulation system where security strategies are efficiently learned through simulations of the system model. The learned strategies are then evaluated and refined in the emulation system to close the gap between theoretical and operational performance. We demonstrate CSLE through four use cases: flow control, replication control, segmentation control, and recovery control. Through these use cases, we show that CSLE enables near-optimal security management in an environment that approximates an operational system.
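
To make the simulation step concrete, the sketch below computes a strategy for a toy two-state recovery problem once a Markov decision process is available. Everything in it is an assumption for illustration: the states, actions, transition probabilities, and costs are invented, and value iteration is used as a simple stand-in solver rather than as the algorithm the authors use.

```python
STATES = ["healthy", "compromised"]
ACTIONS = ["wait", "recover"]

# Hypothetical transition model: P[(s, a)] maps next states to probabilities.
P = {
    ("healthy", "wait"): {"healthy": 0.8, "compromised": 0.2},
    ("healthy", "recover"): {"healthy": 1.0},
    ("compromised", "wait"): {"compromised": 1.0},
    ("compromised", "recover"): {"healthy": 1.0},
}

# Hypothetical per-step costs: recovery costs 1, staying compromised costs 5.
COST = {
    ("healthy", "wait"): 0.0,
    ("healthy", "recover"): 1.0,
    ("compromised", "wait"): 5.0,
    ("compromised", "recover"): 1.0,
}


def q_value(s, a, V, gamma):
    """Expected discounted cost of taking action a in state s, then following V."""
    return COST[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)].items())


def value_iteration(gamma=0.9, tol=1e-8):
    """Iterate the Bellman optimality update until the values stop changing."""
    V = {s: 0.0 for s in STATES}
    while True:
        new_V = {s: min(q_value(s, a, V, gamma) for a in ACTIONS) for s in STATES}
        converged = max(abs(new_V[s] - V[s]) for s in STATES) < tol
        V = new_V
        if converged:
            break
    # The strategy picks the cost-minimizing action in each state.
    policy = {s: min(ACTIONS, key=lambda a: q_value(s, a, V, gamma)) for s in STATES}
    return V, policy


V, policy = value_iteration()
print(policy)
```

In this toy model the computed strategy waits while the system is healthy and recovers once it is compromised. In the workflow the abstract describes, a strategy learned this way would then be evaluated and refined in the emulation system, since the identified model only approximates the target system.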