Session Risk Memory (SRM): Temporal Authorization for Deterministic Pre-Execution Safety Gates
arXiv cs.AI / 3/25/2026
Key Points
- The paper argues that deterministic per-action safety gates can be defeated by distributed attacks that split harmful intent across individually compliant steps, leaving a gap in “temporal” security at the session/trajectory level.
- It introduces Session Risk Memory (SRM), a lightweight deterministic module that adds trajectory-level authorization by maintaining a compact semantic centroid and accumulating a risk signal via an exponential moving average of baseline-subtracted gate outputs.
- SRM is designed to require no additional model components, training, or probabilistic inference, because it operates on the same semantic vector representation as the underlying authorization gate.
- Experiments on an 80-session multi-turn benchmark (slow-burn exfiltration, gradual privilege escalation, and compliance drift) show ILION+SRM achieving F1=1.0000 with a 0% false-positive rate versus stateless ILION at F1=0.9756 with a 5% false-positive rate; both configurations maintain 100% attack detection.
- The approach formalizes a distinction between spatial authorization consistency (per action) and temporal authorization consistency (over a trajectory), aiming to provide a principled basis for session-level safety in agentic systems at less than 250 microseconds of per-turn overhead.
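The mechanism in the second bullet can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the class name, the decay factor `alpha`, the `baseline` value, and the `threshold` are all assumed for demonstration, and the paper's actual gate scoring and centroid usage may differ.

```python
import numpy as np

class SessionRiskMemory:
    """Sketch of an SRM-style trajectory-level authorization check.

    Hypothetical parameters: alpha (EMA decay), baseline (expected
    benign gate score), and threshold are illustrative, not from the paper.
    """

    def __init__(self, dim, alpha=0.2, baseline=0.1, threshold=0.5):
        self.alpha = alpha            # EMA weight for the newest turn
        self.baseline = baseline      # benign per-turn gate score to subtract
        self.threshold = threshold    # session-level risk cutoff
        self.centroid = np.zeros(dim) # compact semantic centroid of the session
        self.risk = 0.0               # accumulated trajectory risk signal
        self.n = 0                    # turns seen so far

    def update(self, embedding, gate_score):
        """Fold one turn's semantic vector and gate output into session state.

        Returns True when the accumulated risk exceeds the threshold,
        i.e. the session should be blocked even though each individual
        action passed the per-action gate.
        """
        # Running mean of per-turn semantic vectors -> session centroid.
        self.n += 1
        self.centroid += (embedding - self.centroid) / self.n
        # Exponential moving average of baseline-subtracted gate outputs:
        # turns at or below baseline decay the risk, risky turns accumulate it.
        self.risk = (1 - self.alpha) * self.risk \
            + self.alpha * (gate_score - self.baseline)
        return self.risk > self.threshold
```

With these (assumed) numbers, a sequence of individually "compliant" turns whose gate scores sit moderately above baseline will eventually trip the session-level threshold, which is the distributed-attack gap the paper targets; a fully benign session whose scores stay at baseline never accumulates risk.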