Springdrift: An Auditable Persistent Runtime for LLM Agents with Case-Based Memory, Normative Safety, and Ambient Self-Perception

arXiv cs.AI / 4/7/2026



Key Points

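Springdrift is a persistent runtime for long-running LLM agents that integrates append-only memory, supervised processes, and git-backed recovery to make execution auditable.
It combines a case-based reasoning memory layer with hybrid retrieval (evaluated against a dense cosine-similarity baseline), aiming to preserve context across sessions.
It introduces safety gating through a deterministic normative calculus, designed to record which axioms each decision rested on as an auditable "axiom trail".
It injects a self-state representation (the sensorium) into each cycle without tool calls, and attempts self-diagnosis and failure-mode classification based on this continuous self-perception.
A single instance was operated for 23 days (19 operating days); the report presents as case studies the agent's uninstructed maintenance of context across email and web channels, its diagnosis of its own infrastructure bugs, and its identification of a vulnerability.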
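
To make the "hybrid retrieval versus dense cosine baseline" point concrete, here is a minimal Python sketch. It is not the paper's retrieval method, which this summary does not specify. It assumes the hybrid score is a weighted blend of cosine similarity and a simple word-overlap (Jaccard) score; the names `hybrid_rank`, `alpha`, and the `CASES` records are hypothetical and exist only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def jaccard(a, b):
    """Word-level Jaccard overlap between two strings (assumed lexical signal)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def hybrid_rank(query_vec, query_text, cases, alpha=0.5):
    """Rank case ids by alpha * dense + (1 - alpha) * lexical, best first.

    alpha=1.0 reduces to a dense cosine-only ranking, which plays the role
    of the baseline; the blending rule itself is an assumption.
    """
    scored = []
    for case in cases:
        dense = cosine(query_vec, case["vec"])
        lexical = jaccard(query_text, case["text"])
        scored.append((alpha * dense + (1 - alpha) * lexical, case["id"]))
    # Sort by descending score, breaking ties by id so the order is deterministic.
    return [cid for _, cid in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical stored cases with toy 2-dimensional embeddings.
CASES = [
    {"id": "c1", "vec": [1.0, 0.0], "text": "smtp relay timeout"},
    {"id": "c2", "vec": [0.0, 1.0], "text": "disk quota exceeded"},
]
```

Calling `hybrid_rank` with `alpha=1.0` and again with a lower `alpha` on the same query shows where the lexical signal changes the ordering relative to the dense-only ranking.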

Abstract

We present Springdrift, a persistent runtime for long-lived LLM agents. The system integrates an auditable execution substrate (append-only memory, supervised processes, git-backed recovery), a case-based reasoning memory layer with hybrid retrieval (evaluated against a dense cosine baseline), a deterministic normative calculus for safety gating with auditable axiom trails, and continuous ambient self-perception via a structured self-state representation (the sensorium) injected each cycle without tool calls. These properties support behaviours difficult to achieve in session-bounded systems: cross-session task continuity, cross-channel context maintenance, end-to-end forensic reconstruction of decisions, and self-diagnostic behaviour. We report on a single-instance deployment over 23 days (19 operating days), during which the agent diagnosed its own infrastructure bugs, classified failure modes, identified an architectural vulnerability, and maintained context across email and web channels -- without explicit instruction. We introduce the term Artificial Retainer for this category: a non-human system with persistent memory, defined authority, domain-specific autonomy, and forensic accountability in an ongoing relationship with a specific principal -- distinguished from software assistants and autonomous agents, drawing on professional retainer relationships and the bounded autonomy of trained working animals. This is a technical report on a systems design and deployment case study, not a benchmark-driven evaluation. Evidence is from a single instance with a single operator, presented as illustration of what these architectural properties can support in practice. Implemented in Gleam on Erlang/OTP. Code, artefacts, and redacted operational logs will be available at https://github.com/seamus-brady/springdrift upon publication.
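
The idea of deterministic safety gating with an auditable axiom trail, written to append-only memory, can be illustrated with a small Python sketch. This is not Springdrift's normative calculus (the system itself is written in Gleam, and the abstract does not describe its axioms). The `Axiom` type, the two example axioms, and the `gate` and `append_record` functions are hypothetical; the sketch only assumes that each axiom is a pure predicate over a proposed action and that every evaluation is recorded.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Axiom:
    """A named, pure rule; `forbids` returns True when the action violates it."""
    name: str
    forbids: Callable[[dict], bool]


# Hypothetical axioms, invented for illustration only.
AXIOMS = (
    Axiom("no_unapproved_spend",
          lambda a: a.get("cost", 0) > 0 and not a.get("approved", False)),
    Axiom("no_external_delete",
          lambda a: a.get("kind") == "delete" and a.get("external", False)),
)


def gate(action, axioms=AXIOMS):
    """Evaluate every axiom in a fixed order and return (allowed, trail).

    No randomness or model call is involved, so the same action always
    yields the same verdict and the same trail.
    """
    trail = [{"axiom": ax.name, "violated": bool(ax.forbids(action))}
             for ax in axioms]
    allowed = not any(step["violated"] for step in trail)
    return allowed, trail


def append_record(log, action):
    """Append the decision and its axiom trail to a log that is only ever appended to."""
    allowed, trail = gate(action)
    log.append(json.dumps(
        {"action": action, "allowed": allowed, "trail": trail},
        sort_keys=True))
    return allowed
```

Because each record carries the full trail rather than only the verdict, a later reader can reconstruct which rule blocked an action without re-running the agent. Here the log is a plain list treated as append-only by convention; a real substrate would need to enforce that property.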