AI Safety as Control of Irreversibility: A Systems Framework for Decision-Energy and Sovereignty Boundaries

arXiv cs.AI / 5/5/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper argues that AI safety must be reframed because AI systems reduce “deployment friction,” allowing capabilities to be copied, invoked, embedded, and scaled cheaply across institutions.
  • It defines decision-energy density—the rate-weighted capacity of a node to generate, evaluate, select, and execute consequential decisions—as a core systems-level driver of risk (an illustrative formalization follows this list).
  • The authors propose three sovereignty boundaries—irreversible decision authority, physical resource mobilization authority, and self-expansion authority—that determine whether AI remains an amplifier within a human-governed system or becomes a de facto control center.
  • The model suggests that efficiency pressure, path dependence, scale feedback, and weak boundary constraints can concentrate decision-energy in the most efficient node, diffusing responsibility and raising the probability of irreversible system-level failure even when per-action error rates remain low.
  • The main result, a boundary stabilization theorem, claims that safety does not require proving systems are always correct; instead, it requires institutional and technical designs (layered control, authorization, and externally reviewable limits) that prevent irreversible power from being released by a single high-efficiency node.
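
One way to make the density notion concrete is the illustrative formalization below; the notation ($E(v)$, $D(v)$, $r_k$, $c_k$) is ours for exposition and is not taken from the paper:

$$
E(v) = \sum_{k \in D(v)} r_k \, c_k
$$

Here $D(v)$ is the set of decision classes node $v$ can generate, evaluate, select, and execute, $r_k$ is the rate at which class-$k$ decisions flow through the node, and $c_k$ weights their consequence. On this reading, falling deployment friction raises the rates $r_k$ across many nodes at once, so system-level risk tracks how $E(v)$ concentrates across nodes rather than any single node's per-action error rate.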

Abstract

Recent AI systems compress the distance between capability growth and capability deployment. Earlier high-risk technologies were slowed by capital intensity, physical bottlenecks, organizational inertia, and specialized supply chains. By contrast, AI capabilities can be copied, invoked, embedded in workflows, and scaled across institutions at low marginal cost. This paper argues that declining deployment friction changes the safety problem at its root. Safety is not only local output correctness or preference alignment, but the control of irreversibility under rising decision density. The paper formalizes this claim through decision-energy density: the rate-weighted capacity of a node to generate, evaluate, select, and execute consequential decisions. It then identifies three sovereignty boundaries that determine whether AI remains an amplifier within a human-governed system or becomes a de facto control center: irreversible decision authority, physical resource mobilization authority, and self-expansion authority. The model shows how efficiency pressure, path dependence, scale feedback, and weak boundary constraints concentrate decision-energy in the most efficient node. This concentration can diffuse responsibility and raise the probability of irreversible system-level loss even when local per-action error rates remain low. The main result is a boundary stabilization theorem. It shows that safety does not require proving that advanced systems are always correct. Instead, it requires institutional and technical designs that prevent irreversible power from being released by a single high-efficiency node. The paper reframes AI safety as layered control, authorization, and externally reviewable limits, linking alignment, security engineering, organizational economics, and institutional design.
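
As a reading aid, the minimal sketch below shows how the three sovereignty boundaries might be enforced as layered, externally reviewable gates in front of a high-efficiency node. It is our own illustration under assumed names (`Action`, `SovereigntyGate`, the boundary flags), not an interface or algorithm from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    """A proposed action from a high-efficiency node (illustrative model)."""
    description: str
    irreversible: bool = False         # crosses irreversible-decision authority?
    mobilizes_resources: bool = False  # crosses physical-resource-mobilization authority?
    self_expanding: bool = False       # crosses self-expansion authority?

@dataclass
class SovereigntyGate:
    """Layered boundary checks with an externally reviewable audit log.

    The gate never tries to prove the proposing node correct; it only
    prevents boundary-crossing actions from executing without explicit
    human authorization, mirroring the boundary-stabilization idea.
    """
    audit_log: list = field(default_factory=list)

    def authorize(self, action: Action, human_approved: bool = False) -> bool:
        crossed = [name for name, flag in (
            ("irreversible_decision", action.irreversible),
            ("resource_mobilization", action.mobilizes_resources),
            ("self_expansion", action.self_expanding),
        ) if flag]
        allowed = not crossed or human_approved
        # Every decision is logged so the limits stay externally reviewable.
        self.audit_log.append(
            {"action": action.description, "boundaries": crossed, "allowed": allowed}
        )
        return allowed

gate = SovereigntyGate()
print(gate.authorize(Action("draft a market report")))                  # True: amplifier role
print(gate.authorize(Action("delete all backups", irreversible=True)))  # False: needs approval
```

Blocking rather than scoring boundary-crossing actions mirrors the framework's central move: the gate does not certify that the node is correct, it only controls whether irreversible power can be released without external authorization.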