Incompleteness of AI Safety Verification via Kolmogorov Complexity

arXiv cs.AI / 4/7/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper argues that the difficulty of verifying AI safety and policy compliance stems not only from computational limits or model expressiveness but also from intrinsic information-theoretic barriers.
  • It formalizes policy compliance as a verification problem over encoded system behaviors and analyzes its limits using Kolmogorov complexity.
  • The authors prove an incompleteness theorem: for any fixed sound, computably enumerable verifier there is a complexity threshold, depending on the verifier, beyond which true policy-compliant instances cannot be certified (see the formal sketch after this list).
  • This implies that no finite formal verifier can guarantee certification for all policy-compliant instances with arbitrarily high complexity, even ignoring resource constraints.
  • The work motivates “proof-carrying” approaches that can provide instance-level correctness guarantees rather than relying solely on finite, fixed verifiers.
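
A minimal formal rendering of the theorem as summarized above, in notation of my own choosing (V for the verifier, Cert_V for the set of instances it certifies, Comp for the set of truly policy-compliant instances, K for Kolmogorov complexity, c_V for the threshold); the paper's exact formalization may differ:

```latex
% Sketch only; V, Cert_V, Comp, and c_V are illustrative symbols, not the
% paper's. Soundness means Cert_V is a subset of Comp.
\[
  \forall V \ \text{sound, c.e.}\;\;
  \exists\, c_V \in \mathbb{N}\;\;
  \forall x :\;
  \bigl( x \in \mathrm{Comp} \,\wedge\, K(x) > c_V \bigr)
  \;\Longrightarrow\;
  x \notin \mathrm{Cert}_V .
\]
```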

Abstract

Ensuring that artificial intelligence (AI) systems satisfy formal safety and policy constraints is a central challenge in safety-critical domains. While limitations of verification are often attributed to combinatorial complexity and model expressiveness, we show that they arise from intrinsic information-theoretic limits. We formalize policy compliance as a verification problem over encoded system behaviors and analyze it using Kolmogorov complexity. We prove an incompleteness result: for any fixed sound computably enumerable verifier, there exists a threshold such that true policy-compliant instances whose complexity exceeds it cannot be certified. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity. This reveals a fundamental limitation of AI safety verification independent of computational resources, and motivates proof-carrying approaches that provide instance-level correctness guarantees.
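
The proof presumably follows the classic Chaitin-style diagonalization, though the abstract does not say so; the toy Python sketch below shows only the shape of that argument, and everything in it (toy_verifier, the certificate format, first_certified_above) is my own illustration, not the paper's construction. The idea: model a sound c.e. verifier as an enumerator of certificates "the complexity of x exceeds c", and observe that searching that enumeration for a high-complexity x is itself a short description of x.

```python
# Toy sketch (not the paper's construction) of the Chaitin-style
# diagonalization behind such incompleteness results. A sound, computably
# enumerable verifier is modeled as a generator of certificates (x, c)
# asserting "the Kolmogorov complexity of x exceeds c".
from collections.abc import Iterator


def toy_verifier() -> Iterator[tuple[str, int]]:
    """Stand-in for a sound c.e. verifier: lazily enumerates the
    complexity lower bounds it can certify. A real verifier would
    enumerate proofs and need never halt; this toy stream is finite."""
    yield ("0101", 2)
    yield ("110010", 3)


def first_certified_above(threshold: int) -> str | None:
    """The diagonal program: return the first x the verifier certifies
    as having complexity > threshold. This search is itself a description
    of x of length roughly log2(threshold) + O(1) bits, so if the verifier
    is sound, no certificate with c above some constant c_V tied to the
    verifier's own description length can ever appear -- which is exactly
    the claimed incompleteness threshold."""
    for x, c in toy_verifier():
        if c > threshold:
            return x
    return None  # a sound verifier stays silent above its threshold


if __name__ == "__main__":
    print(first_certified_above(2))  # -> "110010" in this toy stream
```

The contradiction driving the argument is that the searcher's description length grows only logarithmically in the threshold, while any instance it returns would need complexity above the threshold.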