ATANT: An Evaluation Framework for AI Continuity

arXiv cs.AI / 4/10/2026

Key Points

  • The paper introduces ATANT, an open, system-agnostic evaluation framework that measures “AI continuity” (the persistence, updating, disambiguation, and reconstruction of meaningful context over time) rather than merely checking for the presence of memory components such as RAG pipelines or long context windows.
  • The paper defines continuity through seven required properties and pairs that definition with a 10-checkpoint evaluation methodology that operates without an LLM in the evaluation loop, to avoid evaluation-time bias.
  • ATANT provides a narrative test corpus of 250 stories spanning 6 life domains, with 1,835 verification questions, enabling repeatable benchmarking across scenarios.
  • A reference implementation is evaluated over 5 test suite iterations, improving from 58% with a legacy architecture to 100% in isolated mode (250 stories) and 100% in 50-story cumulative mode. At the 250-story cumulative scale it reaches 96%, where cross-contamination is a key failure mode.
  • The framework, example stories, and protocol are published on GitHub, with the full 250-story corpus planned for incremental release.
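To make the idea of an evaluation loop without an LLM concrete, here is a minimal sketch of a deterministic checker. It is not ATANT's actual protocol: the `VerificationQuestion` type, the keyword-matching rule, and the function names are all assumptions chosen for illustration. The only point it carries over from the summary is that an answer can be graded by fixed rules rather than by another model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationQuestion:
    """Hypothetical shape of one verification question (not from the paper)."""
    story_id: str
    question: str
    expected_keywords: tuple[str, ...]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so matching is deterministic."""
    return " ".join(text.lower().split())


def passes(answer: str, q: VerificationQuestion) -> bool:
    """An answer passes if it contains every expected keyword (assumed rule)."""
    norm = normalize(answer)
    return all(normalize(k) in norm for k in q.expected_keywords)


def pass_rate(answers: dict[str, str], questions: list[VerificationQuestion]) -> float:
    """Fraction of questions whose answer passes; a missing answer counts as a failure."""
    if not questions:
        return 0.0
    passed = sum(passes(answers.get(q.question, ""), q) for q in questions)
    return passed / len(questions)


# Illustrative data only; these are not stories from the ATANT corpus.
questions = [
    VerificationQuestion("s1", "Where does Mara work now?", ("harbor clinic",)),
    VerificationQuestion("s1", "What is her dog called?", ("juno",)),
]
answers = {
    "Where does Mara work now?": "She moved to the Harbor  Clinic last spring.",
    "What is her dog called?": "I don't have that information.",
}
rate = pass_rate(answers, questions)
```

Because the grader is a pure function of the answer text, the same system output always receives the same score, which is the property that removing the LLM from the loop is meant to secure.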

Abstract

We present ATANT (Automated Test for Acceptance of Narrative Truth), an open evaluation framework for measuring continuity in AI systems: the ability to persist, update, disambiguate, and reconstruct meaningful context across time. While the AI industry has produced memory components (RAG pipelines, vector databases, long context windows, profile layers), no published framework formally defines or measures whether these components produce genuine continuity. We define continuity as a system property with 7 required properties, introduce a 10-checkpoint evaluation methodology that operates without an LLM in the evaluation loop, and present a narrative test corpus of 250 stories comprising 1,835 verification questions across 6 life domains. We evaluate a reference implementation across 5 test suite iterations, progressing from 58% (legacy architecture) to 100% in isolated mode (250 stories) and 100% in 50-story cumulative mode, with 96% at 250-story cumulative scale. The cumulative result is the primary measure: when 250 distinct life narratives coexist in the same database, the system must retrieve the correct fact for the correct context without cross-contamination. ATANT is system-agnostic, model-independent, and designed as a sequenced methodology for building and validating continuity systems. The framework specification, example stories, and evaluation protocol are available at https://github.com/Kenotic-Labs/ATANT. The full 250-story corpus will be released incrementally.
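The difference between isolated and cumulative mode can be illustrated with a toy sketch. Everything here is hypothetical: `NaiveStore`, `ScopedStore`, the story format, and the scoring functions are stand-ins, not the reference implementation or the ATANT corpus. The sketch only shows why a store that looks perfect when each story is tested alone can fail once several narratives share one database.

```python
class NaiveStore:
    """Toy memory keyed only by fact name, so later stories overwrite earlier ones."""

    def __init__(self):
        self.facts = {}

    def ingest(self, story_id, facts):
        self.facts.update(facts)

    def query(self, story_id, key):
        return self.facts.get(key)


class ScopedStore:
    """Toy memory keyed by (story, fact name), so narratives stay separate."""

    def __init__(self):
        self.facts = {}

    def ingest(self, story_id, facts):
        for key, value in facts.items():
            self.facts[(story_id, key)] = value

    def query(self, story_id, key):
        return self.facts.get((story_id, key))


def _score(store, stories):
    """Fraction of facts the store returns correctly for the given stories."""
    checks = [store.query(s["id"], k) == v for s in stories for k, v in s["facts"].items()]
    return sum(checks) / len(checks)


def score_isolated(store_factory, stories):
    """Each story gets a fresh store, so no other narrative can interfere."""
    per_story = []
    for story in stories:
        store = store_factory()
        store.ingest(story["id"], story["facts"])
        per_story.append(_score(store, [story]))
    return sum(per_story) / len(per_story)


def score_cumulative(store_factory, stories):
    """All stories share one store before any question is asked."""
    store = store_factory()
    for story in stories:
        store.ingest(story["id"], story["facts"])
    return _score(store, stories)


# Illustrative stories only.
stories = [
    {"id": "s1", "facts": {"pet": "a beagle named Juno", "city": "Lisbon"}},
    {"id": "s2", "facts": {"pet": "a tabby cat", "city": "Osaka"}},
]

naive_isolated = score_isolated(NaiveStore, stories)      # 1.0
naive_cumulative = score_cumulative(NaiveStore, stories)  # 0.5
scoped_cumulative = score_cumulative(ScopedStore, stories)  # 1.0
```

In this toy setup the naive store answers every isolated question correctly but returns the second story's facts for the first story once both are loaded, which is the kind of cross-contamination the cumulative score is designed to expose.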