The Myhill-Nerode Theorem for Bounded Interaction: Canonical Abstractions via Agent-Bounded Indistinguishability

arXiv cs.AI / 3/24/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

Key Points

  • The paper proposes a bounded-interaction analogue of the Myhill–Nerode theorem, showing that any capacity-limited observer induces a canonical quotient on its environment by identifying observation histories that no bounded agent can distinguish.
  • For finite POMDPs, it constructs a closed-loop Wasserstein pseudometric using a fixed family of finite-state controllers and derives a probe-exact quotient that is canonical, minimal, and unique within that notion of bounded indistinguishability.
  • It introduces a “clock-aware” probe setting where the resulting quotient is decision-sufficient for objectives depending only on the agent’s observations and actions, and provides an approximation bound for settings with latent-state rewards via an observation-Lipschitz condition.
  • The main theoretical object is paired with experiments that explore a scalable deterministic-stationary coarsening, validated on classic benchmarks such as Tiger and GridWorld and further diagnostic case studies including RockSample to study approximation runtime and behavior.

Abstract

Any capacity-limited observer induces a canonical quotient on its environment: two situations that no bounded agent can distinguish are, for that agent, the same. We formalise this for finite POMDPs. A fixed probe family of finite-state controllers induces a closed-loop Wasserstein pseudometric on observation histories and a probe-exact quotient merging histories that no controller in the family can distinguish. The quotient is canonical, minimal, and unique-a bounded-interaction analogue of the Myhill-Nerode theorem. For clock-aware probes, it is exactly decision-sufficient for objectives that depend only on the agent's observations and actions; for latent-state rewards, we use an observation-Lipschitz approximation bound. The main theorem object is the clock-aware quotient; scalable deterministic-stationary experiments study a tractable coarsening with gap measured on small exact cases and explored empirically at larger scale. We validate theorem-level claims on Tiger and GridWorld. We also report operational case studies on Tiger, GridWorld, and RockSample as exploratory diagnostics of approximation behavior and runtime, not as theorem-facing evidence when no exact cross-family certificate is available; heavier stress tests are archived in the appendix and artifact package.