Revisable by Design: A Theory of Streaming LLM Agent Execution

arXiv cs.LG / 4/28/2026

Opinion / Developer Stack & Infrastructure / Ideas & Deep Analysis / Models & Research

Key Points

  • The paper argues that most LLM agents effectively treat execution like a single transaction, forcing users to either wait for an answer or interrupt and lose all prior progress.
  • It proposes a “stream” execution paradigm where agent work and user intervention run concurrently over a bidirectional channel, enabling interleaved revisions during execution.
  • The authors introduce a reversibility taxonomy (Idempotent, Reversible, Compensable, Irreversible) and show that an agent’s ability to adapt during revisions is fundamentally limited by the reversibility properties of its actions.
  • They prove that conflicting compensable actions create unavoidable adaptation costs and that conflicting irreversible actions can prevent full satisfaction of the requested specification.
  • They present the “Revision Absorber” algorithm and evaluate it on StreamBench with real LLM agents: it matches the quality of a brute-force full-restart baseline while wasting about an order of magnitude fewer steps of already-completed work.
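
To make the taxonomy concrete, here is a minimal Python sketch of how the four classes could bound what a revision achieves. It is not the paper's formalization. The integer ordering, the function name `revision_outcome`, and the labels "free", "costly", and "partial" are assumptions made for illustration, as is treating idempotent and reversible conflicts as cost-free.

```python
from enum import IntEnum
from typing import Iterable


class Reversibility(IntEnum):
    """The four action classes named in the paper, ordered from most to
    least flexible (the numeric ordering is an assumption of this sketch)."""
    IDEMPOTENT = 0
    REVERSIBLE = 1
    COMPENSABLE = 2
    IRREVERSIBLE = 3


def revision_outcome(conflicting: Iterable[Reversibility]) -> str:
    """Classify the best achievable outcome of a revision, given the classes
    of the already-executed actions that conflict with it.

    Hypothetical labels:
      "free"    - no conflicting compensable or irreversible action
      "costly"  - a conflicting compensable action forces an adaptation cost
      "partial" - a conflicting irreversible action rules out full satisfaction
    """
    worst = max(conflicting, default=Reversibility.IDEMPOTENT)
    if worst == Reversibility.IRREVERSIBLE:
        return "partial"
    if worst == Reversibility.COMPENSABLE:
        return "costly"
    return "free"


# Example: a revision conflicts with a file edit (reversible) and a refund-able
# purchase (compensable), so the least flexible conflict decides the outcome.
print(revision_outcome([Reversibility.REVERSIBLE, Reversibility.COMPENSABLE]))
```

The least flexible conflicting action determines the result, which mirrors the claim that these costs come from the action space rather than from the algorithm.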

Abstract

Current LLM agents operate under an implicit but universal assumption: execution is a transaction -- the user submits a request, the agent works in isolation, and only upon completion does the dialogue resume. This forces users into a binary choice: wait for a potentially incorrect output, or interrupt and lose all progress. We reject this assumption and propose the stream paradigm, in which agent execution and user intervention are concurrent, interleaved processes sharing a bidirectional channel. We formalize this paradigm through a reversibility taxonomy that classifies every agent action as Idempotent, Reversible, Compensable, or Irreversible, and arrive at a core conclusion: an agent's flexibility is bounded by its reversibility. We prove that conflicting compensable actions impose unavoidable adaptation costs and that conflicting irreversible actions make full specification satisfaction impossible -- these costs are properties of the action space, not of the algorithm. Guided by this insight, we present the Revision Absorber, a reactive algorithm based on the Earliest-Conflict Rollback rule that is structurally optimal under mild assumptions. Experiments on StreamBench with real LLM agents validate all predictions: the Absorber matches the quality of a brute-force full-restart baseline while wasting an order of magnitude fewer steps of already-completed work, turning mid-execution revisions from a dead-end into a first-class interaction.
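
The abstract names the Earliest-Conflict Rollback rule without spelling it out. One plausible reading, sketched below in Python, is that the agent keeps every completed step before the first one that conflicts with the revision and redoes only the rest. The function name `earliest_conflict_rollback`, the `conflicts` predicate, and the example trace are hypothetical; the actual Revision Absorber may differ.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Step = TypeVar("Step")


def earliest_conflict_rollback(
    trace: Sequence[Step],
    conflicts: Callable[[Step], bool],
) -> Tuple[List[Step], List[Step]]:
    """Split a trace of completed steps at the earliest conflicting step.

    Returns (kept, discarded): steps before the first conflict are reused,
    the conflicting step and everything after it must be redone.
    """
    for i, step in enumerate(trace):
        if conflicts(step):
            return list(trace[:i]), list(trace[i:])
    # No conflict: the revision is absorbed without discarding any work.
    return list(trace), []


# Hypothetical trace: the user changes the destination mid-execution.
trace = ["collect traveler details", "search flights to Paris", "draft itinerary"]
kept, discarded = earliest_conflict_rollback(trace, lambda s: "Paris" in s)

print(len(discarded))  # steps wasted by rollback
print(len(trace))      # steps wasted by a full restart
```

In this toy trace the rollback discards two steps where a full restart would discard all three. The gap grows as the first conflict sits later in a longer trace.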