EchoChain: A Full-Duplex Benchmark for State-Update Reasoning Under Interruptions
arXiv cs.CL / 4/21/2026
Key Points
- The paper introduces EchoChain, a new benchmark designed to evaluate how real-time voice assistants update task state when users interrupt them mid-response.
- EchoChain isolates recurring failure patterns in post-interruption continuations, including contextual inertia, interruption amnesia, and objective displacement.
- It generates scenario-based conversations and inserts interruptions at a standardized time relative to when the assistant begins speaking, allowing consistent cross-model comparisons.
- None of the evaluated real-time voice models exceeds a 50% pass rate, indicating significant weaknesses in mid-generation state revision.
- In a paired half-duplex control, total failures drop by 40.2% relative to the interrupted runs, suggesting that many errors stem specifically from interruption-driven state-update reasoning rather than from overall task difficulty.
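
To make the last two points concrete, here is a minimal sketch of how a pass rate and a paired failure drop could be computed. It assumes the reported drop is a relative reduction in failure count between the interrupted runs and the half-duplex control; the summary does not spell this out. The names `RunResult`, `pass_rate`, and `relative_failure_drop` are hypothetical, and the toy results are made up for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Outcome of one scenario under one condition (hypothetical structure)."""
    scenario_id: str
    passed: bool


def pass_rate(results):
    """Fraction of runs that passed."""
    if not results:
        raise ValueError("no results to score")
    return sum(r.passed for r in results) / len(results)


def relative_failure_drop(interrupted, control):
    """Relative reduction in failure count from interrupted runs to control runs.

    Assumes both lists cover the same scenarios (a paired design).
    """
    if {r.scenario_id for r in interrupted} != {r.scenario_id for r in control}:
        raise ValueError("conditions must cover the same scenarios")
    failures_interrupted = sum(not r.passed for r in interrupted)
    failures_control = sum(not r.passed for r in control)
    if failures_interrupted == 0:
        raise ValueError("no failures in the interrupted condition")
    return (failures_interrupted - failures_control) / failures_interrupted


# Toy data, invented for illustration only.
interrupted_runs = [
    RunResult("s1", False),
    RunResult("s2", False),
    RunResult("s3", False),
    RunResult("s4", False),
    RunResult("s5", True),
]
control_runs = [
    RunResult("s1", True),
    RunResult("s2", False),
    RunResult("s3", True),
    RunResult("s4", False),
    RunResult("s5", True),
]

interrupted_pass_rate = pass_rate(interrupted_runs)
failure_drop = relative_failure_drop(interrupted_runs, control_runs)
print(f"interrupted pass rate: {interrupted_pass_rate:.1%}")
print(f"relative failure drop in control: {failure_drop:.1%}")
```

Because the same scenarios appear in both conditions, the difference in failures can be attributed to the interruption itself rather than to the difficulty of the underlying task, which is the reasoning behind the paired control.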