Did You Forget What I Asked? Prospective Memory Failures in Large Language Models
arXiv cs.LG / 3/26/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper investigates why large language models often fail to follow formatting instructions while simultaneously performing demanding tasks, framing the failure as a prospective-memory problem borrowed from cognitive psychology.
- Across three model families and 8,000+ prompts, formatting compliance drops by 2–21% under concurrent task load, indicating a measurable interference effect.
- The vulnerability is strongly constraint-type dependent: terminal (response-boundary) constraints suffer the most, with compliance reductions up to 50%, while avoidance constraints degrade less.
- Adding salience-enhancing formatting (explicit instruction framing plus a trailing reminder) substantially recovers compliance, bringing many settings back to 90–100%; a prompt sketch follows this list.
- Using programmatic checkers on public datasets, the study also finds bidirectional interference: formatting constraints can reduce task accuracy (e.g., GSM8K accuracy drops from 93% to 27%), and compliance worsens as more constraints are stacked; see the checker sketch below.
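A minimal sketch of the salience-enhancing mitigation, assuming a plain string wrapper around the prompt. The marker words and reminder phrasing here are illustrative assumptions, not the paper's exact templates:

```python
# Sketch: frame the constraint explicitly up front and restate it in a
# trailing reminder, so it stays salient under concurrent task load.
# (Wrapper wording is hypothetical, not taken from the paper.)

def add_salience(task: str, constraint: str) -> str:
    """Return a prompt with the formatting constraint framed explicitly
    before the task and repeated as a trailing reminder."""
    return (
        "INSTRUCTION (applies to the final response):\n"
        f"{constraint}\n\n"
        "TASK:\n"
        f"{task}\n\n"
        f"REMINDER: before answering, re-check the instruction above: {constraint}"
    )

prompt = add_salience(
    task="A train travels 120 km in 1.5 hours. What is its average speed?",
    constraint="End your response with the exact line 'DONE'.",
)
print(prompt)
```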
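And a minimal sketch of programmatic compliance checkers for the two constraint types contrasted above. The function names and example constraints are assumptions for illustration, not the paper's actual checkers:

```python
# Sketch: check a terminal (response-boundary) constraint and an avoidance
# constraint, then compute a compliance rate over a batch of responses.
import re

def check_terminal(response: str, required_suffix: str) -> bool:
    """Terminal constraint: the response must end with an exact string,
    e.g. a closing tag or sign-off line."""
    return response.rstrip().endswith(required_suffix)

def check_avoidance(response: str, banned_word: str) -> bool:
    """Avoidance constraint: the response must not contain a banned word,
    matched case-insensitively on word boundaries."""
    return re.search(rf"\b{re.escape(banned_word)}\b", response, re.IGNORECASE) is None

# Compliance rate over a small batch of hypothetical responses:
responses = ["The average speed is 80 km/h.\nDONE", "The average speed is 80 km/h."]
rate = sum(check_terminal(r, "DONE") for r in responses) / len(responses)
print(f"terminal-constraint compliance: {rate:.0%}")  # 50%
```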