The Intelligent Disobedience Game: Formulating Disobedience in Stackelberg Games and Markov Decision Processes
arXiv cs.AI / 3/24/2026
Key Points
- The paper addresses “intelligent disobedience” in shared autonomy, where an assistive AI may need to override a human instruction to prevent harm.
- It proposes the Intelligent Disobedience Game (IDG), a sequential Stackelberg-style framework in which the human leads under asymmetric information; the paper derives optimal strategies for both agents (a minimal best-response sketch follows this list).
- The analysis identifies key strategic phenomena such as “safety traps,” where the system avoids harm indefinitely but may fail to accomplish the human’s intended goal.
- The work translates the IDG into a shared-control Multi-Agent Markov Decision Process, yielding a compact computational testbed for training reinforcement learning agents to learn safe non-compliance (see the second sketch below).
- The authors position IDG as both a theoretical foundation for agent development and an experimental foundation to study how humans perceive and trust disobedient AI.
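To make the leader-follower structure concrete, here is a minimal best-response sketch in Python. The hazard states, payoff numbers, and the small compliance bonus are illustrative assumptions, not the paper's actual IDG payoffs; the point is only that when the follower privately observes a hazard the human cannot see, disobedience can be the payoff-maximizing response even for a fully aligned assistant.

```python
# Minimal sketch of a one-shot leader-follower interaction under
# asymmetric information. All names and payoffs below are assumptions
# for illustration, not the paper's formulation.

HAZARD_STATES = ["safe", "hazardous"]
HAZARD_PRIOR = {"safe": 0.8, "hazardous": 0.2}  # assumed common prior
FOLLOWER_ACTIONS = ["comply", "disobey"]

def human_utility(state: str, action: str) -> float:
    """Leader payoff (assumed): task progress if safe, harm if hazardous."""
    if action == "comply":
        return -10.0 if state == "hazardous" else 1.0
    return 0.0  # disobedience stalls the task but avoids harm

def follower_utility(state: str, action: str) -> float:
    """Follower payoff (assumed): aligned with the human, plus a small
    compliance bonus so it does not disobey gratuitously."""
    return human_utility(state, action) + (0.1 if action == "comply" else 0.0)

def best_response(state: str) -> str:
    """The follower privately observes the true state and best-responds;
    this information asymmetry is what makes disobedience rational."""
    return max(FOLLOWER_ACTIONS, key=lambda a: follower_utility(state, a))

def leader_expected_value() -> float:
    """The leader commits first and evaluates the instruction while
    anticipating the follower's state-contingent best response."""
    return sum(HAZARD_PRIOR[s] * human_utility(s, best_response(s))
               for s in HAZARD_STATES)

if __name__ == "__main__":
    for s in HAZARD_STATES:
        print(f"state={s:9s} -> follower best response: {best_response(s)}")
    print(f"leader expected value: {leader_expected_value():.2f}")
```

Running this prints `comply` in the safe state, `disobey` in the hazardous one, and a leader expected value of 0.80: the human is better off instructing an assistant it knows will selectively disobey.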
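The shared-control MDP translation can likewise be sketched as a toy testbed. The corridor layout, reward numbers, fixed "always step toward the goal" human model, and the use of tabular Q-learning are all simplifying assumptions standing in for whatever environment and RL method the paper actually uses.

```python
# Toy shared-control MDP in the spirit of the paper's RL testbed.
# Environment, rewards, and the human model are illustrative assumptions.

import random

N_CELLS = 6               # 1-D corridor: cells 0..5
GOAL, PIT = 5, 3          # goal at the right end; a hazard the human ignores
ACTIONS = ["comply", "override"]  # execute the human's command, or block it

def step(pos: int, action: str):
    """Agent either executes the human's +1 command or overrides (stays put).
    Entering the pit causes harm 70% of the time; otherwise it is crossed."""
    if action == "override":
        return pos, -0.1, False        # blocking costs time but avoids risk
    nxt = min(pos + 1, N_CELLS - 1)    # assumed human command: always +1
    if nxt == PIT and random.random() < 0.7:
        return nxt, -10.0, True        # harm: episode ends badly
    if nxt == GOAL:
        return nxt, +10.0, True        # task success
    return nxt, -0.1, False            # small per-step time cost

# Tabular Q-learning over (position, agent action).
Q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}
alpha, gamma, eps = 0.1, 0.95, 0.1

for episode in range(20000):
    pos, done, t = 0, False, 0
    while not done and t < 50:         # cap episodes so endless overriding halts
        a = (random.choice(ACTIONS) if random.random() < eps
             else max(ACTIONS, key=lambda x: Q[(pos, x)]))
        nxt, r, done = step(pos, a)
        target = r + (0.0 if done else gamma * max(Q[(nxt, x)] for x in ACTIONS))
        Q[(pos, a)] += alpha * (target - Q[(pos, a)])
        pos, t = nxt, t + 1

for s in range(N_CELLS):
    print(f"cell {s}: learned choice = {max(ACTIONS, key=lambda x: Q[(s, x)])}")
```

With these numbers the learned policy typically overrides forever at the cell before the pit: it avoids harm indefinitely but never reaches the goal, which is exactly the "safety trap" phenomenon noted above.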