Safe Interactions via Monte Carlo Linear-Quadratic Games

arXiv cs.RO / 4/7/2026


Key Points

  • The paper proposes a game-theoretic approach to safe human-robot interaction that does not depend on predicting human behavior, treating worst-case human actions as directly conflicting with the robot’s objective.
  • By modeling the interaction as a zero-sum linear-quadratic game and solving for the Nash equilibrium, the method derives robot policies that aim to maximize both safety and performance across a range of human decisions.
  • The authors introduce MCLQ, a computationally efficient algorithm that starts from a linear-quadratic approximation and then iteratively refines the policy using Monte Carlo search to converge toward the Nash equilibrium.
  • The approach is designed to support real-time safety adjustments and lets system designers tune conservativeness, reducing overreaction to unrealistic human behaviors.
  • Simulations and a user study report gains in safety-related outcomes, computation time, and expected performance compared with prior methods.
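
The zero-sum linear-quadratic game that MCLQ starts from can be solved in closed form by a backward Riccati recursion: the robot (minimizer) and a worst-case human (maximizer) share one linear saddle-point condition at each step. The sketch below illustrates that initialization step only; all matrices, the point-mass dynamics, and the horizon are illustrative choices, not the paper's setup.

```python
# Finite-horizon zero-sum LQ game via backward Riccati recursion.
# Robot input u minimizes, worst-case human input w maximizes
#   sum_t x'Qx + u'Ru u - w'Rw w,  with x_{t+1} = A x + B u + D w.
import numpy as np

def zero_sum_lq_gains(A, B, D, Q, Ru, Rw, Qf, T):
    """Return time-varying gains for u_t = -Ku[t] x_t and w_t = -Kw[t] x_t."""
    mu = B.shape[1]
    P = Qf.copy()
    Ku, Kw = [], []
    for _ in range(T):
        # Saddle-point first-order conditions as one linear system:
        # [Ru + B'PB    B'PD     ] [u]     [B'PA]
        # [D'PB         D'PD - Rw] [w] = - [D'PA] x
        M = np.block([[Ru + B.T @ P @ B, B.T @ P @ D],
                      [D.T @ P @ B, D.T @ P @ D - Rw]])
        # A saddle point exists only while Rw - D'PD stays positive definite
        assert np.all(np.linalg.eigvalsh(Rw - D.T @ P @ D) > 0)
        c = np.vstack([B.T @ P @ A, D.T @ P @ A])
        K = np.linalg.solve(M, c)          # stacked [Ku; Kw]
        Ku.append(K[:mu]); Kw.append(K[mu:])
        P = Q + A.T @ P @ A - c.T @ K      # value-function update
        P = 0.5 * (P + P.T)                # enforce symmetry numerically
    return Ku[::-1], Kw[::-1], P

# Illustrative 1-D point mass: state [position, velocity],
# robot force vs. adversarial human force on the same mass.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])                # robot input channel
D = np.array([[0.0], [dt]])                # human input channel
Q, Qf = np.eye(2), np.eye(2)
Ru = np.array([[1.0]])
Rw = np.array([[5.0]])                     # larger Rw = less feared human
Ku, Kw, P0 = zero_sum_lq_gains(A, B, D, Q, Ru, Rw, Qf, T=20)
```

Note how `Rw` plays the tunable-conservativeness role described above: shrinking it makes the modeled human more adversarial and the robot more cautious, while growing it recovers behavior closer to ordinary LQR.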

Abstract

Safety is critical during human-robot interaction. But -- because people are inherently unpredictable -- it is often difficult for robots to plan safe behaviors. Instead of relying on our ability to anticipate humans, here we identify robot policies that are robust to unexpected human decisions. We achieve this by formulating human-robot interaction as a zero-sum game, where (in the worst case) the human's actions directly conflict with the robot's objective. Solving for the Nash Equilibrium of this game provides robot policies that maximize safety and performance across a wide range of human actions. Existing approaches attempt to find these optimal policies by leveraging Hamilton-Jacobi analysis (which is intractable) or linear-quadratic approximations (which are inexact). By contrast, in this work we propose a computationally efficient and theoretically justified method that converges towards the Nash Equilibrium policy. Our approach (which we call MCLQ) leverages linear-quadratic games to obtain an initial guess at safe robot behavior, and then iteratively refines that guess with a Monte Carlo search. Not only does MCLQ provide real-time safety adjustments, but it also enables the designer to tune how conservative the robot is -- preventing the system from focusing on unrealistic human behaviors. Our simulations and user study suggest that this approach advances safety in terms of both computation time and expected performance. See videos of our experiments here: https://youtu.be/KJuHeiWVuWY.
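
The abstract's second ingredient, refining the LQ guess with Monte Carlo search, can be illustrated with a toy local search: perturb a candidate robot gain, score each candidate by its worst-case cost over a fixed sample of human action sequences, and keep only improvements. This is a simplified stand-in for the paper's MCLQ, not its implementation; the dynamics, initial gain, and sampling scheme are all assumed for illustration.

```python
# Toy Monte Carlo refinement of a linear robot policy u = -K x against
# sampled human action sequences (a stand-in for MCLQ's search step).
import numpy as np

rng = np.random.default_rng(0)
dt, T = 0.1, 20
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])                # robot input channel
D = np.array([[0.0], [dt]])                # human input channel
Q, Ru = np.eye(2), 1.0
x0 = np.array([1.0, 0.0])

# Monte Carlo sample of human behaviors: 50 bounded random force sequences
human_samples = rng.uniform(-1.0, 1.0, size=(50, T, 1))

def worst_case_cost(K):
    """Worst rollout cost of policy u = -K x over the sampled humans."""
    worst = -np.inf
    for w_seq in human_samples:
        x, cost = x0.copy(), 0.0
        for t in range(T):
            u = -K @ x
            cost += x @ Q @ x + Ru * float(u @ u)
            x = A @ x + (B @ u).ravel() + (D @ w_seq[t]).ravel()
        worst = max(worst, cost + x @ Q @ x)   # add terminal cost
    return worst

K = np.array([[1.0, 1.5]])                 # stand-in for an LQ-derived gain
best = worst_case_cost(K)
for _ in range(200):                        # local Monte Carlo refinement
    cand = K + 0.05 * rng.standard_normal(K.shape)
    c = worst_case_cost(cand)
    if c < best:                            # keep only improving gains
        K, best = cand, c
```

Because candidates are scored only on sampled, plausible human behaviors rather than the full adversarial set, this kind of search naturally avoids over-reacting to unrealistic humans, the tunable-conservativeness property the abstract highlights.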