Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language

arXiv cs.RO / 4/1/2026

Key Points

The paper tackles a key limitation of reward learning from demonstrations: with limited data, reward models can overfit to spurious correlations, because demonstrations show behavior without indicating which aspects of the state actually matter.
It proposes Masked Inverse Reinforcement Learning (Masked IRL), which uses large language models to infer which state components are relevant based on natural-language instructions.
Masked IRL enforces invariance to irrelevant state details, aiming to improve generalization beyond what demonstrations alone can provide.
When instructions are ambiguous, the framework uses LLM reasoning to clarify them in the context of the demonstrations to better disambiguate among reward functions.
Experiments in simulation and on a real robot show Masked IRL outperforms prior language-conditioned IRL methods by up to 15% while requiring up to 4.7× less data, improving sample efficiency and robustness.
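
To make the masking idea in the points above concrete, here is a minimal sketch of a state-relevance mask and an invariance check. It is not the paper's implementation. The linear reward, the three-component state, the hard-coded mask (standing in for whatever an LLM would infer from an instruction), and every function name are assumptions chosen for illustration.

```python
import random

def masked_reward(weights, state, mask):
    """Linear reward that only sees state components where mask is 1."""
    return sum(w * s * m for w, s, m in zip(weights, state, mask))

def perturb_irrelevant(state, mask, rng, scale=1.0):
    """Copy the state, adding Gaussian noise only to masked-out components."""
    return [s if m else s + rng.gauss(0.0, scale) for s, m in zip(state, mask)]

def invariance_penalty(reward_fn, states, mask, rng, n_samples=8):
    """Mean squared reward change when only irrelevant components are perturbed."""
    total, count = 0.0, 0
    for state in states:
        base = reward_fn(state)
        for _ in range(n_samples):
            noisy = perturb_irrelevant(state, mask, rng)
            total += (reward_fn(noisy) - base) ** 2
            count += 1
    return total / count

# Hypothetical state layout: [distance_to_table, distance_to_human, cup_tilt].
# Hypothetical mask for an instruction such as "keep the cup away from the person".
mask = [0, 1, 0]
weights = [0.5, 2.0, -0.3]  # illustrative weights that touch every component

rng = random.Random(0)
states = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(20)]

def unmasked(state):
    # Reward that uses all components, including the irrelevant ones.
    return masked_reward(weights, state, [1, 1, 1])

def masked(state):
    # Reward restricted to the components the mask marks as relevant.
    return masked_reward(weights, state, mask)

penalty_unmasked = invariance_penalty(unmasked, states, mask, rng)
penalty_masked = invariance_penalty(masked, states, mask, rng)
print(penalty_unmasked > 0.0, penalty_masked == 0.0)
```

In this toy setting the unmasked reward changes when irrelevant components are perturbed, while the masked reward does not. A learned reward model would presumably use a penalty of this kind as a training term rather than hard-zeroing inputs, but that detail is an assumption here, not something the summary specifies.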

Abstract

Robots can adapt to user preferences by learning reward functions from demonstrations, but with limited data, reward models often overfit to spurious correlations and fail to generalize. This happens because demonstrations show robots how to do a task but not what matters for that task, causing the model to focus on irrelevant state details. Natural language can more directly specify what the robot should focus on, and, in principle, disambiguate between many reward functions consistent with the demonstrations. However, existing language-conditioned reward learning methods typically treat instructions as simple conditioning signals, without fully exploiting their potential to resolve ambiguity. Moreover, real instructions are often ambiguous themselves, so naive conditioning is unreliable. Our key insight is that these two input types carry complementary information: demonstrations show how to act, while language specifies what is important. We propose Masked Inverse Reinforcement Learning (Masked IRL), a framework that uses large language models (LLMs) to combine the strengths of both input types. Masked IRL infers state-relevance masks from language instructions and enforces invariance to irrelevant state components. When instructions are ambiguous, it uses LLM reasoning to clarify them in the context of the demonstrations. In simulation and on a real robot, Masked IRL outperforms prior language-conditioned IRL methods by up to 15% while using up to 4.7 times less data, demonstrating improved sample-efficiency, generalization, and robustness to ambiguous language. Project page: https://MIT-CLEAR-Lab.github.io/Masked-IRL and Code: https://github.com/MIT-CLEAR-Lab/Masked-IRL