Abstract Sim2Real through Approximate Information States

arXiv cs.RO · April 17, 2026


Key Points

  • The paper formalizes an “abstract sim2real” setting where an abstract simulator omits important task details, yet an RL policy should still be transferable to the real world.
  • It reframes the problem using RL state abstraction, showing that an abstract simulator can be grounded to the target task when the grounded abstract dynamics depend on state history.
  • The authors propose a method that leverages real-world task data to correct and align the dynamics of the abstract simulator.
  • Experiments indicate the approach supports successful policy transfer in both sim-to-sim and sim-to-real evaluations.
  • The work is motivated by the practical difficulty of achieving high-fidelity simulators as robotics deployments expand into more complex, real-world domains.

Abstract

In recent years, reinforcement learning (RL) has shown remarkable success in robotics when a fast and accurate simulator is available for a given task. When using RL and simulation, greater simulator realism is generally beneficial but becomes harder to obtain as robots are deployed in increasingly complex and wide-scale domains. In such settings, simulators will likely fail to model all relevant details of a given target task, and this observation motivates the study of sim2real with simulators that leave out key task details. In this paper, we formalize and study the abstract sim2real problem: given an abstract simulator that models a target task at a coarse level of abstraction, how can we train a policy with RL in the abstract simulator and successfully transfer it to the real world? Our first contribution is to formalize this problem using the language of state abstraction from the RL literature. This framing shows that an abstract simulator can be grounded to match the target task if the grounded abstract dynamics take the history of states into account. Based on this formalism, we then introduce a method that uses real-world task data to correct the dynamics of the abstract simulator. Finally, we show that this method enables successful policy transfer in both sim2sim and sim2real evaluations.
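To make the core idea concrete, here is a minimal toy sketch of grounding an abstract simulator with a history-dependent correction learned from real-world data. Everything below is a hypothetical illustration, not the paper's actual method or environments: the "real" dynamics, the drag term, and the linear residual model are all invented for this example; the paper's approach is more general.

```python
import numpy as np

rng = np.random.default_rng(0)
friction = 0.2  # hypothetical parameter the abstract simulator omits

def abstract_sim_step(s, a):
    # Coarse abstract dynamics: leaves out the drag term below.
    return s + a

def real_step(s, a, s_prev):
    # Hypothetical "real world": a history-dependent drag the abstract
    # simulator does not model (assumption for illustration only).
    return s + a - friction * (s - s_prev)

def rollout(n=30):
    # Collect one real-world trajectory of (state, action) pairs.
    s_prev = s = 0.0
    traj = []
    for _ in range(n):
        a = rng.uniform(-1.0, 1.0)
        traj.append((s, a))
        s, s_prev = real_step(s, a, s_prev), s
    traj.append((s, 0.0))  # terminal state with a dummy action
    return traj

trajs = [rollout() for _ in range(5)]

# Fit a linear residual model: correction ~ w . [s_t, s_{t-1}, a_t].
# Because the correction conditions on the previous state, the grounded
# abstract dynamics depend on state history, as the framing requires.
X, y = [], []
for traj in trajs:
    for t in range(1, len(traj) - 1):
        (s, a), s_prev, s_next = traj[t], traj[t - 1][0], traj[t + 1][0]
        X.append([s, s_prev, a])
        y.append(s_next - abstract_sim_step(s, a))  # real-vs-abstract residual
w, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)

def grounded_step(s, a, s_prev):
    # Grounded abstract dynamics: abstract prediction + learned correction.
    return abstract_sim_step(s, a) + w @ np.array([s, s_prev, a])

# In this noiseless toy, the correction recovers the omitted drag exactly,
# i.e. w is approximately [-friction, friction, 0].
print(np.round(w, 3))
```

An RL policy could then be trained against `grounded_step` instead of `abstract_sim_step`, so that transitions seen in training match the real task.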