Predictive Representations for Skill Transfer in Reinforcement Learning

arXiv cs.LG / 4/9/2026

Key Points

  • The paper addresses a core scaling problem in reinforcement learning: enabling agents to generalize learned behaviors across tasks rather than relearning from scratch.
  • It proposes Outcome-Predictive State Representations (OPSRs), a task-independent form of state abstraction built from predictions of environment outcomes.
  • The authors show that OPSRs enable optimal but limited transfer, establishing a formal and empirical trade-off between transfer quality and scope.
  • To overcome that limitation, they introduce OPSR-based skills (options-style abstract actions) that can be reused across tasks thanks to the state abstraction.
  • Experiments indicate that skills learned from demonstrations can substantially speed up learning on entirely new, unseen tasks without any additional pre-processing.
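The abstraction idea in the second bullet can be made concrete with a toy sketch. Assuming, hypothetically, that an OPSR maps each state to its vector of predicted outcome probabilities, states with identical predictions collapse into a single task-independent abstract state. The function names and the toy outcome model below are illustrative, not the paper's actual formalism:

```python
# Hypothetical sketch of an Outcome-Predictive State Representation (OPSR):
# each concrete state is abstracted to its vector of predicted outcome
# probabilities, so states with identical outcome predictions collapse
# into one abstract state.
from collections import defaultdict

def opsr(state, outcome_model, outcomes):
    """Map a concrete state to a tuple of predicted outcome probabilities."""
    return tuple(round(outcome_model(state, o), 3) for o in outcomes)

# Toy outcome model: probability that moving "up" from a grid cell produces
# each observable outcome ("bump" into a wall vs. "clear").
walls = {(0, 1), (2, 1)}
def toy_model(state, outcome):
    bumped = (state[0], state[1] + 1) in walls
    return float(bumped) if outcome == "bump" else float(not bumped)

outcomes = ["bump", "clear"]
abstract = defaultdict(list)
for s in [(0, 0), (1, 0), (2, 0)]:
    abstract[opsr(s, toy_model, outcomes)].append(s)

# States (0, 0) and (2, 0) share the prediction (1.0, 0.0), so they map to
# the same abstract state even though they are distinct grid cells.
for rep, members in abstract.items():
    print(rep, members)
```

Because the representation depends only on predicted outcomes, not on task-specific rewards, the same abstraction can be shared across tasks.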

Abstract

A key challenge in scaling up Reinforcement Learning is generalizing learned behaviour. Without the ability to carry forward acquired knowledge, an agent is doomed to learn each task from scratch. In this paper we develop a new formalism for transfer via state abstraction. Based on task-independent, compact observations (outcomes) of the environment, we introduce Outcome-Predictive State Representations (OPSRs): agent-centered, task-independent abstractions composed of predictions of outcomes. We show formally and empirically that they have the potential for optimal but limited transfer, then overcome this trade-off by introducing OPSR-based skills, i.e., abstract actions (based on options) that can be reused between tasks as a result of state abstraction. In a series of empirical studies, we learn OPSR-based skills from demonstrations and show that they speed up learning considerably in entirely new and unseen tasks without any pre-processing. We believe the framework introduced in this work is a promising step towards transfer in RL in general, and towards transfer through combining state and action abstraction in particular.
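To illustrate why state abstraction makes options-style skills reusable, here is a minimal, hypothetical sketch: a skill whose initiation set, policy, and termination set are all defined over abstract OPSR-like states, so any task whose states map to the same abstractions can execute it unchanged. The `Skill` class and the toy one-dimensional world are assumptions for illustration, not the paper's construction:

```python
# Options-style skill defined over abstract (OPSR-like) states rather than
# raw observations: the skill carries an initiation set, an internal policy,
# and a termination set, following the standard options framework.
from dataclasses import dataclass
from typing import Callable, Set, Tuple

AbstractState = Tuple[float, ...]

@dataclass
class Skill:
    initiation: Set[AbstractState]          # abstract states where it may start
    policy: Callable[[AbstractState], str]  # abstract state -> primitive action
    termination: Set[AbstractState]         # abstract states where it ends

    def run(self, abstract_state, step, max_steps=50):
        """Execute the skill until a termination state or the step budget."""
        assert abstract_state in self.initiation
        trajectory = [abstract_state]
        for _ in range(max_steps):
            if abstract_state in self.termination:
                break
            abstract_state = step(abstract_state, self.policy(abstract_state))
            trajectory.append(abstract_state)
        return trajectory

# Toy world abstracted to (predicted distance-to-goal,): the same skill
# applies in any task whose states produce these abstract representations.
reach_goal = Skill(
    initiation={(3.0,), (2.0,), (1.0,)},
    policy=lambda s: "forward",
    termination={(0.0,)},
)
step = lambda s, a: (s[0] - 1.0,) if a == "forward" else s
print(reach_goal.run((3.0,), step))  # [(3.0,), (2.0,), (1.0,), (0.0,)]
```

Because nothing in `Skill` refers to a particular task's raw state space or reward, transferring the skill to a new task only requires that the new states map into the same abstract space.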