Context-Sensitive Abstractions for Reinforcement Learning with Parameterized Actions

arXiv cs.AI / April 27, 2026


Key Points

  • The paper addresses reinforcement learning (RL) for long-horizon, sparse-reward problems with parameterized action spaces that combine discrete choices and continuous parameters.
  • It argues that existing planning methods and standard RL algorithms are poorly suited to this mixed action setting, and that prior parameterized-action RL approaches often require domain-specific engineering and underuse the structure of these spaces.
  • The authors propose RL algorithms that learn state and action abstractions online, progressively refining them to add more detail only in the important regions of the state-action space (a minimal sketch of this refinement idea follows this list).
  • Experiments on multiple continuous-state, parameterized-action domains show that the abstraction-driven method markedly improves sample efficiency, with TD(λ) outperforming strong baselines.
  • Overall, the work extends RL to better exploit latent structure in parameterized-action environments without heavy manual modeling.

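To make the refinement idea concrete, here is a minimal Python sketch of one way to represent a parameterized action (a discrete choice plus a continuous parameter range) and to split an abstraction cell where value estimates are struggling. The data structures, the TD-error-based splitting rule, and all names here are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch: a parameterized action and a progressively refined
# abstraction over its continuous parameter range. Names are illustrative.
from dataclasses import dataclass

@dataclass
class ParameterizedAction:
    """A discrete action choice plus bounds on its continuous parameter."""
    name: str    # discrete action, e.g. "kick"
    low: float   # lower bound of the continuous parameter
    high: float  # upper bound of the continuous parameter

@dataclass
class AbstractCell:
    """One cell of the abstraction: an interval of the parameter range."""
    low: float
    high: float
    value: float = 0.0         # learned value estimate for this cell
    td_error_sum: float = 0.0  # accumulated TD error observed in this cell
    visits: int = 0

def refine(cells, threshold):
    """Split cells whose average TD error is large, adding resolution only
    where coarse estimates appear to hurt performance (an assumed criterion)."""
    refined = []
    for c in cells:
        avg_err = c.td_error_sum / max(c.visits, 1)
        if abs(avg_err) > threshold and c.visits > 10:
            mid = 0.5 * (c.low + c.high)
            # Children inherit the parent's value as an initial estimate.
            refined.append(AbstractCell(c.low, mid, c.value))
            refined.append(AbstractCell(mid, c.high, c.value))
        else:
            refined.append(c)
    return refined

# Usage: start with one coarse cell spanning the parameter range of "kick",
# then call refine() periodically after collecting TD errors.
kick = ParameterizedAction("kick", low=0.0, high=1.0)
cells = [AbstractCell(kick.low, kick.high)]
```

Starting coarse and splitting on demand is what lets the agent spend samples on resolution only where it pays off, which is the intuition behind the sample-efficiency gains the paper reports.
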
Abstract

Real-world sequential decision-making often involves parameterized action spaces that require both a choice among discrete actions and a choice of the continuous parameters governing how an action is executed. Existing approaches exhibit severe limitations in this setting: planning methods demand hand-crafted action models; standard reinforcement learning (RL) algorithms are designed for either discrete or continuous actions, but not both; and the few RL methods that do handle parameterized actions typically rely on domain-specific engineering and fail to exploit the latent structure of these spaces. This paper extends the scope of RL algorithms to long-horizon, sparse-reward settings with parameterized actions by enabling agents to autonomously learn both state and action abstractions online. We introduce algorithms that progressively refine these abstractions during learning, adding fine-grained detail in the critical regions of the state-action space where greater resolution improves performance. Across several continuous-state, parameterized-action domains, our abstraction-driven approach enables TD(λ) to achieve markedly higher sample efficiency than state-of-the-art baselines.
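
For reference, the TD(λ) learner the abstract mentions can be sketched as standard tabular policy evaluation with accumulating eligibility traces, run over abstract states. The mapping `phi` from raw observations to abstract-state indices stands in for the learned abstraction, and the environment interface (`reset`, `step`, `sample_action`) is an assumption for illustration, not the paper's API.

```python
# Minimal tabular TD(lambda) with accumulating eligibility traces over
# abstract states. `phi` and the env interface are assumed for illustration.
from collections import defaultdict

def td_lambda(env, phi, episodes=100, alpha=0.1, gamma=0.99, lam=0.9):
    V = defaultdict(float)           # value estimate per abstract state
    for _ in range(episodes):
        traces = defaultdict(float)  # eligibility trace per abstract state
        obs = env.reset()
        done = False
        while not done:
            s = phi(obs)                             # current abstract state
            obs, reward, done = env.step(env.sample_action())
            s_next = phi(obs)                        # successor abstract state
            delta = reward + (0.0 if done else gamma * V[s_next]) - V[s]
            traces[s] += 1.0                         # accumulating trace
            for k in list(traces):
                V[k] += alpha * delta * traces[k]    # credit recent states
                traces[k] *= gamma * lam             # decay all traces
    return V
```

Because values are stored per abstract state rather than per raw observation, a coarser abstraction means fewer entries to fit, which is why pairing TD(λ) with a well-chosen abstraction can improve sample efficiency.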