LLM-Guided Task- and Affordance-Level Exploration in Reinforcement Learning

arXiv cs.RO / 4/15/2026


Key Points

  • The paper proposes LLM-TALE, an RL framework that uses LLM planning to steer exploration at both the task level (which subgoal to pursue) and the affordance level (where and how to act on objects), improving sample efficiency in robotic manipulation; a minimal code sketch follows this list.
  • It addresses a key limitation of earlier LLM-guided exploration methods: LLMs may generate semantically plausible but physically infeasible plans, so LLM-TALE corrects suboptimal plans online rather than assuming they are optimal.
  • LLM-TALE explores multimodal affordance-level plans without human supervision, contrasting with approaches that rely on human-provided rewards or optimal LLM-generated plans.
  • Experiments on pick-and-place tasks in standard RL benchmarks show improved sample efficiency and higher success rates compared with strong baselines.
  • Real-robot tests suggest promising zero-shot sim-to-real transfer, and the authors provide code and supplementary materials via the project website.
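
To make the two-level guidance concrete, here is a minimal Python sketch of the general idea, not the authors' implementation. The `llm_plan` function, its returned structure, and the blending coefficient `beta` are all illustrative assumptions: a stand-in LLM call returns task-level subgoals with candidate affordance targets, and exploratory actions are nudged toward the active target instead of being pure random noise.

```python
import numpy as np

# Hypothetical sketch of two-level LLM-guided exploration (illustrative, not
# the paper's code): the LLM proposes a task-level subgoal sequence and, for
# each subgoal, candidate affordance targets such as grasp points; during a
# rollout, exploratory actions are biased toward the active target.

def llm_plan(task_description):
    """Stand-in for an LLM planning call (hypothetical interface)."""
    return [
        {"subgoal": "grasp_cube",
         "affordances": [np.array([0.40, 0.10, 0.05]),    # side grasp
                         np.array([0.40, 0.10, 0.12])]},  # top grasp
        {"subgoal": "place_on_plate",
         "affordances": [np.array([0.20, -0.30, 0.05])]},
    ]

def guided_action(policy_action, ee_pos, target_pos, beta=0.5):
    """Blend the RL policy's action with a pull toward the suggested target."""
    pull = np.clip(target_pos - ee_pos, -1.0, 1.0)  # crude proportional term
    return (1.0 - beta) * policy_action + beta * pull

# Skeleton usage: pick one affordance candidate for the current subgoal and
# bias exploration toward it; a full agent would advance the stage index
# whenever the current subgoal is achieved.
plan = llm_plan("pick up the cube and place it on the plate")
stage = 0
target = plan[stage]["affordances"][0]
action = guided_action(policy_action=np.zeros(3),
                       ee_pos=np.array([0.0, 0.0, 0.3]),
                       target_pos=target)
print(action)  # exploration biased toward the LLM-suggested grasp point
```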

Abstract

Reinforcement learning (RL) is a promising approach for robotic manipulation, but it can suffer from low sample efficiency and requires extensive exploration of large state-action spaces. Recent methods leverage the commonsense knowledge and reasoning abilities of large language models (LLMs) to guide exploration toward more meaningful states. However, LLMs can produce plans that are semantically plausible yet physically infeasible, yielding unreliable behavior. We introduce LLM-TALE, a framework that uses LLM planning to steer RL exploration directly. LLM-TALE integrates planning at both the task level and the affordance level, improving learning efficiency by directing agents toward semantically meaningful actions. Unlike prior approaches that assume optimal LLM-generated plans or rewards, LLM-TALE corrects suboptimality online and explores multimodal affordance-level plans without human supervision. We evaluate LLM-TALE on pick-and-place tasks in standard RL benchmarks, observing improvements in both sample efficiency and success rates over strong baselines. Real-robot experiments indicate promising zero-shot sim-to-real transfer. Code and supplementary material are available at llm-tale.github.io.
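
One way to picture the online correction described above, as a hedged sketch rather than the paper's actual algorithm: treat each LLM-proposed affordance candidate as an arm of a Bernoulli bandit. Candidates that lead to successful episodes gain probability mass, so a semantically plausible but physically infeasible suggestion is downweighted automatically, with no human-provided reward. The `AffordanceSelector` class and the simulated outcomes below are assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch (not the paper's method): Thompson sampling over
# LLM-proposed affordance candidates, updated from episode success alone.

class AffordanceSelector:
    def __init__(self, n_candidates, prior=1.0):
        # Beta(prior, prior) posterior over each candidate's success rate.
        self.alpha = np.full(n_candidates, prior)
        self.beta = np.full(n_candidates, prior)

    def sample(self):
        """Pick the candidate whose sampled success probability is highest."""
        draws = np.random.beta(self.alpha, self.beta)
        return int(np.argmax(draws))

    def update(self, idx, success):
        """Update the chosen candidate's posterior from the episode outcome."""
        self.alpha[idx] += success
        self.beta[idx] += 1.0 - success

# Usage: two grasp candidates, the second physically infeasible (never works).
selector = AffordanceSelector(n_candidates=2)
for _ in range(100):
    i = selector.sample()
    success = (i == 0) and (np.random.rand() < 0.8)  # simulated outcomes
    selector.update(i, float(success))
print(selector.alpha / (selector.alpha + selector.beta))  # feasible arm dominates
```

Thompson sampling is just one plausible mechanism here; the key point the sketch conveys is that plan quality is estimated from the agent's own outcomes, so the system need not assume the LLM's plan is optimal.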