GroundedPlanBench: Spatially grounded long-horizon task planning for robot manipulation

Vision-language models (VLMs) use images and text to plan robot actions, but they still struggle to decide what actions to take and where to take them. Most systems split these decisions into two steps: a VLM generates a plan in natural language, and a separate model translates it into executable actions. This approach often breaks […]
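To make the two-step decomposition concrete, here is a minimal sketch of the "plan in language, then ground to actions" pipeline the excerpt describes. Every name in it (`generate_plan`, `ground_step`, `Action`, `plan_and_ground`) is a hypothetical placeholder invented for illustration, not an API from GroundedPlanBench or the Microsoft Research post, and both stages return hard-coded values so the sketch runs on its own.

```python
# Hypothetical sketch of a two-stage "plan, then ground" pipeline.
# Stage 1: a VLM turns an image + instruction into a natural-language plan.
# Stage 2: a separate model translates each step into an executable,
# spatially grounded action. All names and outputs are illustrative stubs.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Action:
    """An executable, spatially grounded primitive."""
    skill: str                              # e.g. "pick" or "place"
    target: str                             # object the skill acts on
    position: Tuple[float, float, float]    # where in the scene to act


def generate_plan(image, instruction: str) -> List[str]:
    """Stage 1 (stubbed): produce a natural-language plan, one step per entry.

    A real system would query a vision-language model here.
    """
    return ["pick up the red block", "place it on the blue tray"]


def ground_step(image, step: str) -> Action:
    """Stage 2 (stubbed): map one language step to an executable action.

    A real grounding model would localize the target object in the image
    to choose the position; this stub returns fixed coordinates.
    """
    if "pick" in step:
        return Action("pick", "red block", (0.42, -0.10, 0.05))
    return Action("place", "blue tray", (0.20, 0.30, 0.08))


def plan_and_ground(image, instruction: str) -> List[Action]:
    """Run stage 1, then ground each plan step independently."""
    return [ground_step(image, s) for s in generate_plan(image, instruction)]


if __name__ == "__main__":
    for action in plan_and_ground(image=None, instruction="put the red block on the tray"):
        print(action)
```

The point of the sketch is the interface, not the stubs: stage 1 reasons only in language and stage 2 supplies the spatial grounding, so nothing in the hand-off forces the two decisions to stay consistent, which is the split the excerpt flags as fragile.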

