ReMAP-DP: Reprojected Multi-view Aligned PointMaps for Diffusion Policy

arXiv cs.RO / 3/23/2026

Key Points

ReMAP-DP proposes a structure-aware dual-stream diffusion policy that fuses re-projected views with pixel-aligned PointMaps and learnable modality embeddings to jointly leverage frozen semantic features and explicit geometric descriptors for precise patch-level alignment.
It targets two weaknesses of existing 3D integration methods: the structural irregularity of sparse point clouds and the geometric distortion introduced by multi-view orthographic rendering. In their place it uses standardized perspective reprojection paired with pixel-aligned PointMaps.
On RoboTwin 2.0, ReMAP-DP reaches a 59.3% average success rate (+6.6% over the DP3 baseline). On ManiSkill 3, it improves on DP3 by 28% on the geometrically challenging Stack Cube task. The authors also report real-world robustness and data efficiency from only a handful of demonstrations.
Experiments cover both simulation and real-world environments. A project page is available at https://icr-lab.github.io/ReMAP-DP/
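
To make "pixel-aligned PointMap" concrete, here is a minimal sketch of one common way to build such a map: back-projecting a depth image through a pinhole camera model so that every pixel carries its own 3D coordinate. This is an illustration under stated assumptions, not the authors' pipeline; the function names (`depth_to_pointmap`, `pool_to_patches`), the intrinsics, and the mean-pooling step are all hypothetical.

```python
import numpy as np


def depth_to_pointmap(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth image into an (H, W, 3) camera-frame pointmap.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); these parameters are illustrative, not from the paper.
    """
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)


def pool_to_patches(pointmap, patch):
    """Average XYZ over non-overlapping patch x patch windows.

    Hypothetical helper: yields one geometric descriptor per image patch,
    on the same grid a patch-based image encoder would use.
    """
    h, w, c = pointmap.shape
    gh, gw = h // patch, w // patch
    cropped = pointmap[: gh * patch, : gw * patch]
    return cropped.reshape(gh, patch, gw, patch, c).mean(axis=(1, 3))


# Toy example: a flat surface 2 units in front of the camera.
depth = np.full((4, 4), 2.0)
pointmap = depth_to_pointmap(depth, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
patch_xyz = pool_to_patches(pointmap, patch=2)
print(pointmap.shape, patch_xyz.shape)  # (4, 4, 3) (2, 2, 3)
```

Because the pointmap shares the image's pixel grid, a patch at a given row and column in the RGB view and in the pointmap refers to the same region of the scene, which is what makes patch-level alignment between the two streams possible.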

Abstract

Generalist robot policies built upon 2D visual representations excel at semantic reasoning but inherently lack the explicit 3D spatial awareness required for high-precision tasks. Existing 3D integration methods struggle to bridge this gap due to the structural irregularity of sparse point clouds and the geometric distortion introduced by multi-view orthographic rendering. To overcome these barriers, we present ReMAP-DP, a novel framework synergizing standardized perspective reprojection with a structure-aware dual-stream diffusion policy. By coupling the re-projected views with pixel-aligned PointMaps, our dual-stream architecture leverages learnable modality embeddings to fuse frozen semantic features and explicit geometric descriptors, ensuring precise implicit patch-level alignment. Extensive experiments across simulation and real-world environments demonstrate ReMAP-DP's superior performance in diverse manipulation tasks. On RoboTwin 2.0, it attains a 59.3% average success rate, outperforming the DP3 baseline by +6.6%. On ManiSkill 3, our method yields a 28% improvement over DP3 on the geometrically challenging Stack Cube task. Furthermore, ReMAP-DP exhibits remarkable real-world robustness, executing high-precision and dynamic manipulations with superior data efficiency from only a handful of demonstrations. Project page is available at: https://icr-lab.github.io/ReMAP-DP/
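
The dual-stream fusion described in the abstract can be sketched roughly as follows. This is a hedged illustration, not the paper's architecture: the abstract does not say how the streams are combined, so this sketch assumes each stream is linearly projected to a shared width, tagged with a modality embedding, given a shared per-patch positional embedding, and concatenated along the token axis. All names and dimensions are hypothetical, and random arrays stand in for parameters that would be learned in practice.

```python
import numpy as np

rng = np.random.default_rng(0)


def fuse_streams(sem_tokens, geo_tokens, w_sem, w_geo, pos_emb, mod_emb):
    """Fuse per-patch semantic and geometric tokens into one sequence.

    sem_tokens: (N, d_sem) features from a frozen image encoder (assumed).
    geo_tokens: (N, d_geo) per-patch geometric descriptors (assumed).
    pos_emb:    (N, d_model) shared by both streams, so tokens from the
                same patch carry the same positional signal.
    mod_emb:    (2, d_model) one embedding per modality.
    """
    sem = sem_tokens @ w_sem + pos_emb + mod_emb[0]
    geo = geo_tokens @ w_geo + pos_emb + mod_emb[1]
    # Concatenate along the token axis: result is (2N, d_model).
    return np.concatenate([sem, geo], axis=0)


# Hypothetical sizes chosen only for the demo.
n_patches, d_sem, d_geo, d_model = 16, 32, 3, 8
sem_tokens = rng.normal(size=(n_patches, d_sem))
geo_tokens = rng.normal(size=(n_patches, d_geo))
w_sem = rng.normal(size=(d_sem, d_model)) * 0.1
w_geo = rng.normal(size=(d_geo, d_model)) * 0.1
pos_emb = rng.normal(size=(n_patches, d_model)) * 0.1
mod_emb = rng.normal(size=(2, d_model)) * 0.1

fused = fuse_streams(sem_tokens, geo_tokens, w_sem, w_geo, pos_emb, mod_emb)
print(fused.shape)  # (32, 8)
```

Sharing the positional embedding across streams is one simple way to let a downstream attention layer associate the semantic and geometric tokens of the same patch; the modality embedding then tells it which stream each token came from.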
