Deformation-based In-Context Learning for Point Cloud Understanding

arXiv cs.CV / 4/6/2026


Key Points

  • The paper introduces DeformPIC, a deformation-based framework for point cloud In-Context Learning that replaces Masked Point Modeling (MPM)-style masked reconstruction with learned deformations guided by prompts.
  • It argues that MPM-based methods lack geometric priors and face a training–inference mismatch because they rely on target-side information unavailable during inference.
  • DeformPIC instead performs explicit geometric reasoning by deforming a query point cloud under task-specific prompt guidance, aligning the learning objective more closely with inference-time behavior.
  • Experiments report consistent state-of-the-art improvements, including average Chamfer Distance reductions of 1.6 (reconstruction), 1.8 (denoising), and 4.7 (registration) versus prior approaches.
  • The authors also propose a new out-of-domain benchmark for generalization across unseen data distributions, where DeformPIC attains state-of-the-art results.
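The Chamfer Distance figures quoted above refer to the standard symmetric set-to-set metric: for each point in one cloud, take the (squared) distance to its nearest neighbor in the other cloud, average, and sum both directions. The helper below is an illustrative NumPy sketch of that metric, not the paper's evaluation code; conventions vary (squared vs. unsquared distances, sum vs. mean), so the exact variant used in the paper is an assumption here.

```python
import numpy as np

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point sets p (N, 3) and q (M, 3).

    Uses mean squared nearest-neighbor distance in both directions,
    one common convention among several in the literature.
    """
    # Pairwise squared distances via broadcasting -> shape (N, M)
    d2 = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    # Nearest neighbor from p to q, plus nearest neighbor from q to p
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())
```

For identical clouds the distance is exactly zero, and it grows as the two sets diverge, which is why reductions in average Chamfer Distance (as reported for reconstruction, denoising, and registration) indicate closer agreement with the ground-truth point cloud.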

Abstract

Recent advances in point cloud In-Context Learning (ICL) have demonstrated strong multitask capabilities. Existing approaches typically adopt a Masked Point Modeling (MPM)-based paradigm for point cloud ICL. However, MPM-based methods directly predict the target point cloud from masked tokens without leveraging geometric priors, requiring the model to infer spatial structure and geometric details solely from token-level correlations via transformers. Additionally, these methods suffer from a training-inference objective mismatch, as the model learns to predict the target point cloud using target-side information that is unavailable at inference time. To address these challenges, we propose DeformPIC, a deformation-based framework for point cloud ICL. Unlike existing approaches that rely on masked reconstruction, DeformPIC learns to deform the query point cloud under task-specific guidance from prompts, enabling explicit geometric reasoning and consistent objectives. Extensive experiments demonstrate that DeformPIC consistently outperforms previous state-of-the-art methods, achieving reductions of 1.6, 1.8, and 4.7 points in average Chamfer Distance on reconstruction, denoising, and registration tasks, respectively. Furthermore, we introduce a new out-of-domain benchmark to evaluate generalization across unseen data distributions, where DeformPIC achieves state-of-the-art performance.
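The core contrast in the abstract is between predicting a target cloud from masked tokens and predicting a *deformation* of the query, so that the same objective applies at training and inference. The toy sketch below illustrates that framing only; the names `deform` and `predict_offsets`, the zero-offset "model", and the denoising setup are all hypothetical stand-ins, not DeformPIC's actual architecture (which conditions on prompt pairs, per the abstract).

```python
import numpy as np

def predict_offsets(query: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for the learned model. A real model would
    # condition on the in-context prompt (an input/target example pair);
    # here we just return zero offsets as an untrained placeholder.
    return np.zeros_like(query)

def deform(query: np.ndarray, offsets: np.ndarray) -> np.ndarray:
    # Deformation-based prediction: output = query + per-point offsets.
    # The query's geometry is reused explicitly, and the same computation
    # runs at training and inference (no target-side information needed).
    return query + offsets

rng = np.random.default_rng(0)
target = rng.normal(size=(64, 3))                   # clean cloud (toy denoising task)
query = target + 0.05 * rng.normal(size=(64, 3))    # noisy query to be deformed

pred = deform(query, predict_offsets(query))
# Training would push predict_offsets toward (target - query) under a
# Chamfer-style loss; a perfect model recovers the target exactly:
ideal = deform(query, target - query)
```

The design point this illustrates is the consistency claim: because the model only ever maps a query (plus prompt) to offsets, nothing it consumes at training time is missing at inference time, unlike masked reconstruction where target-side tokens inform the training objective.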