PointRFT: Explicit Reinforcement Fine-tuning for Point Cloud Few-shot Learning

arXiv cs.CV / 3/26/2026


Key Points

  • The paper introduces PointRFT, a reinforcement fine-tuning framework specifically designed for 3D point cloud representation learning under few-shot classification settings.
  • It adapts reward design ideas from RL-enhanced LLM training by proposing dedicated accuracy and dispersion reward functions to stabilize training and reduce distribution shift.
  • Experiments across three common 3D foundation models show PointRFT consistently beats vanilla supervised fine-tuning (SFT) on multiple benchmarks.
  • The authors also find that integrating PointRFT into a hybrid Pretraining-SFT-RFT pipeline significantly improves representational capacity, delivering state-of-the-art results especially when training data is scarce.

Abstract

Understanding spatial dynamics and semantics in point clouds is fundamental for comprehensive 3D comprehension. While reinforcement learning algorithms such as Group Relative Policy Optimization (GRPO) have recently achieved remarkable breakthroughs in large language models by incentivizing reasoning capabilities through strategic reward design, their potential remains largely unexplored in the 3D perception domain. This naturally raises a pivotal question: Can RL-based methods effectively empower 3D point cloud fine-tuning? In this paper, we propose PointRFT, the first reinforcement fine-tuning paradigm tailored specifically for point cloud representation learning. We select three prevalent 3D foundation models and devise specialized accuracy and dispersion reward functions to stabilize training and mitigate distribution shift. Through comprehensive few-shot classification experiments comparing distinct training paradigms, we demonstrate that PointRFT consistently outperforms vanilla supervised fine-tuning (SFT) across diverse benchmarks. Furthermore, when organically integrated into a hybrid Pretraining-SFT-RFT paradigm, the representational capacity of point cloud foundation models is substantially unleashed, achieving state-of-the-art performance particularly under data-scarce scenarios.
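To make the GRPO-style reward machinery concrete, the sketch below shows the group-relative advantage normalization that GRPO is known for, together with one *hypothetical* way an accuracy reward and a dispersion reward could be combined. The paper summary does not specify PointRFT's actual reward formulas, so the entropy-based dispersion term and the weights here are illustrative assumptions, not the authors' method.

```python
import numpy as np

def group_relative_advantages(rewards):
    # GRPO's core idea: normalize each rollout's reward against the
    # mean and std of its sampled group, instead of a learned critic.
    rewards = np.asarray(rewards, dtype=float)
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

def total_reward(pred_label, true_label, logits,
                 acc_weight=1.0, disp_weight=0.1):
    # Hypothetical combination (NOT the paper's exact design):
    # - accuracy reward: 1 if the predicted class is correct, else 0
    # - dispersion reward: entropy of the predicted distribution,
    #   discouraging over-confident, collapsed predictions (one
    #   plausible reading of "mitigating distribution shift").
    acc = 1.0 if pred_label == true_label else 0.0
    probs = np.exp(logits - logits.max())  # stable softmax
    probs /= probs.sum()
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    return acc_weight * acc + disp_weight * entropy
```

In a GRPO loop, one would sample a group of predictions per input, score each with `total_reward`, and weight the policy-gradient update by `group_relative_advantages` of those scores.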
