GaussianGrow: Geometry-aware Gaussian Growing from 3D Point Clouds with Text Guidance

arXiv cs.CV / 4/8/2026


Key Points

  • GaussianGrow is a new method for generating 3D Gaussian Splatting primitives directly from 3D point clouds, aiming to overcome the lack of geometric priors in existing approaches.
  • The approach uses a text-guided Gaussian growing scheme with a multi-view diffusion model to synthesize consistent appearances, improving supervision quality from the input point clouds.
  • To reduce artifacts when fusing neighboring views, GaussianGrow constrains novel-view generation at camera poses chosen from overlapping regions across different views.
  • For hard-to-observe areas, it iteratively detects camera poses by finding the largest un-grown regions and fills them via inpainting of rendered views using a pretrained 2D diffusion model.
  • The method is evaluated extensively on text-guided Gaussian generation from both synthetic and real-scanned point clouds.
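
The overlap-based view constraint in the third point can be illustrated with a small sketch. It is not the authors' implementation: the function names (`faces_camera`, `overlap_camera`), the `distance` parameter, and the normal-based visibility test (which ignores occlusion) are all assumptions made for illustration. Given two preset views, it finds the points both views observe and places a novel camera facing that shared region.

```python
import numpy as np

def faces_camera(points, normals, cam):
    """Crude visibility proxy (assumption): a point counts as seen when its
    normal points toward the camera. Occlusion is ignored."""
    to_cam = cam[None, :] - points
    return np.einsum("ij,ij->i", normals, to_cam) > 0.0

def overlap_camera(points, normals, cam_a, cam_b, distance=2.0):
    """Place a novel camera facing the region observed by both preset views.

    Returns (camera_position or None, boolean overlap mask).
    """
    overlap = faces_camera(points, normals, cam_a) & faces_camera(points, normals, cam_b)
    if not overlap.any():
        return None, overlap  # the two views share no points

    center = points[overlap].mean(axis=0)        # centroid of the shared region
    direction = normals[overlap].mean(axis=0)    # average outward direction
    norm = np.linalg.norm(direction)
    if norm < 1e-8:
        return None, overlap  # normals cancel out; no meaningful direction

    # Back the camera off from the shared region along its average normal.
    return center + distance * direction / norm, overlap
```

In a full pipeline, the returned pose would be where a constrained novel view is generated; here it only shows how a non-preset pose could be derived from the overlap of two views.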

Abstract

3D Gaussian Splatting has demonstrated superior performance in rendering efficiency and quality, yet generating 3D Gaussians remains a challenge without proper geometric priors. Existing methods have explored predicting point maps as geometric references for inferring Gaussian primitives, but unreliable estimated geometry can lead to poor generations. In this work, we introduce GaussianGrow, a novel approach that generates 3D Gaussians by learning to grow them from easily accessible 3D point clouds, naturally enforcing geometric accuracy in Gaussian generation. Specifically, we design a text-guided Gaussian growing scheme that leverages a multi-view diffusion model to synthesize consistent appearances from input point clouds for supervision. To mitigate artifacts caused by fusing neighboring views, we constrain novel views generated at non-preset camera poses identified in overlapping regions across different views. To complete hard-to-observe regions, we iteratively detect the camera pose that observes the largest un-grown region of the point cloud and fill that region by inpainting the rendered view with a pretrained 2D diffusion model. The process continues until complete Gaussians are generated. We extensively evaluate GaussianGrow on text-guided Gaussian generation from synthetic and even real-scanned point clouds. Project Page: https://weiqi-zhang.github.io/GaussianGrow
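
The iterative completion step described in the abstract resembles a greedy next-best-view loop, which the following minimal sketch illustrates. It is an assumption-laden illustration, not the paper's algorithm: the candidate cameras on a sphere, the normal-based visibility proxy (no occlusion handling), and the names `fibonacci_sphere`, `visible_mask`, and `select_views` are all hypothetical. The render, inpaint, and grow steps are replaced by simply marking the visible points as grown.

```python
import numpy as np

def fibonacci_sphere(n, radius=2.0):
    """Return n roughly evenly spaced positions on a sphere (candidate cameras)."""
    i = np.arange(n) + 0.5
    polar = np.arccos(1.0 - 2.0 * i / n)
    azimuth = np.pi * (1.0 + 5.0 ** 0.5) * i
    return radius * np.stack(
        [np.cos(azimuth) * np.sin(polar),
         np.sin(azimuth) * np.sin(polar),
         np.cos(polar)],
        axis=1,
    )

def visible_mask(points, normals, cam):
    """Crude visibility proxy (assumption): the point's normal faces the camera."""
    return np.einsum("ij,ij->i", normals, cam[None, :] - points) > 0.0

def select_views(points, normals, grown, cameras, max_iters=50):
    """Greedily pick the camera that sees the most un-grown points, repeatedly.

    Returns (list of chosen camera indices, final grown mask).
    """
    grown = grown.copy()  # do not mutate the caller's mask
    chosen = []
    # Precompute visibility for every candidate camera: shape (C, N).
    vis = np.stack([visible_mask(points, normals, c) for c in cameras])
    for _ in range(max_iters):
        if grown.all():
            break
        gains = (vis & ~grown[None, :]).sum(axis=1)  # un-grown points seen per camera
        best = int(gains.argmax())
        if gains[best] == 0:
            break  # remaining points are not visible from any candidate
        chosen.append(best)
        # Stand-in for: render this view, inpaint it with a 2D diffusion model,
        # and grow Gaussians on the newly covered points.
        grown |= vis[best]
    return chosen, grown
```

For example, on a unit sphere of points whose upper half is already grown, `select_views(pts, pts, pts[:, 2] > 0, fibonacci_sphere(32))` first picks a camera below the equator and continues until every point is marked grown. A real system would need an occlusion-aware visibility test in place of the normal check.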