
ReLaGS: Relational Language Gaussian Splatting

arXiv cs.CV / 3/19/2026


Key Points

  • Introduces ReLaGS, a framework for unified 3D perception and reasoning across segmentation, retrieval, and relation understanding without scene-specific training.
  • Proposes a hierarchical language-distilled Gaussian scene and a 3D semantic scene graph, featuring Gaussian pruning to refine geometry and multi-view language alignment to map noisy 2D features into robust 3D object embeddings.
  • Builds open-vocabulary 3D scene graphs from vision-language-derived annotations and Graph Neural Network (GNN)-based relational reasoning, covering both inter- and intra-object relations at scale.
  • Validated on open-vocabulary segmentation, scene graph generation, and relation-guided retrieval. Project page: https://dfki-av.github.io/ReLaGS/
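
To make the idea of relation-guided retrieval over a scene graph concrete, here is a minimal sketch. It is not the ReLaGS implementation: the `SceneGraph` class, its methods, and the toy labels are hypothetical, and exact string matching stands in for the open-vocabulary embedding comparison a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy scene graph: objects are nodes, (subject, predicate, object) triples are edges."""
    labels: dict = field(default_factory=dict)     # node id -> object label
    relations: list = field(default_factory=list)  # (subject id, predicate, object id)

    def add_object(self, node_id, label):
        self.labels[node_id] = label

    def add_relation(self, subj_id, predicate, obj_id):
        self.relations.append((subj_id, predicate, obj_id))

    def retrieve(self, subject_label, predicate, object_label):
        """Return ids of subjects whose label and outgoing relation match the query.

        Assumption: exact string equality replaces embedding similarity here.
        """
        return [
            s for s, p, o in self.relations
            if p == predicate
            and self.labels[s] == subject_label
            and self.labels[o] == object_label
        ]


graph = SceneGraph()
graph.add_object(0, "mug")
graph.add_object(1, "mug")
graph.add_object(2, "table")
graph.add_object(3, "shelf")
graph.add_relation(0, "on", 2)  # mug 0 is on the table
graph.add_relation(1, "on", 3)  # mug 1 is on the shelf

# Query: "the mug on the table" selects mug 0 and skips mug 1.
matches = graph.retrieve("mug", "on", "table")
```

The point of the relation is disambiguation: a label-only query for "mug" would return both mugs, while the relation narrows the result to the one on the table.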

Abstract

Achieving unified 3D perception and reasoning across tasks such as segmentation, retrieval, and relation understanding remains challenging, as existing methods are either object-centric or rely on costly training for inter-object reasoning. We present a novel framework that constructs a hierarchical language-distilled Gaussian scene and its 3D semantic scene graph without scene-specific training. A Gaussian pruning mechanism refines scene geometry, while a robust multi-view language alignment strategy aggregates noisy 2D features into accurate 3D object embeddings. On top of this hierarchy, we build an open-vocabulary 3D scene graph with Vision Language derived annotations and Graph Neural Network-based relational reasoning. Our approach enables efficient and scalable open-vocabulary 3D reasoning by jointly modeling hierarchical semantics and inter/intra-object relationships, validated across tasks including open-vocabulary segmentation, scene graph generation, and relation-guided retrieval. Project page: https://dfki-av.github.io/ReLaGS/
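
The abstract does not spell out how the multi-view language alignment aggregates noisy 2D features, so the following is only an illustrative sketch of one common robust scheme: average the per-view features, discard views that disagree with that consensus, then re-average. The function names, the `min_similarity` threshold, and the toy vectors are all assumptions and are not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def aggregate_views(view_features, min_similarity=0.5):
    """Fuse per-view 2D features of one object into a single embedding.

    Hypothetical scheme: consensus mean, outlier rejection by cosine
    similarity to the consensus, then a second mean over the inliers.
    Returns the unit-length embedding and the number of views kept.
    """
    feats = [normalize(f) for f in view_features]
    consensus = normalize(mean_vector(feats))
    inliers = [f for f in feats if cosine(f, consensus) >= min_similarity]
    if not inliers:  # fall back to all views rather than return nothing
        inliers = feats
    return normalize(mean_vector(inliers)), len(inliers)


views = [
    [1.0, 0.1, 0.0],
    [0.9, 0.0, 0.1],
    [1.0, 0.0, 0.0],
    [-1.0, 0.0, 0.0],  # noisy view pointing the opposite way, e.g. from a bad mask
]
embedding, kept = aggregate_views(views)
```

In this toy input the three consistent views survive and the opposing view is rejected, so the fused embedding stays close to the direction the consistent views share.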