Rendering Multi-Human and Multi-Object with 3D Gaussian Splatting

arXiv cs.CV / 4/6/2026


Key Points

  • The paper addresses “Multi-Human Multi-Object” (MHMO) rendering, aiming to reconstruct dynamic scenes with multiple interacting people and objects from sparse-view inputs for applications like robotics and VR/AR digital twins.
  • It identifies two core challenges: maintaining view-consistent representations for each instance under heavy mutual occlusion, and explicitly modeling combinatorial dependencies created by inter-instance interactions.
  • To tackle this, the authors propose MM-GS, a hierarchical framework based on 3D Gaussian Splatting with a per-instance multi-view fusion step for consistent instance representations.
  • MM-GS also introduces a scene-level instance interaction module that uses a global scene graph to reason about relationships and refine instance attributes to better capture subtle contact and interaction effects.
  • Experiments on challenging datasets show the method achieves state-of-the-art performance, improving over strong baselines with higher-fidelity details and more plausible inter-instance contacts.
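The paper does not include code, but the per-instance multi-view fusion idea in the bullets above can be sketched as a visibility-weighted average of per-view instance features, so that heavily occluded views contribute less. Everything here is an illustrative assumption: the function name, the feature shapes, and the simple linear weighting are not the authors' implementation, which presumably uses learned fusion.

```python
import numpy as np

def fuse_instance_features(view_feats, visibility):
    """Visibility-weighted fusion of per-view features for ONE instance.

    view_feats: (V, D) array -- one feature vector per camera view
    visibility: (V,) array   -- fraction of the instance visible per view
    Returns a single (D,) fused feature; occluded views are down-weighted,
    which is one simple way to keep the representation view-consistent.
    """
    w = visibility / (visibility.sum() + 1e-8)  # normalize weights over views
    return w @ view_feats                       # weighted average, shape (D,)

# Toy example: 3 views, 4-dim features; view 1 is almost fully occluded
# and carries unreliable features.
feats = np.array([[1.0, 0.0, 0.0, 0.0],
                  [9.0, 9.0, 9.0, 9.0],   # occluded, noisy view
                  [1.0, 0.0, 0.0, 0.0]])
vis = np.array([1.0, 0.05, 1.0])
fused = fuse_instance_features(feats, vis)
# The fused feature stays close to the two reliable views.
```

A learned variant would replace the fixed weights with attention scores, but the occlusion-robustness intuition is the same.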

Abstract

Reconstructing dynamic scenes with multiple interacting humans and objects from sparse-view inputs is a critical yet challenging task, essential for creating high-fidelity digital twins for robotics and VR/AR. This problem, which we term Multi-Human Multi-Object (MHMO) rendering, presents two significant obstacles: achieving view-consistent representations for individual instances under severe mutual occlusion, and explicitly modeling the complex and combinatorial dependencies that arise from their interactions. To overcome these challenges, we propose MM-GS, a novel hierarchical framework built upon 3D Gaussian Splatting. Our method first employs a Per-Instance Multi-View Fusion module to establish a robust and consistent representation for each instance by aggregating visual information across all available views. Subsequently, a Scene-Level Instance Interaction module operates on a global scene graph to reason about relationships between all participants, refining their attributes to capture subtle interaction effects. Extensive experiments on challenging datasets demonstrate that our method significantly outperforms strong baselines, producing state-of-the-art results with high-fidelity details and plausible inter-instance contacts.
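As a rough illustration of the scene-level interaction module the abstract describes, one could build a graph over instances (e.g. connecting instances whose centers are close, where contacts are likely) and run message passing that refines each instance's attributes from its neighbors. The proximity rule, the step size, and the mean-based update below are all placeholder assumptions standing in for the paper's learned scene-graph reasoning.

```python
import numpy as np

def build_scene_graph(centers, radius):
    """Directed edges between instances whose centers lie within `radius`.

    centers: (N, 3) array of per-instance 3D centers.
    Proximity is a crude proxy for potential contact/interaction.
    """
    n = len(centers)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and np.linalg.norm(centers[i] - centers[j]) < radius]

def refine_attributes(attrs, edges, step=0.1):
    """One round of message passing over the scene graph.

    attrs: (N, D) per-instance attribute vectors.
    Each node is nudged toward the mean of its neighbors' attributes --
    a hand-written stand-in for a learned graph-network update.
    """
    out = attrs.copy()
    for i in range(len(attrs)):
        nbrs = [j for (a, j) in edges if a == i]
        if nbrs:
            msg = attrs[nbrs].mean(axis=0)        # aggregate neighbor messages
            out[i] = attrs[i] + step * (msg - attrs[i])
    return out

# Toy scene: instances 0 and 1 are in contact range; instance 2 is isolated.
centers = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [10.0, 0.0, 0.0]])
attrs = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
edges = build_scene_graph(centers, radius=1.0)
refined = refine_attributes(attrs, edges)
# Interacting instances exchange information; the isolated one is untouched.
```

In the actual method, the refined attributes would be the 3D Gaussian parameters themselves, so the update can correct interpenetration and capture subtle contact effects.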