MASS: Mesh-inellipse Aligned Deformable Surfel Splatting for Hand Reconstruction and Rendering from Egocentric Monocular Video

arXiv cs.CV / 4/13/2026


Key Points

  • The paper proposes MASS (Mesh-inellipse Aligned deformable Surfel Splatting) to reconstruct high-fidelity 3D hands from egocentric monocular video, where high-resolution geometry is hard to capture and hands interact with, or carry, complex objects.
  • MASS represents the hand surface with deformable 2D Gaussian surfels. They are initialized from a coarse parametric hand mesh through a mesh-aligned Steiner inellipse conversion followed by fractal densification, giving a high-resolution surface representation with photorealistic rendering potential.
  • It introduces Gaussian Surfel Deformation to model hand deformations and personalized appearance by predicting residual updates to surfel attributes and using an opacity mask to refine geometry/texture without adaptive density control.
  • The method uses a two-stage training strategy and a novel binding loss to improve optimization robustness and reconstruction quality.
  • Experiments on the ARCTIC, Hand Appearance, and InterHand2.6M datasets show MASS outperforming state-of-the-art methods in reconstruction quality.
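
To make the mesh-to-surfel idea in the second point concrete, here is a minimal sketch of fitting a Steiner inellipse to one mesh triangle. The Steiner inellipse is the unique ellipse tangent to the three sides of a triangle at their midpoints, centered at the centroid. The function name `steiner_inellipse` and the choice to recover the axes with an SVD are assumptions made for illustration; the summary does not say how the paper computes or parameterizes its surfels.

```python
import numpy as np

def steiner_inellipse(a, b, c):
    """Return (center, axes, radii, normal) of the Steiner inellipse of a 3D triangle.

    Illustrative helper, not the paper's implementation.
    axes  : (2, 3) array, unit directions of the two principal axes
    radii : (2,) array, semi-axis lengths (largest first)
    """
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    center = (a + b + c) / 3.0  # the inellipse is centered at the centroid

    # Two conjugate half-diameters: the ellipse is
    #   center + f1 * cos(t) + f2 * sin(t),  t in [0, 2*pi)
    f1 = (a - center) / 2.0
    f2 = (b - c) / (2.0 * np.sqrt(3.0))

    # The SVD of [f1 f2] turns the conjugate half-diameters into
    # orthogonal principal axes (left singular vectors) and radii.
    m = np.stack([f1, f2], axis=1)  # shape (3, 2)
    u, radii, _ = np.linalg.svd(m, full_matrices=False)

    normal = np.cross(b - a, c - a)
    normal /= np.linalg.norm(normal)  # assumes a non-degenerate triangle
    return center, u.T, radii, normal
```

For an equilateral triangle the two radii coincide and equal the inradius, so the ellipse reduces to the incircle. A surfel built this way could take `center` as its position, `axes` and `normal` as its orientation, and `radii` as its 2D scale, though that mapping is again an assumption.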

Abstract

Reconstructing high-fidelity 3D hands from egocentric monocular videos remains a challenge due to the limitations in capturing high-resolution geometry, hand-object interactions, and complex objects on hands. Additionally, existing methods often incur high computational costs, making them impractical for real-time applications. In this work, we propose Mesh-inellipse Aligned deformable Surfel Splatting (MASS) to address these challenges by leveraging a deformable 2D Gaussian Surfel representation. First, we introduce the mesh-aligned Steiner Inellipse and fractal densification for mesh-to-surfel conversion, which initializes high-resolution 2D Gaussian surfels from coarse parametric hand meshes and provides a surface representation with photorealistic rendering potential. Second, we propose Gaussian Surfel Deformation, which enables efficient modeling of hand deformations and personalized features by predicting residual updates to surfel attributes and introducing an opacity mask to refine geometry and texture without adaptive density control. In addition, we propose a two-stage training strategy and a novel binding loss to improve optimization robustness and reconstruction quality. Extensive experiments on the ARCTIC dataset, the Hand Appearance dataset, and the InterHand2.6M dataset demonstrate that our model achieves superior reconstruction performance compared to state-of-the-art methods.
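
The residual-update idea in the abstract can be pictured with a small sketch. Everything concrete here is an assumption made for illustration: the attribute names (`center`, `scale`, `color`, `opacity`), the log-space scale residual, and the sigmoid mask are not taken from the paper. What the sketch keeps from the abstract is only the overall shape: a fixed set of surfels receives predicted residuals, and a per-surfel opacity mask fades surfels out instead of adding or pruning them through adaptive density control.

```python
import numpy as np

def deform_surfels(base, residual, mask_logit):
    """Apply residual updates and an opacity mask to a fixed set of N surfels.

    Hypothetical layout (not the paper's):
      base["center"]  (N, 3)   base["scale"]   (N, 2)
      base["color"]   (N, 3)   base["opacity"] (N,)
      residual["center"] (N, 3), residual["log_scale"] (N, 2), residual["color"] (N, 3)
      mask_logit (N,)  unconstrained value, squashed to (0, 1) by a sigmoid
    """
    mask = 1.0 / (1.0 + np.exp(-mask_logit))  # soft keep/suppress weight per surfel
    return {
        # additive offset moves each surfel off the coarse mesh surface
        "center": base["center"] + residual["center"],
        # multiplicative update keeps scales strictly positive
        "scale": base["scale"] * np.exp(residual["log_scale"]),
        # additive color residual, clamped to the valid range
        "color": np.clip(base["color"] + residual["color"], 0.0, 1.0),
        # the mask can only lower opacity; the surfel count never changes
        "opacity": base["opacity"] * mask,
    }
```

In a real pipeline the residuals and mask logits would come from a learned network conditioned on hand pose; here they are plain arrays so that the update rule itself is easy to inspect.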