ArtifactWorld: Scaling 3D Gaussian Splatting Artifact Restoration via Video Generation Models

arXiv cs.CV / 4/15/2026


Key Points

  • ArtifactWorld is a research framework for restoring degraded 3D Gaussian Splatting (3DGS) results under sparse-view conditions, targeting the temporal incoherence, missing spatial constraints, and limited training data that hamper prior generative restoration approaches.
  • The work introduces a fine-grained taxonomy of 3DGS artifact types and builds a large-scale training set of 107.5K paired video clips to improve robustness and generalization across real-world artifact distributions.
  • It unifies restoration using a video diffusion backbone plus an artifact heatmap produced by an isomorphic predictor to localize structural defects.
  • An Artifact-Aware Triplet Fusion mechanism then guides intensity-directed spatio-temporal repair within native self-attention, reducing multi-view inconsistencies and geometric hallucinations.
  • Experiments report state-of-the-art performance for sparse novel view synthesis and more robust 3D reconstruction, with code and dataset planned for public release.

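The paper does not release the exact formulation of Artifact-Aware Triplet Fusion, but the idea of letting an artifact heatmap steer repair "within native self-attention" can be illustrated with a minimal single-head sketch: a per-token artifact intensity biases the attention logits so that queries draw context preferentially from clean regions. All names and the bias strength below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def artifact_aware_attention(q, k, v, heatmap, bias_strength=4.0):
    """Single-head self-attention with artifact-intensity biasing.

    heatmap: per-token artifact intensity in [0, 1]. Keys flagged as
    artifacts receive a negative logit bias, so restoration pulls
    context from clean tokens first. This is a hypothetical stand-in
    for the paper's intensity-guided fusion, whose details are not
    public.
    """
    d = q.shape[-1]
    logits = (q @ k.T) / np.sqrt(d)              # (N, N) attention logits
    logits = logits - bias_strength * heatmap[None, :]  # down-weight artifact keys
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)        # row-wise softmax
    return w @ v                                 # fused token features

# Toy usage: 4 tokens, 8-dim features; token 2 is heavily corrupted.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = artifact_aware_attention(x, x, x, heatmap=np.array([0.0, 0.1, 0.9, 0.0]))
print(out.shape)
```

In a video diffusion backbone this bias would be applied per frame and per head inside the existing spatio-temporal attention layers, which is what makes the mechanism "native" rather than a bolted-on module.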
Abstract

3D Gaussian Splatting (3DGS) delivers high-fidelity real-time rendering but suffers from geometric and photometric degradations under sparse-view constraints. Current generative restoration approaches are often limited by insufficient temporal coherence, an absence of explicit spatial constraints, and a shortage of large-scale training data, resulting in multi-view inconsistencies, erroneous geometric hallucinations, and limited generalization to diverse real-world artifact distributions. In this paper, we present ArtifactWorld, a framework that tackles 3DGS artifact repair through systematic data expansion and a homogeneous dual-model paradigm. To address the data bottleneck, we establish a fine-grained phenomenological taxonomy of 3DGS artifacts and construct a comprehensive training set of 107.5K diverse paired video clips to enhance model robustness. Architecturally, we unify the restoration process within a video diffusion backbone, utilizing an isomorphic predictor to localize structural defects via an artifact heatmap. This heatmap then guides the restoration through an Artifact-Aware Triplet Fusion mechanism, enabling precise, intensity-guided spatio-temporal repair within native self-attention. Extensive experiments demonstrate that ArtifactWorld achieves state-of-the-art performance in sparse novel view synthesis and robust 3D reconstruction. Code and dataset will be made public.
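The "systematic data expansion" step pairs clean clips with synthetically degraded ones drawn from an artifact taxonomy. The actual categories and degradation operators behind the 107.5K-clip set are not public; the sketch below only shows the general pattern with two hypothetical operators (floater-like blobs and over-smoothing blur) applied frame by frame.

```python
import numpy as np

def add_floaters(frame, rng, n=20):
    """Hypothetical operator: semi-transparent blobs mimicking floating Gaussians."""
    h, w, _ = frame.shape
    out = frame.copy()
    for _ in range(n):
        cy, cx = rng.integers(0, h), rng.integers(0, w)
        r = rng.integers(2, 8)
        ys, xs = np.ogrid[:h, :w]
        mask = (ys - cy) ** 2 + (xs - cx) ** 2 <= r * r
        out[mask] = 0.5 * out[mask] + 0.5 * rng.random(3)  # blend in a random color
    return out

def add_blur(frame, rng, k=5):
    """Hypothetical operator: box blur approximating over-smoothed reconstructions."""
    pad = k // 2
    p = np.pad(frame, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(frame)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + frame.shape[0], dx:dx + frame.shape[1]]
    return out / (k * k)

def make_pair(clean_clip, rng):
    """Build a (degraded, clean) training pair from a clean clip of shape (T, H, W, 3)."""
    ops = rng.choice([add_floaters, add_blur], size=2, replace=False)
    degraded = clean_clip.copy()
    for t in range(degraded.shape[0]):
        for op in ops:
            degraded[t] = op(degraded[t], rng)
    return degraded, clean_clip
```

A real pipeline would render clips from sparse-view 3DGS fits rather than corrupt frames in image space, and would also record which artifact classes were applied so the heatmap predictor has supervision.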