3D Smoke Scene Reconstruction Guided by Vision Priors from Multimodal Large Language Models

arXiv cs.CV / 4/8/2026

Key Points

  • The paper addresses 3D scene reconstruction from smoke-degraded multi-view images, where smoke causes scattering, view-dependent appearance shifts, and poor cross-view consistency.
  • It proposes a framework combining enhanced visual inputs (via Nano-Banana-Pro) with a smoke-specific 3D modeling approach.
  • The core contribution is Smoke-GS, a medium-aware 3D Gaussian Splatting method that uses explicit 3D Gaussians plus a lightweight view-dependent “medium branch” to model direction-dependent smoke effects.
  • The approach aims to retain the rendering efficiency of standard 3D Gaussian Splatting while improving robustness and producing more consistent, visually clear novel views in smoke.
  • According to the abstract, the method generates consistent, visually clear novel views in challenging smoke environments; the abstract itself reports no quantitative metrics.
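
For context on the rendering efficiency mentioned above: standard 3D Gaussian Splatting colors a pixel by blending depth-sorted Gaussians front to back, C = Σ c_i α_i Π_{j<i} (1 − α_j). The sketch below illustrates only that blending rule for a single ray in NumPy. It is not code from the paper, and the function name `composite_front_to_back` is a hypothetical one chosen for illustration.

```python
import numpy as np

def composite_front_to_back(colors, alphas):
    """Blend depth-sorted samples along one ray (nearest sample first).

    colors: (N, 3) RGB per sample; alphas: (N,) opacity per sample in [0, 1].
    Returns the blended RGB and the transmittance left over behind the last sample.
    """
    colors = np.asarray(colors, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # Light reaching sample i is the product of (1 - alpha) over all samples in front of it.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = alphas * transmittance
    return weights @ colors, 1.0 - weights.sum()

# Two half-opaque samples: red in front, green behind.
rgb, leftover = composite_front_to_back([[1, 0, 0], [0, 1, 0]], [0.5, 0.5])
```

Because the blend is a single weighted sum over sorted primitives, it stays cheap to evaluate, which is the property the paper says it wants to retain.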

Abstract

Reconstructing 3D scenes from smoke-degraded multi-view images is particularly difficult because smoke introduces strong scattering effects, view-dependent appearance changes, and severe degradation of cross-view consistency. To address these issues, we propose a framework that integrates visual priors with efficient 3D scene modeling. We employ Nano-Banana-Pro to enhance smoke-degraded images, providing clearer visual observations for reconstruction, and we develop Smoke-GS, a medium-aware 3D Gaussian Splatting framework for smoke scene reconstruction and restoration-oriented novel view synthesis. Smoke-GS models the scene using explicit 3D Gaussians and introduces a lightweight view-dependent medium branch to capture direction-dependent appearance variations caused by smoke. Our method preserves the rendering efficiency of 3D Gaussian Splatting while improving robustness to smoke-induced degradation. Results demonstrate the effectiveness of our method for generating consistent and visually clear novel views in challenging smoke environments.
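
The abstract does not say how the medium branch is parameterized, so the following is only a rough sketch of what a "lightweight view-dependent medium branch" could look like. It assumes (this is not stated in the source) a tiny MLP that maps the viewing direction to a transmittance t and an airlight color A, combined with the clean Gaussian render J through the common scattering model I = t·J + (1 − t)·A. The names `MediumBranch` and `apply_medium`, the layer sizes, and the random weights are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class MediumBranch:
    """Hypothetical tiny MLP: view direction -> (transmittance, airlight RGB)."""

    def __init__(self, hidden=16):
        # Random weights stand in for parameters that would be learned in training.
        self.w1 = rng.normal(scale=0.5, size=(3, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(scale=0.5, size=(hidden, 4))
        self.b2 = np.zeros(4)

    def __call__(self, view_dirs):
        # Normalize so only the direction matters, not the vector length.
        d = view_dirs / np.linalg.norm(view_dirs, axis=-1, keepdims=True)
        h = np.maximum(d @ self.w1 + self.b1, 0.0)            # ReLU
        out = 1.0 / (1.0 + np.exp(-(h @ self.w2 + self.b2)))  # sigmoid keeps values in (0, 1)
        return out[..., :1], out[..., 1:]                     # t: (..., 1), airlight: (..., 3)

def apply_medium(clean_rgb, view_dirs, branch):
    """Assumed scattering model: I = t * J + (1 - t) * A."""
    t, airlight = branch(view_dirs)
    return t * clean_rgb + (1.0 - t) * airlight

branch = MediumBranch()
clean = np.array([[0.9, 0.2, 0.1], [0.9, 0.2, 0.1]])   # same scene color seen from two views
dirs = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
smoky = apply_medium(clean, dirs, branch)
```

Under this assumption, a training loop would compare `smoky` against the observed images, while a restoration-oriented novel view would simply be `clean`, the Gaussian render with the medium term left out.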