UltraG-Ray: Physics-Based Gaussian Ray Casting for Novel Ultrasound View Synthesis

arXiv cs.CV / 4/1/2026

Key Points

  • The paper proposes UltraG-Ray, a physics-based novel view synthesis method for ultrasound that builds the ultrasound image formation process into the simulation instead of relying purely on learned rendering.
  • UltraG-Ray represents the ultrasound scene using a learnable 3D Gaussian field and couples it with an efficient B-mode physics module that performs ultrasound-specific ray casting.
  • It explicitly encodes ultrasound-specific parameters such as attenuation and reflection into the Gaussian-based representation, which lets the ray casting scheme capture view-dependent attenuation effects.
  • The authors report consistent gains in image quality metrics over state-of-the-art approaches, including an increase of up to 15% in multi-scale structural similarity (MS-SSIM), which they present as evidence of more realistic synthesized images.
  • The work is positioned as useful for anatomically plausible view generation that can support clinician training and data augmentation under complex tissue conditions.
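
To make the Gaussian-field idea in the points above more concrete, here is a minimal sketch of how acoustic parameters could be read out of a set of 3D Gaussians. It is an illustration under stated assumptions, not the paper's implementation: the `Gaussian` class, the `field_at` function, the isotropic `sigma`, and the plain weighted sum are all hypothetical choices made for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    """One hypothetical scene primitive carrying acoustic parameters."""
    mean: tuple          # (x, y, z) center of the Gaussian
    sigma: float         # isotropic standard deviation (assumption: no full covariance)
    attenuation: float   # attenuation contribution at the center
    reflection: float    # reflection contribution at the center


def field_at(point, gaussians):
    """Return (attenuation, reflection) at `point`.

    Each Gaussian contributes its parameters scaled by an unnormalized
    Gaussian weight exp(-0.5 * d^2 / sigma^2), and contributions are summed.
    """
    attenuation = 0.0
    reflection = 0.0
    for g in gaussians:
        d2 = sum((p - m) ** 2 for p, m in zip(point, g.mean))
        weight = math.exp(-0.5 * d2 / g.sigma ** 2)
        attenuation += weight * g.attenuation
        reflection += weight * g.reflection
    return attenuation, reflection


# Toy scene: a weakly reflecting background blob and a small bright reflector.
scene = [
    Gaussian(mean=(0.0, 0.0, 0.0), sigma=1.0, attenuation=0.7, reflection=0.1),
    Gaussian(mean=(0.0, 0.0, 2.0), sigma=0.2, attenuation=0.2, reflection=0.9),
]
print(field_at((0.0, 0.0, 2.0), scene))
```

In a learnable setting the means, widths, and per-Gaussian parameters would be optimized against acquired frames; that training loop is outside the scope of this sketch.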

Abstract

Novel view synthesis (NVS) in ultrasound has gained attention as a technique for generating anatomically plausible views beyond the acquired frames, offering new capabilities for training clinicians or for data augmentation. However, current methods struggle with complex tissue and view-dependent acoustic effects. Physics-based NVS aims to address these limitations by incorporating the ultrasound image formation process into the simulation. Recent approaches combine a learnable implicit scene representation with an ultrasound-specific rendering module, yet a substantial gap between simulation and reality remains. In this work, we introduce UltraG-Ray, a novel ultrasound scene representation based on a learnable 3D Gaussian field, coupled to an efficient physics-based module for B-mode synthesis. We explicitly encode ultrasound-specific parameters, such as attenuation and reflection, into a Gaussian-based spatial representation and realize image synthesis within a novel ray casting scheme. In contrast to previous methods, this approach naturally captures view-dependent attenuation effects, thereby enabling the generation of physically informed B-mode images with increased realism. We compare our method to state-of-the-art methods and observe consistent gains in image quality metrics (up to a 15% increase in MS-SSIM), demonstrating a clear improvement in the realism of the synthesized ultrasound images.
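
The claim that ray casting naturally captures view-dependent attenuation can be illustrated with a toy example. The sketch below is an assumption-laden simplification, not the authors' rendering module: it uses a Beer-Lambert style two-way attenuation model, and the names `cast_ray` and `blob`, the 2D scene, and all numeric values are hypothetical. The same reflector is imaged from two probe positions, and its echo is much weaker when the ray has to cross an attenuating region first.

```python
import math


def cast_ray(origin, direction, attenuation_fn, reflection_fn,
             n_samples=200, step=0.01):
    """March along one scanline and return one echo amplitude per sample.

    Simplified model (assumption): the echo at sample i is the local
    reflection scaled by exp(-2 * accumulated attenuation), where the
    factor 2 accounts for the pulse traveling to the sample and back.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    echoes = []
    accumulated = 0.0  # attenuation integrated along the path so far
    for i in range(n_samples):
        t = (i + 0.5) * step
        x = [o + t * u for o, u in zip(origin, unit)]
        echoes.append(reflection_fn(x) * math.exp(-2.0 * accumulated))
        accumulated += attenuation_fn(x) * step
    return echoes


def blob(center, sigma, peak):
    """Return a scalar field shaped like an isotropic Gaussian."""
    def field(x):
        d2 = sum((a - c) ** 2 for a, c in zip(x, center))
        return peak * math.exp(-0.5 * d2 / sigma ** 2)
    return field


# Toy 2D scene: a reflector at (0, 1) and a strong absorber at (0, 0.5).
reflector = blob(center=(0.0, 1.0), sigma=0.05, peak=1.0)
absorber = blob(center=(0.0, 0.5), sigma=0.1, peak=5.0)

# View A looks along +y, so the pulse crosses the absorber before the reflector.
echo_a = max(cast_ray((0.0, 0.0), (0.0, 1.0), absorber, reflector))
# View B looks along -x from the side, so the path largely avoids the absorber.
echo_b = max(cast_ray((1.0, 1.0), (-1.0, 0.0), absorber, reflector))

print(f"echo behind absorber: {echo_a:.4f}, echo from the side: {echo_b:.4f}")
```

Because the attenuation is accumulated along each ray rather than stored per pixel, the brightness of a structure depends on what lies between it and the probe, which is the view-dependent behavior the abstract refers to.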