WorldFlow3D: Flowing Through 3D Distributions for Unbounded World Generation

arXiv cs.AI / 4/1/2026


Key Points

  • The paper introduces WorldFlow3D, a flow-matching-based method for generating unbounded 3D worlds for scene modeling in vision, graphics, and robotics.
  • WorldFlow3D treats 3D generation as “flowing through” 3D data distributions using a latent-free approach rather than being restricted to conditional denoising.
  • The method first generates causal and accurate 3D structure, then uses this simpler structure as an intermediate distribution to guide the generation of more complex geometry and high-quality textures.
  • It provides controllability via vectorized scene layout conditions for geometric structure and scene attributes for texture control.
  • Experiments on real outdoor driving scenes and synthetic indoor scenes demonstrate cross-domain generalizability, faster convergence, and favorable generation fidelity versus tested baselines.

Abstract

Unbounded 3D world generation is emerging as a foundational task for scene modeling in computer vision, graphics, and robotics. In this work, we present WorldFlow3D, a novel method capable of generating unbounded 3D worlds. Building upon a foundational property of flow matching - namely, defining a path of transport between two data distributions - we model 3D generation more generally as a problem of flowing through 3D data distributions, not limited to conditional denoising. We find that our latent-free flow approach generates causal and accurate 3D structure, and can use this as an intermediate distribution to guide the generation of more complex structure and high-quality texture - all while converging more rapidly than existing methods. We enable controllability over generated scenes with vectorized scene layout conditions for geometric structure control and visual texture control through scene attributes. We confirm the effectiveness of WorldFlow3D on both real outdoor driving scenes and synthetic indoor scenes, validating cross-domain generalizability and high-quality generation on real data distributions. WorldFlow3D achieves favorable scene generation fidelity over existing approaches in all tested settings for unbounded scene generation. For more, see https://light.princeton.edu/worldflow3d.
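The flow-matching property the abstract builds on - a transport path between two data distributions, rather than only noise-to-data denoising - can be illustrated with a toy sketch. This is not the paper's implementation; the linear interpolant, the 2D stand-in distributions, and all names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_interpolant(x0, x1, t):
    """Point on the straight-line transport path at time t in [0, 1]."""
    return (1.0 - t) * x0 + t * x1

def target_velocity(x0, x1):
    """Ground-truth velocity of the straight-line path: d(x_t)/dt = x1 - x0."""
    return x1 - x0

# Toy stand-ins: a "simpler" intermediate distribution (source) and a more
# complex target distribution -- 2D Gaussians here, in place of 3D scene data.
x0 = rng.normal(loc=0.0, scale=1.0, size=(4, 2))  # source samples
x1 = rng.normal(loc=3.0, scale=0.5, size=(4, 2))  # target samples

t = 0.5
x_t = linear_interpolant(x0, x1, t)  # midpoint of the transport path
v_t = target_velocity(x0, x1)        # regression target for a learned field

# A learned velocity model v_theta(x_t, t) would be fit to v_t with an MSE
# loss; here we only check that the path connects the two distributions.
assert np.allclose(linear_interpolant(x0, x1, 0.0), x0)
assert np.allclose(linear_interpolant(x0, x1, 1.0), x1)
```

In this framing, denoising is just the special case where the source distribution is Gaussian noise; "flowing through" distributions replaces that source with previously generated structure.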