PhyMix: Towards Physically Consistent Single-Image 3D Indoor Scene Generation with Implicit-Explicit Optimization
arXiv cs.CV / 4/14/2026
Key Points
- The paper highlights a key limitation of current single-image 3D indoor scene generators: they often look realistic but violate real-world physics, reducing usefulness for robotics and embodied AI.
- It introduces a unified Physics Evaluator covering four dimensions (geometric priors, contact, stability, deployability), split into nine sub-constraints, along with what the authors describe as the first benchmark for measuring physical consistency.
- The authors find that leading methods are largely physics-unaware, motivating a new approach that explicitly incorporates physical feedback into generation.
- PhyMix is a two-part framework: implicit, preference-driven optimization with Scene-GRPO, plus explicit test-time refinement by a plug-and-play Test-Time Optimizer (TTO) that uses differentiable signals from the evaluator.
- Experiments on synthetic benchmarks and qualitative tests on stylized and real-world images show improved results in both visual fidelity and physical plausibility, and the authors plan to release code and models after publication.
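
To make the idea of test-time refinement with differentiable evaluator signals concrete, here is a minimal toy sketch in Python. It is not the paper's TTO or Physics Evaluator: the scene representation (circular footprints with a height `z` above the floor), the two penalty terms, and the names `physics_penalty`, `penalty_grad`, and `refine` are all assumptions made for illustration. It only shows the general pattern of nudging object placements by gradient descent on a differentiable physics penalty.

```python
import math

# Hypothetical toy scene: each object is a dict with a circular footprint
# (center x, y and radius r) and z, the height of its bottom above the floor.

def physics_penalty(objs):
    """Toy penalty: squared floor gap plus squared pairwise footprint overlap."""
    total = sum(o["z"] ** 2 for o in objs)  # floating or sunk objects
    for i in range(len(objs)):
        for j in range(i + 1, len(objs)):
            a, b = objs[i], objs[j]
            d = math.hypot(a["x"] - b["x"], a["y"] - b["y"])
            total += max(0.0, a["r"] + b["r"] - d) ** 2  # interpenetration
    return total

def penalty_grad(objs):
    """Hand-derived gradient of physics_penalty w.r.t. x, y, z of each object."""
    grads = [{"x": 0.0, "y": 0.0, "z": 2.0 * o["z"]} for o in objs]
    for i in range(len(objs)):
        for j in range(i + 1, len(objs)):
            a, b = objs[i], objs[j]
            dx, dy = a["x"] - b["x"], a["y"] - b["y"]
            d = math.hypot(dx, dy)
            overlap = a["r"] + b["r"] - d
            if overlap <= 0.0:
                continue
            # Unit direction from b to a; pick an arbitrary one if coincident.
            ux, uy = (dx / d, dy / d) if d > 1e-9 else (1.0, 0.0)
            g = 2.0 * overlap
            grads[i]["x"] -= g * ux
            grads[i]["y"] -= g * uy
            grads[j]["x"] += g * ux
            grads[j]["y"] += g * uy
    return grads

def refine(objs, steps=200, lr=0.1):
    """Gradient descent on the toy penalty; returns a refined copy of the scene."""
    scene = [dict(o) for o in objs]  # leave the input untouched
    for _ in range(steps):
        for o, g in zip(scene, penalty_grad(scene)):
            for key in ("x", "y", "z"):
                o[key] -= lr * g[key]
    return scene

if __name__ == "__main__":
    scene = [
        {"x": 0.0, "y": 0.0, "z": 0.3, "r": 0.5},  # floating above the floor
        {"x": 0.6, "y": 0.0, "z": 0.0, "r": 0.5},  # overlaps the first object
    ]
    print("penalty before:", physics_penalty(scene))
    print("penalty after: ", physics_penalty(refine(scene)))
```

Because the refinement only needs a penalty and its gradient, the same loop could in principle sit on top of any generator's output, which is one way to read the "plug-and-play" description above.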



