SceneOrchestra: Efficient Agentic 3D Scene Synthesis via Full Tool-Call Trajectory Generation
arXiv cs.CV / 4/23/2026
Key Points
- The paper introduces SceneOrchestra, a trainable framework for agentic 3D scene synthesis that improves upon LLM-orchestrated tool workflows that rely on step-by-step execute–review–reflect loops.
- It identifies two main shortcomings in prior methods: heuristic-driven next-tool/parameter decisions that can waste calls and reduce quality, and added latency from rendering and reviewing intermediate outputs after every step.
- SceneOrchestra optimizes the entire tool-call execution flow by generating complete tool-call trajectories in one shot and using a discriminator to evaluate full trajectories and choose the best candidate.
- Training proceeds in two phases: trajectory learning together with training the discriminator to judge trajectory quality, followed by interleaved adaptation/distillation. At inference, only the orchestrator runs, generating and executing full trajectories.
- Experiments report state-of-the-art 3D scene quality with lower runtime than previous methods.
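
To make the "full trajectory plus discriminator" idea in the key points more concrete, here is a minimal Python sketch. It is not the paper's implementation: every name in it (`ToolCall`, `propose_trajectory`, `score_trajectory`, `select_best`, `execute`, and the two toy tools) is hypothetical, the "orchestrator" is a random proposer, and the "discriminator" is a hand-written scoring rule. It only illustrates the control flow: whole trajectories are proposed up front, scored as complete units, and the chosen one is executed without rendering or reviewing intermediate results. Since the summary says inference runs only the orchestrator, the selection step should be read as an illustration of the discriminator's role rather than of the deployed pipeline.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class ToolCall:
    """One step of a trajectory: a tool name plus its parameters (hypothetical schema)."""
    tool: str
    params: Dict[str, object] = field(default_factory=dict)

Trajectory = List[ToolCall]
Scene = Dict[str, list]

def propose_trajectory(prompt: str, rng: random.Random) -> Trajectory:
    """Toy stand-in for the orchestrator: emits a complete trajectory in one shot."""
    calls = [ToolCall("create_room", {"size": rng.choice([3, 4, 5])})]
    for _ in range(rng.randint(1, 4)):
        name = rng.choice(["chair", "table", "lamp"])
        calls.append(ToolCall("place_object", {"name": name}))
    return calls

def score_trajectory(prompt: str, trajectory: Trajectory) -> float:
    """Toy stand-in for the discriminator: rewards objects mentioned in the
    prompt and slightly penalizes every tool call (assumed scoring rule)."""
    hits = sum(
        1 for c in trajectory
        if c.tool == "place_object" and str(c.params["name"]) in prompt
    )
    return hits - 0.1 * len(trajectory)

def select_best(prompt: str, candidates: List[Trajectory],
                score: Callable[[str, Trajectory], float]) -> Trajectory:
    """Score each complete candidate and return the highest-scoring one."""
    return max(candidates, key=lambda t: score(prompt, t))

# Toy tools that mutate a dictionary-based scene in place.
def create_room(scene: Scene, size: int) -> None:
    scene["rooms"].append(size)

def place_object(scene: Scene, name: str) -> None:
    scene["objects"].append(name)

TOOLS: Dict[str, Callable[..., None]] = {
    "create_room": create_room,
    "place_object": place_object,
}

def execute(trajectory: Trajectory, tools: Dict[str, Callable[..., None]]) -> Scene:
    """Run every call in order, with no intermediate render or review step."""
    scene: Scene = {"rooms": [], "objects": []}
    for call in trajectory:
        tools[call.tool](scene, **call.params)
    return scene

def synthesize(prompt: str, num_candidates: int = 4, seed: int = 0) -> Scene:
    """Propose several full trajectories, pick one by score, then execute it."""
    rng = random.Random(seed)
    candidates = [propose_trajectory(prompt, rng) for _ in range(num_candidates)]
    return execute(select_best(prompt, candidates, score_trajectory), TOOLS)
```

Calling `synthesize("a room with a chair and a lamp")` returns a scene dictionary with one room and between one and four placed objects. In a step-by-step execute–review–reflect loop, the scoring would instead happen after each individual tool call, which is the per-step latency the key points describe.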