SpatialGrammar: A Domain-Specific Language for LLM-Based 3D Indoor Scene Generation
arXiv cs.AI / 5/1/2026
Key Points
- The paper introduces SpatialGrammar, a domain-specific language designed to let LLM-based systems generate interactive 3D indoor scenes from natural language while reducing spatial errors and object collisions.
- SpatialGrammar encodes gravity-aligned layouts as object placements on a bird's-eye-view (BEV) grid and compiles them deterministically into valid 3D geometry, which makes constraint checking verifiable.
- The authors propose SG-Agent, a closed-loop framework that iteratively refines generated scenes using compiler feedback to enforce collision constraints and improve physical plausibility.
- They also present SG-Mini, a 104M-parameter model trained solely on compiler-validated synthetic data, which performs competitively on single-shot indoor scene generation.
- Experiments on 159 test scenes across five complexity scenarios show that SG-Agent improves spatial fidelity and physical plausibility over prior approaches, while SG-Mini matches larger LLM baselines in relevant settings.
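
To make the compile-and-check idea above concrete, here is a minimal Python sketch. It is not SpatialGrammar's actual syntax or the authors' SG-Agent implementation: the `Placement` fields, the 0.5 m cell size, the axis-aligned box test, and the `shift_right` repair rule (a stand-in for an LLM revision step) are all assumptions made for illustration. It shows only the general pattern: grid placements are compiled deterministically into floor-resting 3D boxes, collisions are reported, and the scene is revised until the check passes or a round limit is hit.

```python
from dataclasses import dataclass, replace
from itertools import combinations

CELL = 0.5  # assumed grid cell size in metres (hypothetical value)

@dataclass(frozen=True)
class Placement:
    """One object on the BEV grid (hypothetical schema, not the paper's)."""
    name: str
    col: int       # grid column of the footprint's minimum corner
    row: int       # grid row of the footprint's minimum corner
    width: int     # footprint size in cells along x
    depth: int     # footprint size in cells along y
    height: float  # object height in metres

@dataclass(frozen=True)
class Box:
    name: str
    min_xyz: tuple
    max_xyz: tuple

def compile_placement(p):
    """Deterministically map a grid placement to a box resting on the floor."""
    x0, y0 = p.col * CELL, p.row * CELL
    return Box(p.name, (x0, y0, 0.0),
               (x0 + p.width * CELL, y0 + p.depth * CELL, p.height))

def overlaps(a, b):
    """Strict axis-aligned overlap test; touching faces do not count."""
    return all(a.min_xyz[i] < b.max_xyz[i] and b.min_xyz[i] < a.max_xyz[i]
               for i in range(3))

def check_scene(placements):
    """Compile every placement and return the names of colliding pairs."""
    boxes = [compile_placement(p) for p in placements]
    return [(a.name, b.name) for a, b in combinations(boxes, 2) if overlaps(a, b)]

def refine(placements, propose_fix, max_rounds=10):
    """Closed loop: check, hand the errors to a fixer, repeat until clean."""
    for _ in range(max_rounds):
        errors = check_scene(placements)
        if not errors:
            return placements
        placements = propose_fix(placements, errors)
    return placements  # may still contain collisions after max_rounds

def shift_right(placements, errors):
    """Stand-in for an LLM revision: move the second object of the first
    reported collision one column to the right."""
    _, victim = errors[0]
    return [replace(p, col=p.col + 1) if p.name == victim else p
            for p in placements]

scene = [
    Placement("bed", col=0, row=0, width=4, depth=3, height=0.6),
    Placement("nightstand", col=3, row=0, width=1, depth=1, height=0.5),
]
fixed = refine(scene, shift_right)
```

In this toy scene the nightstand starts inside the bed's footprint, so `check_scene(scene)` reports one colliding pair; `refine` then shifts the nightstand until the check returns an empty list. Because the check is deterministic, the same placements always yield the same errors, which is what makes this kind of feedback usable as a training filter or a refinement signal.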