MANSION: Multi-floor lANguage-to-3D Scene generatIOn for loNg-horizon tasks
arXiv cs.CV / 3/13/2026
Key Points
- The paper introduces MANSION, a language-driven framework that generates building-scale, multi-floor 3D environments for long-horizon robotic tasks.
- Released alongside the framework are MansionWorld, a dataset of over 1,000 diverse buildings (from hospitals to offices), and a Task-Semantic Scene Editing Agent that enables open-vocabulary customization.
- The framework accounts for vertical structural constraints to create realistic, navigable buildings suitable for cross-floor planning and evaluation.
- Benchmark results show that state-of-the-art agents degrade sharply in these settings, which positions MANSION as a critical testbed for next-generation spatial reasoning and planning.