SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

arXiv cs.CL / 4/16/2026

Key Points

The SpatialEvo paper addresses a key bottleneck in 3D spatial reasoning: expensive geometric annotation and the tendency for self-evolving training to reinforce a model’s existing geometric errors via pseudo-label consensus.
It introduces Deterministic Geometric Environments (DGE), where ground-truth answers are computed exactly from point clouds and camera poses without any model involvement, providing objective physical feedback.
SpatialEvo defines 16 spatial reasoning task categories with explicit geometric validation rules, converting unannotated 3D scenes into zero-noise interactive oracles for training.
The framework uses a single shared-parameter policy that co-evolves across “questioner” and “solver” roles, with questions generated from scene observations and answers verified against DGE-derived ground truth.
Experiments on nine benchmarks report the best average scores at 3B and 7B parameter scales, improving spatial reasoning while maintaining general visual understanding performance.
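To make the idea of model-free ground truth concrete, here is a minimal sketch of how two spatial answers could be computed directly from point clouds and a camera pose. It is not the paper's DGE. The function names, the centroid-based rules, the pose convention (x_cam = R x_world + t, with camera x pointing right), and the two tasks shown are all assumptions chosen for illustration.

```python
import math

# Hypothetical sketch: the helpers and rules below are illustrative
# assumptions, not the validation rules defined in the SpatialEvo paper.

def centroid(points):
    """Return the mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def world_to_camera(point, rotation, translation):
    """Apply x_cam = R @ x_world + t, with R given as three rows."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )

def object_distance(points_a, points_b):
    """Euclidean distance between the centroids of two object point clouds."""
    return math.dist(centroid(points_a), centroid(points_b))

def left_or_right(points_a, points_b, rotation, translation):
    """Say whether object A appears left or right of object B for this camera.

    Assumes the camera x axis points to the right of the image.
    """
    xa = world_to_camera(centroid(points_a), rotation, translation)[0]
    xb = world_to_camera(centroid(points_b), rotation, translation)[0]
    if xa == xb:
        return "ambiguous"  # a real validator would likely reject this question
    return "left" if xa < xb else "right"

def verify(answer, ground_truth):
    """Compare a solver's text answer with the computed ground truth."""
    return answer.strip().lower() == ground_truth

# Toy scene: two tiny "objects" in world coordinates.
chair = [(0, 0, 2), (0, 0, 4)]
table = [(2, 0, 3), (4, 0, 3)]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
turned_around = [[-1, 0, 0], [0, 1, 0], [0, 0, -1]]  # 180 degrees about y
origin = (0, 0, 0)

truth = left_or_right(chair, table, identity, origin)
is_correct = verify(" Left ", truth)
```

The same pair of objects yields a different answer under `turned_around`, which is why the camera pose has to be part of the computation. No model output enters the ground truth at any point, so a wrong answer cannot be reinforced by agreement among samples.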

Abstract

Spatial reasoning over three-dimensional scenes is a core capability for embodied intelligence, yet continuous model improvement remains bottlenecked by the cost of geometric annotation. The self-evolving paradigm offers a promising path, but its reliance on model consensus to construct pseudo-labels causes training to reinforce rather than correct the model's own geometric errors. We identify a property unique to 3D spatial reasoning that circumvents this limitation: ground truth is a deterministic consequence of the underlying geometry, computable exactly from point clouds and camera poses without any model involvement. Building on this insight, we present SpatialEvo, a self-evolving framework for 3D spatial reasoning, centered on the Deterministic Geometric Environment (DGE). The DGE formalizes 16 spatial reasoning task categories under explicit geometric validation rules and converts unannotated 3D scenes into zero-noise interactive oracles, replacing model consensus with objective physical feedback. A single shared-parameter policy co-evolves across questioner and solver roles under DGE constraints: the questioner generates physically valid spatial questions grounded in scene observations, while the solver derives precise answers against DGE-verified ground truth. A task-adaptive scheduler endogenously concentrates training on the model's weakest categories, producing a dynamic curriculum without manual design. Experiments across nine benchmarks demonstrate that SpatialEvo achieves the highest average score at both 3B and 7B scales, with consistent gains on spatial reasoning benchmarks and no degradation on general visual understanding.
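The abstract does not say how the task-adaptive scheduler is implemented. One simple way to concentrate training on weak categories is to sample each category in proportion to its current error rate, as sketched below. The weighting rule, the `floor` and `momentum` parameters, and the category names are assumptions made for this example and are not taken from the paper.

```python
import random

# Hypothetical sketch of an accuracy-driven curriculum. The weighting rule
# and all parameter values are illustrative assumptions.

def category_weights(accuracy, floor=0.05):
    """Turn per-category accuracy into sampling probabilities.

    Lower accuracy gives a higher probability. The floor keeps categories
    that are already mastered from disappearing from training entirely.
    """
    raw = {name: max(1.0 - acc, floor) for name, acc in accuracy.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}

def sample_category(accuracy, rng, floor=0.05):
    """Draw the next task category to train on."""
    weights = category_weights(accuracy, floor)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

def update_accuracy(accuracy, name, correct, momentum=0.9):
    """Return a new accuracy table with an exponential moving average update."""
    updated = dict(accuracy)
    outcome = 1.0 if correct else 0.0
    updated[name] = momentum * accuracy[name] + (1.0 - momentum) * outcome
    return updated

# Made-up category names and starting accuracies.
accuracy = {"distance": 0.9, "direction": 0.5, "occlusion": 0.2}
weights = category_weights(accuracy)

rng = random.Random(0)
next_category = sample_category(accuracy, rng)
accuracy_after = update_accuracy(accuracy, "occlusion", correct=True)
```

Because the weights are recomputed from verified outcomes after every update, the sampling distribution shifts as the solver improves, which gives a curriculum without a hand-written schedule.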
