Spatial Atlas: Compute-Grounded Reasoning for Spatial-Aware Research Agent Benchmarks

arXiv cs.AI / 4/15/2026

Key Points

The paper introduces compute-grounded reasoning (CGR), a paradigm for spatial-aware research agents that resolves each sub-problem via deterministic computation before an LLM generates the final response.
Spatial Atlas implements CGR using a single Agent-to-Agent (A2A) server that supports two benchmarks: FieldWorkArena for multimodal spatial QA and MLE-Bench covering 75 Kaggle ML competitions requiring end-to-end engineering.
A structured spatial scene-graph engine extracts entities and relations from vision descriptions, deterministically computes distances and safety violations, and passes these computed facts to LLMs to reduce hallucinated spatial reasoning.
The system uses entropy-guided action selection for efficient information gain and routes queries across a three-tier frontier model stack (OpenAI + Anthropic).
It also includes a self-healing ML pipeline with strategy-aware code generation, an iterative refinement loop guided by scoring, and a prompt-based “leak audit” registry for reliability and interpretability.
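The scene-graph idea in the key points above can be illustrated with a minimal sketch. It assumes entity positions have already been estimated upstream, and every name in it (`Entity`, `compute_facts`, the `min_clearance` threshold, the person/forklift rule) is hypothetical rather than taken from the paper. The point is only that distances and rule checks are computed in ordinary code, and the language model receives the results as plain facts.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str        # hypothetical labels such as "person" or "forklift"
    position: tuple  # (x, y) in metres, assumed to come from an upstream step


def compute_facts(entities, min_clearance=2.0):
    """Return deterministic distance facts and clearance violations as strings."""
    facts = []
    for a, b in combinations(entities, 2):
        d = dist(a.position, b.position)
        facts.append(f"distance({a.name}, {b.name}) = {d:.2f} m")
        # Hypothetical safety rule: a person must keep min_clearance from a forklift.
        if {a.kind, b.kind} == {"person", "forklift"} and d < min_clearance:
            facts.append(
                f"violation: {a.name} and {b.name} closer than {min_clearance:.1f} m"
            )
    return facts


scene = [
    Entity("worker_1", "person", (0.0, 0.0)),
    Entity("forklift_1", "forklift", (1.2, 0.9)),
    Entity("pallet_1", "pallet", (4.0, 3.0)),
]
facts = compute_facts(scene)
# The computed facts would be placed in the prompt instead of asking the model to estimate them.
prompt_context = "\n".join(facts)
print(prompt_context)
```

In this sketch the model never has to estimate a distance itself; it only has to read and reason over the listed facts.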

Abstract

We introduce compute-grounded reasoning (CGR), a design paradigm for spatial-aware research agents in which every answerable sub-problem is resolved by deterministic computation before a language model is asked to generate. Spatial Atlas instantiates CGR as a single Agent-to-Agent (A2A) server that handles two challenging benchmarks: FieldWorkArena, a multimodal spatial question-answering benchmark spanning factory, warehouse, and retail environments, and MLE-Bench, a suite of 75 Kaggle machine learning competitions requiring end-to-end ML engineering. A structured spatial scene graph engine extracts entities and relations from vision descriptions, computes distances and safety violations deterministically, then feeds computed facts to large language models, thereby avoiding hallucinated spatial reasoning. Entropy-guided action selection maximizes information gain per step and routes queries across a three-tier frontier model stack (OpenAI + Anthropic). A self-healing ML pipeline with strategy-aware code generation, a score-driven iterative refinement loop, and a prompt-based leak audit registry round out the system. We evaluate across both benchmarks and show that CGR yields competitive accuracy while maintaining interpretability through structured intermediate representations and deterministic spatial computations.
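The abstract does not spell out how entropy-guided action selection is formulated. One common reading is expected information gain: keep a belief over candidate answers, and pick the action whose outcome is expected to reduce the entropy of that belief the most. The sketch below shows that generic technique under that assumption; the function names, the action names, and the likelihood tables are all hypothetical.

```python
from math import log2


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * log2(x) for x in p if x > 0)


def expected_info_gain(prior, likelihood):
    """Expected entropy reduction for one action.

    likelihood[o][h] is P(outcome o | hypothesis h) for that action.
    """
    gain = entropy(prior)
    for row in likelihood:
        p_o = sum(l * p for l, p in zip(row, prior))  # marginal P(outcome o)
        if p_o == 0:
            continue
        posterior = [l * p / p_o for l, p in zip(row, prior)]  # Bayes update
        gain -= p_o * entropy(posterior)
    return gain


def select_action(prior, actions):
    """Pick the action name with the highest expected information gain."""
    return max(actions, key=lambda name: expected_info_gain(prior, actions[name]))


prior = [0.5, 0.5]  # two equally likely candidate answers
actions = {
    # Hypothetical action whose outcome fully separates the two hypotheses.
    "measure_distance": [[1.0, 0.0], [0.0, 1.0]],
    # Hypothetical action whose outcome is independent of the hypothesis.
    "reread_caption": [[0.5, 0.5], [0.5, 0.5]],
}
best = select_action(prior, actions)
print(best)
```

Here the fully discriminating action has an expected gain of one bit and the uninformative one has zero, so the selector prefers the former.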
