A Survey of Spatial Memory Representations for Efficient Robot Navigation

arXiv cs.CV / 4/21/2026

Key Points

  • The paper argues that vision-based robots in large environments can face unbounded growth of spatial memory, exhausting resources on embedded platforms where hardware upgrades aren’t feasible.
  • It surveys 88 references across 52 robot systems (1989–2025) and introduces the metric α (peak runtime memory divided by saved persistent map size) to expose the gap between reported map sizes and real deployment memory costs.
  • Independent profiling on an NVIDIA A100 shows α varies by two orders of magnitude even within neural methods, indicating that memory architecture—not simply the method label—drives whether deployment is practical.
  • The authors propose a standardized evaluation protocol (memory growth rate, query latency, memory-completeness curves, throughput degradation) because existing benchmarks don’t reflect real deployment constraints.
  • A Pareto/frontier analysis finds no single paradigm dominates across regimes, and the paper provides independently measured α reference values plus an α-aware budgeting algorithm for pre-deployment feasibility assessment.

Abstract

As vision-based robots navigate larger environments, their spatial memory grows without bound, eventually exhausting computational resources, particularly on embedded platforms (8-16 GB shared memory, <30 W) where adding hardware is not an option. This survey examines the spatial memory efficiency problem across 88 references spanning 52 systems (1989-2025), from occupancy grids to neural implicit representations. We introduce the metric α = M_peak / M_map, the ratio of peak runtime memory (the total RAM or GPU memory consumed during operation) to saved map size (the persistent checkpoint written to disk), exposing the gap between published map sizes and actual deployment cost. Independent profiling on an NVIDIA A100 GPU reveals that α spans two orders of magnitude within neural methods alone, ranging from 2.3 (Point-SLAM) to 215 (NICE-SLAM, whose 47 MB map requires 10 GB at runtime), showing that memory architecture, not paradigm label, determines deployment feasibility. We propose a standardized evaluation protocol comprising memory growth rate, query latency, memory-completeness curves, and throughput degradation, none of which current benchmarks capture. Through a Pareto frontier analysis with explicit benchmark separation, we show that no single paradigm dominates within its evaluation regime: 3DGS methods achieve the best absolute accuracy at 90-254 MB map size on Replica, while scene graphs provide semantic abstraction at predictable cost. We provide the first independently measured α reference values and an α-aware budgeting algorithm enabling practitioners to assess deployment feasibility on target hardware prior to implementation.
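
To make the α definition concrete, the following is a minimal Python sketch of the ratio and of one possible pre-deployment check built on it. The abstract does not describe the paper's budgeting algorithm, so `fits_budget`, its `headroom` parameter, and the 8 GB device are illustrative assumptions, not the authors' method; only the NICE-SLAM and Point-SLAM figures come from the abstract.

```python
def alpha(peak_runtime_mb: float, saved_map_mb: float) -> float:
    """Ratio of peak runtime memory to saved map size (both in MB)."""
    if saved_map_mb <= 0:
        raise ValueError("saved map size must be positive")
    return peak_runtime_mb / saved_map_mb


def fits_budget(expected_map_mb: float, alpha_ref: float,
                budget_mb: float, headroom: float = 0.8) -> bool:
    """Hypothetical feasibility check (an assumption, not the paper's algorithm).

    Predicts peak memory as alpha_ref * expected_map_mb and compares it
    against a fraction (headroom) of the device memory budget.
    """
    predicted_peak_mb = alpha_ref * expected_map_mb
    return predicted_peak_mb <= headroom * budget_mb


# NICE-SLAM figures from the abstract: 47 MB map, about 10 GB at runtime.
# Taking 10 GB as 10,000 MB gives a ratio close to the reported 215.
nice_slam_alpha = alpha(10_000, 47)

# Assumed 8 GB embedded device; the 47 MB map size is reused for both
# reference values purely for illustration.
device_budget_mb = 8 * 1024
print(f"alpha (NICE-SLAM figures): {nice_slam_alpha:.1f}")
print("alpha = 215 fits:", fits_budget(47, 215, device_budget_mb))
print("alpha = 2.3 fits:", fits_budget(47, 2.3, device_budget_mb))
```

Under these assumptions the same 47 MB map is infeasible at α = 215 and comfortably feasible at α = 2.3, which is the gap between saved map size and deployment cost that the metric is meant to expose.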