120 Minutes and a Laptop: Minimalist Image-goal Navigation via Unsupervised Exploration and Offline RL

arXiv cs.RO · March 30, 2026


Key Points

  • The paper proposes MINav, an approach for image-goal visual navigation that avoids reliance on large pretraining datasets and heavy compute by learning from data collected autonomously in the target environment.
  • MINav frames navigation as offline goal-conditioned reinforcement learning, using unsupervised exploration to gather experience and hindsight goal relabeling to improve learning from collected trajectories.
  • The authors report that a dataset can be collected, a policy trained on it, and the result deployed for real-world navigation in under 120 minutes, using only a consumer laptop and without human intervention.
  • Experiments in both simulation and real environments indicate improved exploration efficiency, better performance than zero-shot navigation baselines, and favorable scaling with dataset size.
  • Overall, the work aims to reduce barriers to rapid robotic policy prototyping by demonstrating a fast, compute-light pipeline for real-world deployment.

Abstract

The prevailing paradigm for image-goal visual navigation often assumes access to large-scale datasets, substantial pretraining, and significant computational resources. In this work, we challenge this assumption. We show that we can collect a dataset, train an in-domain policy, and deploy it to the real world (1) in less than 120 minutes, (2) on a consumer laptop, (3) without any human intervention. Our method, MINav, formulates image-goal navigation as an offline goal-conditioned reinforcement learning problem, combining unsupervised data collection with hindsight goal relabeling and offline policy learning. Experiments in simulation and the real world show that MINav improves exploration efficiency, outperforms zero-shot navigation baselines in target environments, and scales favorably with dataset size. These results suggest that effective real-world robotic learning can be achieved with high computational efficiency, lowering the barrier to rapid policy prototyping and deployment.
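The hindsight goal relabeling step at the core of the pipeline can be sketched as follows. This is a minimal illustration of the general technique (relabeling each transition with goals sampled from states actually reached later in the same trajectory, so that every trajectory yields successful goal-reaching examples), not the paper's implementation; the function name, tuple layout, and sparse-reward convention are all assumptions for illustration.

```python
import random

def relabel_trajectory(trajectory, num_relabels=4, rng=random):
    """Hindsight goal relabeling for offline goal-conditioned RL.

    `trajectory` is a list of (obs, action, next_obs) transitions.
    Each transition is duplicated `num_relabels` times, each copy paired
    with a goal drawn from a state reached at the same or a later step,
    turning unsupervised exploration data into goal-reaching supervision.
    """
    relabeled = []
    for t, (obs, action, next_obs) in enumerate(trajectory):
        for _ in range(num_relabels):
            # Sample a future step; its resulting state becomes the goal.
            future = rng.randrange(t, len(trajectory))
            goal = trajectory[future][2]
            # Sparse reward: 1 only when this transition reaches the goal.
            reached = next_obs == goal
            reward = 1.0 if reached else 0.0
            relabeled.append((obs, goal, action, reward, next_obs, reached))
    return relabeled
```

The relabeled tuples can then feed any offline goal-conditioned policy learner; because goals are guaranteed to be reachable (they were reached), the sparse reward is informative even without any human-provided labels.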