MultiLoc: Multi-view Guided Relative Pose Regression for Fast and Robust Visual Re-Localization

arXiv cs.CV / 3/31/2026


Key Points

  • MultiLoc introduces a multi-view guided relative pose regression (RPR) approach that jointly fuses multiple reference views and their camera poses in a single forward pass for fast, zero-shot visual re-localization.
  • The method improves robustness by using globally consistent spatial and geometric understanding rather than relying on limited pairwise/local views.
  • MultiLoc adds a co-visibility-driven retrieval strategy to select geometrically relevant reference views, supplying more informative context for pose estimation.
  • On WaySpots, Cambridge Landmarks, and Indoor6, MultiLoc consistently outperforms existing state-of-the-art (SOTA) RPR methods at visual re-localization. On MegaDepth-1500, ScanNet-1500, and ACID, its pose regressor also reaches SOTA relative pose estimation, surpassing RPR, feature-matching, and other non-regression baselines.
  • The authors describe these results as setting a new benchmark (that is, a new performance bar) for visual re-localization, and they state that the code will be made publicly available.
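
The co-visibility-driven retrieval mentioned above can be illustrated with a toy sketch. The summary does not say how MultiLoc scores co-visibility, so everything here is an assumption: co-visibility is approximated as the Jaccard overlap between the sets of 3D point IDs seen by two database views, the seed view is taken to come from some earlier image-retrieval step, and the names `covisibility`, `select_references`, and `observations` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def covisibility(a: Set[int], b: Set[int]) -> float:
    """Jaccard overlap between two sets of observed 3D point IDs (assumed metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_references(
    seed_id: str,
    observations: Dict[str, Set[int]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k database views most co-visible with the seed view."""
    seed = observations[seed_id]
    scored = [
        (view_id, covisibility(seed, points))
        for view_id, points in observations.items()
        if view_id != seed_id
    ]
    # Highest score first; ties are broken by view ID for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


# Hypothetical database: view ID -> IDs of the 3D points that view observes.
observations = {
    "ref_a": {1, 2, 3, 4},
    "ref_b": {3, 4, 5, 6},
    "ref_c": {1, 2, 3, 9},
    "ref_d": {20, 21},
}
top = select_references("ref_a", observations, k=2)
print(top)
```

In this toy database, `ref_c` shares three points with the seed and `ref_b` shares two, so they are selected in that order, while `ref_d` shares none and is left out. A real system would need its own source of co-visibility information, for example a structure-from-motion reconstruction of the reference images.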

Abstract

Relative Pose Regression (RPR) generalizes well to unseen environments, but its performance is often limited due to pairwise and local spatial views. To this end, we propose MultiLoc, a novel multi-view guided RPR model trained at scale, equipping relative pose regression with globally consistent spatial and geometric understanding. Specifically, our method jointly fuses multiple reference views and their associated camera poses in a single forward pass, enabling accurate zero-shot pose estimation with real-time efficiency. To reliably supply informative context, we further propose a co-visibility-driven retrieval strategy for geometrically relevant reference view selection. MultiLoc establishes a new benchmark in visual re-localization, consistently outperforming existing state-of-the-art (SOTA) relative pose regression (RPR) methods across diverse datasets, including WaySpots, Cambridge Landmarks, and Indoor6. Furthermore, MultiLoc's pose regressor exhibits SOTA performance in relative pose estimation, surpassing RPR, feature matching and non-regression-based techniques on the MegaDepth-1500, ScanNet-1500, and ACID benchmarks. These results demonstrate robust domain generalization of MultiLoc across indoor, outdoor and natural environments. Code will be made publicly available.
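
For readers new to relative pose regression, the step that turns a predicted relative pose into an absolute query pose can be sketched as below. This is the generic single-reference case, not MultiLoc's multi-view fusion, whose details the abstract does not give. The convention is an assumption: poses are 4x4 homogeneous camera-to-world transforms, so `world_from_query = world_from_ref @ ref_from_query`. The helper `compose` and all numeric values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 homogeneous transforms (a applied after b)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


# Known reference pose: camera rotated 90 degrees about z, located at (1, 0, 0).
world_from_ref: Matrix = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Relative pose a regressor might predict: the query camera sits 2 units along
# the reference camera's x-axis with the same orientation.
ref_from_query: Matrix = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

world_from_query = compose(world_from_ref, ref_from_query)
query_position = [row[3] for row in world_from_query[:3]]
print(query_position)
```

With these values the query camera lands at (1, 2, 0) in world coordinates. An error in the single predicted relative pose carries straight into the absolute pose, which is one way to read the abstract's point that pairwise, local views limit RPR.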