Benchmarking Efficient & Effective Camera Pose Estimation Strategies for Novel View Synthesis

arXiv cs.CV / 3/24/2026


Key Points

  • This paper aims to benchmark SfM-based methods for estimating the camera poses required by novel view synthesis approaches such as NeRF and 3DGS.
  • It frames the core trade-off: classical SfM (feature matching plus bundle adjustment) is highly accurate but computationally expensive, while methods that drop the optimization and instead regress poses with neural networks are fast but tend to be markedly less accurate.
  • The benchmark shows that (1) simply reducing the number of features used already speeds up classical SfM substantially while maintaining high pose accuracy.
  • It further reports that (2) obtaining initial estimates from a feed-forward (transformer-based) network and then refining them with classical SfM yields the best efficiency-effectiveness trade-off.
  • By releasing the benchmark and code publicly, the authors aim to stimulate research on SfM designs that are both efficient and effective.
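
The first strategy above amounts to capping the number of keypoints kept per image before matching. As a minimal illustration only (the function name, data layout, and scores here are hypothetical, not from the paper's code), subsampling by detector response might look like:

```python
def subsample_features(keypoints, scores, max_features):
    """Keep only the highest-scoring detections (hypothetical helper).

    keypoints: list of (x, y) pixel coordinates
    scores:    detector response per keypoint (higher = stronger)
    """
    # Rank keypoint indices by score, strongest first
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = order[:max_features]
    return [keypoints[i] for i in keep]

# Toy data: four detections with made-up response scores
kps = [(10, 20), (30, 5), (7, 7), (50, 60)]
scores = [0.9, 0.2, 0.7, 0.4]
print(subsample_features(kps, scores, 2))  # → [(10, 20), (7, 7)]
```

Real SfM pipelines such as COLMAP expose an equivalent cap as a configuration parameter; the point of the paper's result is that tightening such a cap cuts matching and bundle-adjustment cost roughly quadratically while pose accuracy degrades little.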

Abstract

Novel view synthesis (NVS) approaches such as NeRFs or 3DGS can produce photo-realistic 3D scene representations from a set of images with known extrinsic and intrinsic parameters. The necessary camera poses and calibrations are typically obtained from the images via Structure-from-Motion (SfM). Classical SfM approaches rely on local feature matches between the images to estimate both the poses and a sparse 3D model of the scene, using bundle adjustment to refine initial pose, intrinsics, and geometry estimates. In order to increase run-time efficiency, recent SfM systems forgo optimization via bundle adjustment. Instead, they train feed-forward (transformer-based) neural networks to directly regress camera parameters and the 3D structure. While orders of magnitude more efficient, such recent works produce significantly less accurate estimates. To stimulate research on developing SfM approaches that are both efficient *and* effective, this paper develops a benchmark focused on SfM for novel view synthesis. Using existing datasets and two simple strategies for making the reconstruction process more efficient, we show that: (1) simply using fewer features already significantly accelerates classical SfM methods while maintaining high pose accuracy; and (2) using feed-forward networks to obtain initial estimates and refining them using classical SfM techniques leads to the best efficiency-effectiveness trade-off. We will make our benchmark and code publicly available.
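
The second strategy, a fast-but-coarse initialization followed by classical optimization-based refinement, can be illustrated on a toy 1D problem. The example below is a sketch under stated assumptions, not the paper's method: the "network prediction" is just a hard-coded coarse rotation angle, projection is a 2D rotation standing in for a full camera model, and plain finite-difference gradient descent stands in for bundle adjustment.

```python
import math

def project(theta, pts):
    # Rotate 2D points by theta (stand-in for a full camera projection).
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in pts]

def residual_sq(theta, pts, obs):
    # Sum of squared reprojection errors against the observed points.
    return sum((px - ox) ** 2 + (py - oy) ** 2
               for (px, py), (ox, oy) in zip(project(theta, pts), obs))

def refine(theta0, pts, obs, iters=50, lr=0.1, eps=1e-6):
    # Gradient descent on the reprojection error (stand-in for
    # bundle adjustment), started from the feed-forward estimate.
    theta = theta0
    for _ in range(iters):
        g = (residual_sq(theta + eps, pts, obs)
             - residual_sq(theta - eps, pts, obs)) / (2 * eps)
        theta -= lr * g
    return theta

pts = [(1.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
true_theta = 0.5
obs = project(true_theta, pts)     # "observed" projections
coarse = 0.3                       # hypothetical network prediction (coarse)
refined = refine(coarse, pts, obs)
print(abs(refined - true_theta))   # small: refinement recovers the pose
```

The design point mirrored here is that the coarse estimate only needs to land in the basin of convergence; the classical optimizer then supplies the accuracy, so most of the expensive search is skipped.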