UniQueR: Unified Query-based Feedforward 3D Reconstruction

arXiv cs.CV / 3/25/2026

Key Points

  • UniQueR proposes a unified query-based feedforward framework for efficient, accurate 3D reconstruction from unposed images by treating reconstruction as sparse 3D query inference rather than dense per-pixel prediction.
  • The method learns a compact set of 3D anchor points as explicit geometric queries, allowing it to infer scene geometry in occluded regions within a single forward pass.
  • UniQueR encodes spatial and appearance priors in global 3D space (not per-frame camera space) and generates 3D Gaussians for differentiable rendering.
  • A decoupled cross-attention design and unified query interactions across multi-view features reduce memory and computational cost while improving geometric expressiveness.
  • Experiments on Mip-NeRF 360 and VR-NeRF report that UniQueR surpasses state-of-the-art feedforward methods in both rendering quality and geometric accuracy, while using an order of magnitude fewer primitives than dense alternatives.
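
To make the primitive-count point above concrete, here is a back-of-the-envelope sketch of how a pixel-aligned model and a query-based model scale. Every number below (view count, resolution, query count, Gaussians per query) is a hypothetical placeholder chosen for illustration, not a figure from the paper, and the two function names are likewise invented for this example.

```python
def pixel_aligned_primitive_count(num_views, height, width):
    """Pixel-aligned prediction emits one Gaussian per pixel per input view."""
    return num_views * height * width


def query_based_primitive_count(num_queries, gaussians_per_query):
    """Query-based prediction emits a fixed number of Gaussians per 3D query."""
    return num_queries * gaussians_per_query


# Hypothetical settings, NOT taken from the paper.
dense = pixel_aligned_primitive_count(num_views=8, height=256, width=256)
sparse = query_based_primitive_count(num_queries=4096, gaussians_per_query=8)
ratio = dense / sparse

print(f"pixel-aligned: {dense} primitives")
print(f"query-based:   {sparse} primitives")
print(f"reduction:     {ratio:.0f}x")
```

The dense count grows with every added view and with image resolution, whereas the query-based count depends only on the query budget, which is why a sparse formulation can end up with far fewer primitives.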

Abstract

We present UniQueR, a unified query-based feedforward framework for efficient and accurate 3D reconstruction from unposed images. Existing feedforward models such as DUSt3R, VGGT, and AnySplat typically predict per-pixel point maps or pixel-aligned Gaussians, which remain fundamentally 2.5D and limited to visible surfaces. In contrast, UniQueR formulates reconstruction as a sparse 3D query inference problem. Our model learns a compact set of 3D anchor points that act as explicit geometric queries, enabling the network to infer scene structure, including geometry in occluded regions, in a single forward pass. Each query encodes spatial and appearance priors directly in global 3D space (instead of per-frame camera space) and spawns a set of 3D Gaussians for differentiable rendering. By leveraging unified query interactions across multi-view features and a decoupled cross-attention design, UniQueR achieves strong geometric expressiveness while substantially reducing memory and computational cost. Experiments on Mip-NeRF 360 and VR-NeRF demonstrate that UniQueR surpasses state-of-the-art feedforward methods in both rendering quality and geometric accuracy, using an order of magnitude fewer primitives than dense alternatives.
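
The query mechanism described in the abstract can be illustrated with a minimal sketch, under loud assumptions: the abstract does not specify the architecture, so the code below uses plain single-head cross-attention (not the paper's decoupled design), toy 4-dimensional features, and fixed offsets standing in for a learned prediction head. The names `Gaussian3D`, `cross_attend`, and `spawn_gaussians` are hypothetical and do not come from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian3D:
    center: tuple   # (x, y, z) in the global scene frame, not a camera frame
    scale: float    # isotropic scale, a simplification for this sketch
    opacity: float


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, tokens):
    """Plain single-head cross-attention of one query over all image tokens."""
    dim = len(query)
    scores = [
        sum(q * t for q, t in zip(query, token)) / math.sqrt(dim)
        for token in tokens
    ]
    weights = softmax(scores)
    # The output is the attention-weighted average of the token features.
    return [sum(w * token[i] for w, token in zip(weights, tokens)) for i in range(dim)]


def spawn_gaussians(anchor, offsets, scale=0.05, opacity=0.9):
    """Place one Gaussian at anchor + offset for every offset."""
    return [
        Gaussian3D(tuple(a + o for a, o in zip(anchor, offset)), scale, opacity)
        for offset in offsets
    ]


# Hypothetical toy inputs: tokens from two views, pooled into one list so
# that a single query can gather evidence from every view.
view_a_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
view_b_tokens = [[0.0, 0.0, 1.0, 0.0]]
tokens = view_a_tokens + view_b_tokens

query_feature = [2.0, 0.0, 0.0, 0.0]
anchor = (1.0, 2.0, 3.0)  # anchor position in global 3D space

updated = cross_attend(query_feature, tokens)

# Stand-in for a learned head: a real model would presumably predict these
# offsets (and scales, opacities, colors) from the updated query feature.
offsets = [(0.1, 0.0, 0.0), (0.0, -0.1, 0.0)]
gaussians = spawn_gaussians(anchor, offsets)

for g in gaussians:
    print(g)
```

Because the anchor lives in the global frame, the spawned Gaussians are not tied to any pixel of any input view, which is what lets a query sit in a region that no camera observes directly.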