Temporally Consistent Object 6D Pose Estimation for Robot Control

arXiv cs.RO / 5/5/2026

Key Points

  • The paper targets a key limitation of single-view RGB 6D pose estimators: they can be accurate per frame, but off-the-shelf methods lack the temporal consistency and robustness needed for stable robot feedback control.
  • It proposes a factor-graph-based online estimator that enforces temporal consistency by incorporating object motion models and explicitly estimating measurement uncertainty.
  • The approach integrates the motion model and uncertainty estimation into an optimization-based pipeline, using outlier rejection and smoothing to improve pose stability.
  • Experiments show significant gains on standardized 6D pose estimation benchmarks and validate stability in a feedback-based robot control task in which the object is tracked by a camera attached to a torque-controlled manipulator.
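
To make the ingredients in the points above concrete, here is a minimal sketch of a motion-model-plus-uncertainty estimator with outlier gating. It is not the paper's implementation. It assumes a single scalar coordinate instead of a full 6D pose, a constant-velocity motion model, and per-frame measurement standard deviations that are supplied as input rather than estimated. The names solve_graph, track_online, q, and gate are hypothetical, and the whole history is re-solved at every frame for brevity, where a real online system would use a fixed-lag window or an incremental solver.

```python
import numpy as np


def solve_graph(z, sigma, keep, q):
    """Linear least-squares solve over every state seen so far (illustrative)."""
    n = len(z)
    rows, rhs = [], []
    for t in range(n):
        if keep[t]:
            # Measurement factor: (x_t - z_t) / sigma_t
            r = np.zeros(n)
            r[t] = 1.0 / sigma[t]
            rows.append(r)
            rhs.append(z[t] / sigma[t])
    for t in range(1, n - 1):
        # Constant-velocity factor: (x_{t-1} - 2 x_t + x_{t+1}) / q
        r = np.zeros(n)
        r[t - 1], r[t], r[t + 1] = 1.0 / q, -2.0 / q, 1.0 / q
        rows.append(r)
        rhs.append(0.0)
    x, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return x


def track_online(z, sigma, q=0.05, gate=3.0):
    """Process measurements in order; gate each one, then re-smooth."""
    z = np.asarray(z, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    keep = np.ones(len(z), dtype=bool)
    x = np.empty(0)
    for t in range(len(z)):
        if t >= 2:
            # Constant-velocity prediction from the two latest smoothed states.
            pred = 2.0 * x[t - 1] - x[t - 2]
            # Crude innovation scale (an assumption, not a derived covariance).
            scale = np.hypot(sigma[t], q)
            keep[t] = abs(z[t] - pred) <= gate * scale
        # Rejected frames keep their state but lose their measurement factor.
        x = solve_graph(z[: t + 1], sigma[: t + 1], keep[: t + 1], q)
    return x, keep
```

Calling track_online(z, sigma) returns the smoothed trajectory and a boolean mask of accepted measurements. A frame whose measurement is rejected is still estimated, but only through the motion factors that connect it to its neighbors, which is what keeps the output continuous across a bad detection.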

Abstract

Single-view RGB object pose estimators have reached a level of precision and efficiency that makes them good candidates for vision-based robot control. However, off-the-shelf methods lack the temporal consistency and robustness that are mandatory for stable feedback control. In this work, we develop a factor graph approach to enforce temporal consistency of the object pose estimates. In particular, the proposed approach: (i) incorporates object motion models, (ii) explicitly estimates the object pose measurement uncertainty, and (iii) integrates the above two components in an online optimization-based estimator. We demonstrate that with appropriate outlier rejection and smoothing using the proposed factor graph approach, we can significantly improve the results on standardized pose estimation benchmarks. We experimentally validate the stability of the proposed approach for a feedback-based robot control task in which the object is tracked by a camera attached to a torque-controlled manipulator.
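
For orientation, components (i) to (iii) of the abstract can be read against a generic factor-graph objective of the following form. This is a textbook-style formulation under stated assumptions, not the paper's exact cost. All symbols are introduced here for illustration: T_t in SE(3) is the object pose at frame t, Z_t the pose reported by the single-view estimator, \Sigma_t its estimated covariance, \nu_t a body-frame velocity state under an assumed constant-velocity model, Q_T and Q_\nu process-noise covariances, \Delta t the frame period, and \rho a robust loss that down-weights outliers.

```latex
\hat{\mathcal{X}} = \arg\min_{\mathcal{X}}
  \sum_{t} \rho\!\left(
      \left\| \mathrm{Log}\!\left( T_t^{-1} Z_t \right) \right\|^{2}_{\Sigma_t}
  \right)
  + \sum_{t \ge 1}
      \left\| \mathrm{Log}\!\left(
          \left( T_{t-1}\, \mathrm{Exp}(\nu_{t-1}\, \Delta t) \right)^{-1} T_t
      \right) \right\|^{2}_{Q_T}
  + \sum_{t \ge 1}
      \left\| \nu_t - \nu_{t-1} \right\|^{2}_{Q_\nu},
\qquad
\| e \|^{2}_{\Sigma} = e^{\top} \Sigma^{-1} e .
```

The first sum holds the measurement factors, weighted by the per-frame uncertainty; the second and third sums hold the motion-model factors. In an online setting, the minimization would be restricted to a recent window of frames or updated incrementally as each new measurement arrives.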