Keypoint-based Dynamic Object 6-DoF Pose Tracking via Event Camera

arXiv cs.CV / 4/28/2026


Key Points

  • The paper targets accurate 6-DoF pose estimation of moving objects for robotics, arguing that methods built on conventional cameras struggle with motion blur, sensor noise, and low-light conditions.
  • It uses event cameras to mitigate these issues, leveraging their high dynamic range and low latency for more reliable visual input.
  • The proposed pipeline uses a keypoint detection network to extract keypoints from a time surface built from the event stream, then tracks them continuously using event polarity, spatial coordinates, and the event density around each keypoint.
  • To recover the 6-DoF pose, it links 2D keypoints to 3D model keypoints through a hash mapping and applies the EPnP (Efficient Perspective-n-Point) algorithm.
  • Experiments on both simulated and real event datasets show improved accuracy and robustness over existing event-based state-of-the-art approaches.
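
A time surface, as mentioned in the key points above, records for each pixel how recently an event fired there. The sketch below uses a common exponential-decay formulation in pure Python. The function name `time_surface`, the decay constant `tau`, the `(x, y, t, polarity)` event layout, and the choice to ignore polarity are illustrative assumptions, not details taken from the paper.

```python
import math


def time_surface(events, width, height, t_ref, tau=0.05):
    """Build an exponentially decayed time surface (illustrative sketch).

    events: iterable of (x, y, t, polarity) tuples, t in seconds.
    t_ref:  reference time at which the surface is evaluated.
    tau:    decay constant in seconds (hypothetical default).
    """
    # Most recent event timestamp per pixel; None means no event yet.
    last_t = [[None] * width for _ in range(height)]
    for x, y, t, _polarity in events:
        # Polarity is ignored here; this is a simplifying assumption.
        if t <= t_ref and (last_t[y][x] is None or t > last_t[y][x]):
            last_t[y][x] = t

    # Recent events map to values near 1, old events decay toward 0.
    surface = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if last_t[y][x] is not None:
                surface[y][x] = math.exp(-(t_ref - last_t[y][x]) / tau)
    return surface


# Usage: two events on a 2x1 sensor, evaluated at the time of the newer one.
example = time_surface([(0, 0, 0.10, 1), (1, 0, 0.05, -1)], 2, 1, t_ref=0.10)
```

The resulting 2D grid can be treated like an image, which is what makes it a convenient input for a keypoint detection network.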

Abstract

Accurate 6-DoF pose estimation of objects is critical for robots to perform precise manipulation tasks. However, for dynamic object pose estimation, conventional camera-based approaches face several major challenges, such as motion blur, sensor noise, and low-light limitations. To address these issues, we employ event cameras, whose high dynamic range and low latency offer a promising solution. Furthermore, we propose a keypoint-based detection and tracking approach for dynamic object pose estimation. First, a keypoint detection network is constructed to extract keypoints from the time surface generated from the event stream. Next, the polarity and spatial coordinates of the events, together with the event density in the vicinity of each keypoint, are used to achieve continuous keypoint tracking. Finally, a hash mapping is established between the 2D keypoints and the 3D model keypoints, and the EPnP algorithm is employed to estimate the 6-DoF pose. Experimental results demonstrate that, on both simulated and real event data, the proposed method outperforms state-of-the-art event-based methods in both accuracy and robustness.
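
The abstract does not say how the hash mapping between 2D and 3D keypoints is keyed. As a minimal sketch, assume each tracked keypoint carries an integer ID that also indexes the 3D model keypoints, so a plain dictionary lookup yields the 2D-3D correspondences a PnP solver needs. The names `build_correspondences`, `tracked_2d`, and `model_3d` are hypothetical.

```python
def build_correspondences(tracked_2d, model_3d):
    """Pair tracked 2D keypoints with 3D model keypoints by shared ID.

    tracked_2d: dict mapping keypoint ID -> (u, v) pixel coordinates.
    model_3d:   dict mapping keypoint ID -> (X, Y, Z) model coordinates.
    Returns (object_points, image_points) as index-aligned lists.
    The ID-keyed layout is an assumption, not the paper's stated design.
    """
    object_points, image_points = [], []
    for kp_id in sorted(tracked_2d):
        # Hash lookup: skip tracked keypoints with no 3D counterpart.
        if kp_id in model_3d:
            object_points.append(model_3d[kp_id])
            image_points.append(tracked_2d[kp_id])
    return object_points, image_points


# Usage: keypoint 7 is tracked but absent from the model, so it is dropped.
tracked = {2: (10.0, 20.0), 0: (5.0, 6.0), 7: (1.0, 1.0)}
model = {0: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (0.0, 1.0, 0.0)}
obj_pts, img_pts = build_correspondences(tracked, model)
```

EPnP needs at least four correspondences, so a real pipeline would check the list length before solving. One readily available implementation is OpenCV's `solvePnP` with the `SOLVEPNP_EPNP` flag, which takes such index-aligned point arrays along with the camera intrinsics; whether the authors use it is not stated.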