Fast Online 3D Multi-Camera Multi-Object Tracking and Pose Estimation

arXiv cs.CV / 4/21/2026


Key Points

  • The paper introduces a fast, online approach that jointly performs 3D multi-object tracking and pose estimation using multiple monocular cameras.
  • It needs only 2D bounding box and pose detections, so it requires neither costly 3D training data nor computationally heavy deep learning models.
  • The method is an efficient implementation of a Bayes-optimal multi-object tracking filter, which reduces computation while preserving accuracy.
  • Using only publicly available pre-trained 2D detectors, the proposed algorithm is reported to be significantly faster than state-of-the-art approaches without sacrificing accuracy.
  • The system remains robust when multiple cameras are intermittently disconnected and reconnected during operation.
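
To make the filtering and camera-dropout points concrete, here is a minimal sketch of a Bayes filter that keeps running when cameras go offline. It is not the paper's algorithm: the paper describes a multi-object filter operating in 3D, whereas this is a single-object, one-dimensional Kalman filter chosen only because it is the simplest Bayes-optimal filter (for a linear-Gaussian model). All names (`Gaussian`, `predict`, `update`, `step`, `process_var`, `meas_var`) and the noise values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Gaussian:
    """Belief over a scalar state (illustrative stand-in for a 3D track state)."""
    mean: float
    var: float


def predict(belief: Gaussian, process_var: float) -> Gaussian:
    # Random-walk motion model: the mean is unchanged, uncertainty grows.
    return Gaussian(belief.mean, belief.var + process_var)


def update(belief: Gaussian, z: float, meas_var: float) -> Gaussian:
    # Standard scalar Kalman update for a single measurement z.
    gain = belief.var / (belief.var + meas_var)
    new_mean = belief.mean + gain * (z - belief.mean)
    new_var = (1.0 - gain) * belief.var
    return Gaussian(new_mean, new_var)


def step(
    belief: Gaussian,
    measurements: Sequence[Optional[float]],
    process_var: float = 0.01,  # assumed value, for illustration only
    meas_var: float = 0.25,     # assumed value, for illustration only
) -> Gaussian:
    """One filter cycle; a None entry means that camera is disconnected."""
    belief = predict(belief, process_var)
    for z in measurements:
        if z is None:
            continue  # no data from this camera: skip its update
        belief = update(belief, z, meas_var)
    return belief


if __name__ == "__main__":
    prior = Gaussian(mean=0.0, var=1.0)
    print(step(prior, [1.0, 1.0]))    # both cameras connected
    print(step(prior, [1.0, None]))   # one camera disconnected
    print(step(prior, [None, None]))  # all cameras disconnected: predict only
```

In this sketch a disconnected camera contributes no update, so the belief only becomes less certain rather than failing; when the camera returns, its measurements are fused again without any special reinitialization.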

Abstract

This paper proposes a fast and online method for jointly performing 3D multi-object tracking and pose estimation using multiple monocular cameras. Our algorithm requires only 2D bounding box and pose detections, eliminating the need for costly 3D training data or computationally expensive deep learning models. Our solution is an efficient implementation of a Bayes-optimal multi-object tracking filter, enhancing computational efficiency while maintaining accuracy. We demonstrate that our algorithm is significantly faster than state-of-the-art methods without compromising accuracy, using only publicly available pre-trained 2D detection models. We also illustrate the robust performance of our algorithm in scenarios where multiple cameras are intermittently disconnected or reconnected during operation.
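
As a rough illustration of why 2D detections from several calibrated cameras can constrain a 3D estimate, the sketch below scores a 3D point hypothesis by its reprojection error against whichever cameras currently report a detection. This is a generic pinhole-camera example, not the paper's measurement model. The simplified camera setup (cameras offset along the x-axis, no rotation) and every name and parameter here (`project`, `mean_reprojection_error`, `focal`, `center`, the camera offsets) are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def project(
    point: Point3,
    cam_x: float,
    focal: float = 1000.0,              # assumed focal length in pixels
    center: Point2 = (640.0, 360.0),    # assumed principal point
) -> Point2:
    """Pinhole projection for a camera at (cam_x, 0, 0) looking down +Z."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal * (x - cam_x) / z + center[0], focal * y / z + center[1])


def mean_reprojection_error(
    point: Point3,
    cam_offsets: Dict[str, float],
    detections: Dict[str, Optional[Point2]],
) -> Optional[float]:
    """Average pixel error over cameras that have a detection.

    Returns None when no camera reports a detection.
    """
    errors = []
    for name, cam_x in cam_offsets.items():
        det = detections.get(name)
        if det is None:
            continue  # camera disconnected or keypoint not detected
        u, v = project(point, cam_x)
        errors.append(((u - det[0]) ** 2 + (v - det[1]) ** 2) ** 0.5)
    return sum(errors) / len(errors) if errors else None


if __name__ == "__main__":
    cams = {"left": -0.5, "right": 0.5}
    truth = (0.0, 0.0, 5.0)
    dets = {name: project(truth, cx) for name, cx in cams.items()}
    print(mean_reprojection_error(truth, cams, dets))            # correct hypothesis
    print(mean_reprojection_error((0.0, 0.0, 6.0), cams, dets))  # wrong depth
```

A hypothesis at the wrong depth projects to different pixels in the offset cameras, so its error is nonzero, which is the basic reason multiple views help recover depth that a single monocular view leaves ambiguous.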