Mind the Hitch: Dynamic Calibration and Articulated Perception for Autonomous Trucks

arXiv cs.CV / 3/26/2026

Key Points

  • The paper introduces dCAP (Dynamic Calibration and Articulated Perception), a vision-based framework that continuously estimates the 6-DoF relative pose between tractor and trailer cameras to handle articulated geometry and time-varying sensor extrinsics in autonomous trucks.
  • It uses a transformer architecture with cross-view and temporal attention to robustly fuse spatial cues while preserving temporal consistency during rapid articulation, occlusion, and trailer flex.
  • Integrated with BEVFormer, dCAP replaces static calibration with dynamically predicted extrinsics and improves 3D object detection performance under truck-specific real-world conditions.
  • To support evaluation, the authors propose STT4AT, a CARLA-based benchmark with synchronized multi-sensor setups and time-varying inter-rig geometry across varied environments.
  • The authors state that the dataset, development kit, and source code will be publicly released, enabling further experimentation and validation.
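To make the "time-varying sensor extrinsics" in the points above concrete, here is a minimal sketch of how the pose of a trailer-mounted camera, expressed in the tractor frame, changes as the trailer swings about the hitch. It is an illustration only and not the dCAP method: it models yaw alone (dCAP estimates the full 6-DoF pose from images), and the frame layout, the offsets `HITCH_IN_TRACTOR` and `CAM_IN_TRAILER`, and all function names are assumptions made for this example.

```python
import math

# Hypothetical geometry in metres (x forward, y left, z up); not from the paper.
HITCH_IN_TRACTOR = (-1.0, 0.0, 1.0)   # hitch position in the tractor frame
CAM_IN_TRAILER = (-12.0, 0.0, 2.0)    # rear camera position in the trailer frame


def rot_z(yaw):
    """4x4 homogeneous rotation about the vertical (z) axis; yaw in radians."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def translate(x, y, z):
    """4x4 homogeneous translation."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def trailer_cam_in_tractor(hitch_yaw):
    """Pose of the trailer camera in the tractor frame for a given hitch yaw.

    The trailer frame is assumed to have its origin at the hitch, so the chain
    is: move to the hitch, rotate by the articulation angle, move to the camera.
    """
    tractor_T_trailer = matmul(translate(*HITCH_IN_TRACTOR), rot_z(hitch_yaw))
    trailer_T_cam = translate(*CAM_IN_TRAILER)
    return matmul(tractor_T_trailer, trailer_T_cam)


def position(transform):
    """Translation part of a 4x4 homogeneous transform."""
    return tuple(transform[i][3] for i in range(3))


for deg in (0, 10, 30):
    x, y, z = position(trailer_cam_in_tractor(math.radians(deg)))
    print(f"hitch yaw {deg:>2} deg: camera at x={x:.2f}, y={y:.2f}, z={z:.2f}")
```

Under these assumed dimensions, even a moderate articulation angle shifts the rear camera sideways by metres relative to the tractor, which is why a single fixed extrinsic cannot describe the rig while the truck is turning.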

Abstract

Autonomous trucking poses unique challenges due to articulated tractor-trailer geometry and time-varying sensor poses caused by the fifth-wheel joint and trailer flex. Existing perception and calibration methods assume static baselines or rely on high-parallax and texture-rich scenes, limiting their reliability under real-world settings. We propose dCAP (dynamic Calibration and Articulated Perception), a vision-based framework that continuously estimates the 6-DoF (degrees of freedom) relative pose between tractor and trailer cameras. dCAP employs a transformer with cross-view and temporal attention to robustly aggregate spatial cues while maintaining temporal consistency, enabling accurate perception under rapid articulation and occlusion. Integrated with BEVFormer, dCAP improves 3D object detection by replacing static calibration with dynamically predicted extrinsics. To facilitate evaluation, we introduce STT4AT, a CARLA-based benchmark simulating semi-trailer trucks with synchronized multi-sensor suites and time-varying inter-rig geometry across diverse environments. Experiments demonstrate that dCAP achieves stable, accurate perception while addressing the limitations of static calibration in autonomous trucking. The dataset, development kit, and source code will be publicly released.
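The abstract's claim that static calibration limits perception can be illustrated with a small pinhole-camera sketch. Camera-based 3D detectors generally rely on the extrinsics to relate 3D locations to image pixels, so a stale extrinsic sends a 3D point to the wrong pixel. The example below measures that pixel error when a camera has rotated about its vertical axis since calibration. The intrinsics `FX` and `CX`, the axis convention, and the function names are hypothetical choices for this sketch; they are not taken from the paper or from BEVFormer's code.

```python
import math

# Hypothetical intrinsics in pixels; not from the paper.
FX, CX = 1000.0, 960.0


def project_u(point_cam):
    """Horizontal pixel coordinate of a point in camera coordinates.

    Convention assumed here: x right, y down, z forward (pinhole model).
    """
    x, _, z = point_cam
    if z <= 0:
        raise ValueError("point is behind the camera")
    return FX * x / z + CX


def to_rotated_camera(point_cal, yaw):
    """Re-express a point from the calibrated camera frame in the actual one.

    The actual camera is assumed to have rotated by `yaw` radians about its
    vertical (y) axis since calibration, so the point is mapped with the
    transpose of that rotation.
    """
    x, y, z = point_cal
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x - s * z, y, s * x + c * z)


def pixel_error(point_cal, yaw):
    """Horizontal pixel gap between the static and the actual projection."""
    u_static = project_u(point_cal)                            # stale extrinsic
    u_actual = project_u(to_rotated_camera(point_cal, yaw))    # true pose
    return abs(u_static - u_actual)


# A point 20 m straight ahead of the calibrated camera, 5 degrees of drift.
print(f"{pixel_error((0.0, 0.0, 20.0), math.radians(5)):.1f} px")
```

With these assumed intrinsics, 5 degrees of unmodelled yaw moves a point 20 m ahead by roughly 87 pixels, enough to associate image features with the wrong 3D location. Replacing the stale pose with a per-frame predicted one, as the abstract describes, removes this error at its source.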