Unveiling an Uncertainty-Aware Autonomous Cooperative Learning-Based Planning Strategy

arXiv cs.RO / 4/23/2026


Key Points

  • The paper proposes DRLACP, a deep reinforcement learning framework for autonomous cooperative planning (ACP) in multi-vehicle intelligent transportation systems where perception, planning, and communication uncertainties cannot be fully handled by existing methods.
  • It combines a Soft Actor-Critic (SAC) approach with gated recurrent units (GRUs) to learn time-varying optimal actions under imperfect state information.
  • The method targets uncertainties that arise during planning, communication, and perception, aiming to improve both effectiveness and security of cooperative motion.
  • Experiments in the CARLA simulation platform show that the learned cooperative planning outperforms baseline approaches across multiple scenarios with imperfect AV state information.
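The reason for pairing SAC with GRUs is partial observability: under perception and communication uncertainty, a single noisy observation is not enough to act on, so the policy conditions on a recurrent hidden state that summarizes the observation history. As a purely illustrative sketch (not the authors' implementation; all dimensions and names here are invented), a minimal NumPy GRU cell shows how that hidden state accumulates information across a noisy observation sequence:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell for illustration only.

    A real DRL pipeline would use a framework implementation
    (e.g. torch.nn.GRU) inside the SAC actor/critic networks.
    """

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden_dim)
        # One input and one recurrent weight matrix per gate:
        # update gate (z), reset gate (r), candidate state (h~).
        self.Wz = rng.uniform(-scale, scale, (hidden_dim, input_dim))
        self.Uz = rng.uniform(-scale, scale, (hidden_dim, hidden_dim))
        self.Wr = rng.uniform(-scale, scale, (hidden_dim, input_dim))
        self.Ur = rng.uniform(-scale, scale, (hidden_dim, hidden_dim))
        self.Wh = rng.uniform(-scale, scale, (hidden_dim, input_dim))
        self.Uh = rng.uniform(-scale, scale, (hidden_dim, hidden_dim))

    def step(self, x, h):
        """One recurrent update: blend old state h with a candidate state."""
        z = sigmoid(self.Wz @ x + self.Uz @ h)          # how much to update
        r = sigmoid(self.Wr @ x + self.Ur @ h)          # how much history to reuse
        h_tilde = np.tanh(self.Wh @ x + self.Uh @ (r * h))
        return (1.0 - z) * h + z * h_tilde

# Roll the cell over a noisy observation stream. The hidden state h is
# what a recurrent SAC actor would map to an action, acting on a belief
# about the vehicle's state rather than on one imperfect measurement.
cell = GRUCell(input_dim=4, hidden_dim=8)
h = np.zeros(8)
for t in range(10):
    noise = 0.1 * np.random.default_rng(t).normal(size=4)
    obs = np.sin(0.3 * t) * np.ones(4) + noise  # imperfect observation
    h = cell.step(obs, h)
```

Because the new state is a convex combination of the old state and a tanh-bounded candidate, the hidden state stays in [-1, 1] while still reacting to each new observation, which is the property that makes it a stable input to the downstream policy network.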

Abstract

In future intelligent transportation systems, autonomous cooperative planning (ACP) is a promising technique for increasing the effectiveness and security of multi-vehicle interactions. However, existing ACP strategies cannot fully address multiple sources of uncertainty, e.g., perception, planning, and communication uncertainties. To address this, a novel deep reinforcement learning-based autonomous cooperative planning (DRLACP) framework is proposed to tackle these uncertainties in cooperative motion planning schemes. Specifically, soft actor-critic (SAC) with gated recurrent units (GRUs) is adopted to learn deterministic optimal time-varying actions under the imperfect state information caused by planning, communication, and perception uncertainties. In addition, the real-time actions of autonomous vehicles (AVs) are demonstrated on the Car Learning to Act (CARLA) simulation platform. Evaluation results show that the proposed DRLACP learns and performs cooperative planning effectively, outperforming other baseline methods across different scenarios with imperfect AV state information.