Dynamic-TD3: A Novel Algorithm for UAV Path Planning with Dynamic Obstacle Trajectory Prediction

arXiv cs.AI / 5/4/2026

Key Points

  • The article introduces Dynamic-TD3, a deep reinforcement learning framework for UAV path planning that is designed for safety-critical environments with dynamic threats.
  • It formulates navigation as a Constrained Markov Decision Process (CMDP) to enforce hard safety constraints while preserving maneuverability through dual-criterion control using Lagrangian relaxation.
  • The approach adds an Adaptive Trajectory Relational Evolution Mechanism (ATREM) to capture the long-range trajectory intentions of obstacles and uses a Physically Aware Gated Kalman Filter (PAG-KF) to reduce the impact of non-stationary sensor noise.
  • In experiments against aggressive moving threats, Dynamic-TD3 reportedly improves collision avoidance, lowers energy consumption, and produces smoother trajectories compared with prior methods.
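
For readers unfamiliar with the CMDP formulation named in the key points, the generic textbook form of a Lagrangian-relaxed constrained objective is sketched below. The symbols (reward r, safety cost c, cost budget d, multiplier λ, step size η) are assumptions chosen for illustration; the summary does not give the paper's exact objective or update rule.

```latex
% Constrained problem: maximize return subject to a bound on expected safety cost.
\max_{\pi} \; J_R(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r(s_t, a_t)\right]
\quad \text{s.t.} \quad
J_C(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} c(s_t, a_t)\right] \le d

% Lagrangian relaxation: a saddle-point problem over the policy and the multiplier.
\min_{\lambda \ge 0} \; \max_{\pi} \; \mathcal{L}(\pi, \lambda) = J_R(\pi) - \lambda \left( J_C(\pi) - d \right)

% Projected dual ascent: lambda grows while the constraint is violated and shrinks otherwise.
\lambda \leftarrow \max\!\left(0, \; \lambda + \eta \left( J_C(\pi) - d \right)\right)
```

In this generic form, the multiplier acts as an adaptive penalty weight: it rises when the policy exceeds the cost budget and decays toward zero when the policy is safely within it.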

Abstract

Deep reinforcement learning (DRL) is widely used for autonomous drone navigation in complex, high-risk environments. However, its practical deployment faces a safety-exploration dilemma: soft penalty mechanisms encourage risky trial-and-error, while most constraint-based methods degrade under sensor noise and intent uncertainty. We propose Dynamic-TD3, a physically enhanced framework that models navigation as a Constrained Markov Decision Process (CMDP) to enforce strict safety constraints while maintaining maneuverability. The framework integrates an Adaptive Trajectory Relational Evolution Mechanism (ATREM) to capture the long-range intentions of dynamic obstacles and employs a Physically Aware Gated Kalman Filter (PAG-KF) to mitigate non-stationary observation noise. The resulting state representation drives a dual-criterion policy that balances mission efficiency against hard safety constraints via Lagrangian relaxation. In experiments with aggressive dynamic threats, the approach shows better collision avoidance, lower energy consumption, and smoother flight trajectories than prior methods.
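
The abstract does not describe how PAG-KF gates its measurements, so the following is only a minimal sketch of the general idea behind a gated Kalman filter, not the paper's method. It assumes a scalar random-walk state, fixed noise variances, and a simple threshold on the normalized innovation; the class name `GatedKalman1D` and all parameter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GatedKalman1D:
    """Scalar Kalman filter with a simple innovation gate (illustrative only)."""

    x: float = 0.0     # state estimate, e.g. obstacle position along one axis
    p: float = 1.0     # variance of the estimate
    q: float = 0.01    # process noise variance (assumed value)
    r: float = 0.25    # measurement noise variance (assumed value)
    gate: float = 9.0  # threshold on squared normalized innovation (assumed, about 3 sigma)

    def step(self, z: float) -> bool:
        """Predict, then update with measurement z only if it passes the gate.

        Returns True when the measurement was accepted.
        """
        # Predict: under a random-walk model the mean is unchanged and the variance grows.
        self.p += self.q

        # Innovation (measurement residual) and its variance.
        y = z - self.x
        s = self.p + self.r

        # Gate: skip the update when the normalized innovation is implausibly large.
        if y * y / s > self.gate:
            return False

        # Standard Kalman update.
        k = self.p / s
        self.x += k * y
        self.p *= 1.0 - k
        return True


kf = GatedKalman1D()
measurements = [0.1, -0.2, 0.05, 25.0, 0.0]  # 25.0 simulates a sensor outlier
accepted = [kf.step(z) for z in measurements]
# The outlier at index 3 is rejected, so the estimate stays near zero.
```

A fixed gate like this one cannot adapt to non-stationary noise; a physically aware variant would presumably adjust the threshold or the noise model using motion constraints, but the summary gives no details on that.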