Self-Predictive Representation for Autonomous UAV Object-Goal Navigation

arXiv cs.RO / 4/24/2026

Key Points

  • The paper introduces a reinforcement-learning (RL) approach to 3D object-goal navigation (OGN) for autonomous UAVs and formalizes the unknown-target-location setting as a Markov decision process.
  • It targets RL's sample inefficiency in learning navigation policies, a problem compounded in OGN by the need to recognize the target.
  • The main technical contribution is a perception module built around a novel self-predictive model, AmelPred, with a stochastic variant (AmelPredSto), for learning state representations.
  • Experiments evaluate how different state representation learning (SRL) methods interact with a model-free actor-critic RL planning algorithm, finding that AmelPredSto performs best.
  • Using AmelPredSto yields substantial improvements in the efficiency of RL algorithms when solving the 3D OGN task.
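
To make the "unknown target location" setting concrete, here is a minimal toy sketch in Python. It is not the paper's formalization: the grid world, the `ToyOGNEnv` class, the action set, the sensing radius, and the reward values are all hypothetical choices made for illustration. The only point it shows is that the agent is never handed the target's coordinates and must find the target through what it perceives.

```python
from dataclasses import dataclass
import random

# Six axis-aligned moves in a 3D grid (hypothetical action set).
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


@dataclass
class ToyOGNEnv:
    """Toy 3D grid world; the target position is hidden from the agent."""
    size: int = 5           # grid is size x size x size
    sense_radius: int = 1   # target is "visible" within this Chebyshev distance
    max_steps: int = 50
    seed: int = 0

    def __post_init__(self):
        self.rng = random.Random(self.seed)
        self.reset()

    def reset(self):
        self.agent = (0, 0, 0)
        # Sample a target that differs from the start cell; it is never
        # included in the observation returned to the agent.
        while True:
            self._target = tuple(self.rng.randrange(self.size) for _ in range(3))
            if self._target != self.agent:
                break
        self.steps = 0
        return self._observe()

    def _observe(self):
        # Observation: agent position plus a 0/1 "target in view" flag.
        distance = max(abs(a - t) for a, t in zip(self.agent, self._target))
        return (*self.agent, int(distance <= self.sense_radius))

    def step(self, action_index):
        move = ACTIONS[action_index]
        # Clamp the move so the agent stays inside the grid.
        self.agent = tuple(
            min(self.size - 1, max(0, a + d)) for a, d in zip(self.agent, move)
        )
        self.steps += 1
        reached = self.agent == self._target
        reward = 1.0 if reached else -0.01  # hypothetical reward shaping
        done = reached or self.steps >= self.max_steps
        return self._observe(), reward, done
```

An episode starts with `obs = ToyOGNEnv().reset()` and proceeds by calling `step(action_index)` until `done` is true. A realistic version would replace the four-number observation with camera images, which is where a learned state representation comes in.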

Abstract

Autonomous Unmanned Aerial Vehicles (UAVs) have revolutionized industries through their versatility, with applications including aerial surveillance, search and rescue, agriculture, and delivery. Their autonomous capabilities offer unique advantages, such as operating in large open-space environments. Reinforcement Learning (RL) enables UAVs to learn intricate navigation policies and optimize their flight behavior autonomously. However, one of its main challenges is the inefficient use of data samples in reaching a good policy. In object-goal navigation (OGN) settings, target recognition arises as an extra challenge. Most UAV-related approaches use relative or absolute coordinates to move from an initial position to a predefined location, rather than finding the target directly. This study addresses sample efficiency in solving a 3D OGN problem and formalizes the unknown-target-location setting as a Markov decision process. Experiments analyze the interplay between different state representation learning (SRL) methods for perception and a model-free RL algorithm for planning in an autonomous navigation system. The main contribution of this study is the development of the perception module, featuring a novel self-predictive model named AmelPred. Empirical results demonstrate that its stochastic version, AmelPredSto, is the best-performing SRL model when combined with actor-critic RL algorithms. The results show that using AmelPredSto substantially improves the efficiency of RL algorithms in solving the OGN problem.
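
The abstract does not describe AmelPred's architecture, so the following Python sketch only illustrates the general idea behind self-predictive representation learning: an encoder maps an observation to a latent state, a latent transition model predicts the next latent state from the current latent state and the action, and the training loss is the error between that prediction and the encoding of the actual next observation. Every name here (`encode`, `predict_next_latent`, `self_predictive_loss`, the weight matrices) and the linear-plus-tanh encoder are assumptions for illustration, not the paper's method. The sketch is deterministic and does not attempt to model the stochastic variant.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def encode(obs, W_enc):
    """Hypothetical encoder: observation -> latent state (linear map + tanh)."""
    return [math.tanh(v) for v in matvec(W_enc, obs)]


def predict_next_latent(z, action_onehot, W_dyn):
    """Hypothetical latent transition model: (latent, action) -> next latent."""
    return matvec(W_dyn, z + action_onehot)


def self_predictive_loss(obs, action_onehot, next_obs, W_enc, W_dyn):
    """Mean squared error between the predicted and the encoded next latent."""
    z_pred = predict_next_latent(encode(obs, W_enc), action_onehot, W_dyn)
    # In actual training the target is usually treated as a constant
    # (no gradient flows through it); this sketch only evaluates the loss.
    z_target = encode(next_obs, W_enc)
    return sum((p - t) ** 2 for p, t in zip(z_pred, z_target)) / len(z_target)
```

In a full system, the latent state produced by the encoder, rather than the raw observation, would be the input to the actor-critic policy, and the loss above would be minimized alongside the RL objective.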