An End-to-end Flight Control Network for High-speed UAV Obstacle Avoidance based on Event-Depth Fusion

arXiv cs.RO / 3/31/2026


Key Points

  • The paper introduces an end-to-end flight control network for high-speed UAV obstacle avoidance that fuses depth-camera data and event-camera data to handle both static and dynamic obstacles.
  • It performs feature-level fusion using a bidirectional cross-attention module, motivated by the complementary limitations of depth cameras (motion blur at high speed) and event cameras (poor static-scene perception).
  • The system is trained with imitation learning using high-quality supervision, and it includes an expert planner based on Spherical Principal Search (SPS).
  • The SPS planner reduces computational complexity from O(n^2) to O(n) while producing smoother trajectories, achieving an over 80% success rate at 17 m/s (nearly 20% higher than traditional planners).
  • Simulation results show 70–80% success at 17 m/s across varied scenes, outperforming single-modality and unidirectional fusion baselines by 10–20%, indicating bidirectional fusion improves obstacle avoidance reliability.

Abstract

Achieving safe, high-speed autonomous flight in complex environments with static, dynamic, or mixed obstacles remains challenging, as a single perception modality is incomplete. Depth cameras are effective for static objects but suffer from motion blur at high speeds. Conversely, event cameras excel at capturing rapid motion but struggle to perceive static scenes. To exploit the complementary strengths of both sensors, we propose an end-to-end flight control network that achieves feature-level fusion of depth images and event data through a bidirectional cross-attention module. The end-to-end network is trained via imitation learning, which relies on high-quality supervision. Building on this insight, we design an efficient expert planner using Spherical Principal Search (SPS). This planner reduces computational complexity from O(n^2) to O(n) while generating smoother trajectories, achieving an over 80% success rate at 17 m/s, nearly 20% higher than traditional planners. Simulation experiments show that our method attains a 70-80% success rate at 17 m/s across varied scenes, surpassing single-modality and unidirectional fusion models by 10-20%. These results demonstrate that bidirectional fusion effectively integrates event and depth information, enabling more reliable obstacle avoidance in complex environments with both static and dynamic objects.
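The paper does not include implementation details here, but the core idea of bidirectional cross-attention fusion can be sketched in a few lines: features from each modality query the other, and both enriched streams are combined. The function names, feature shapes, and the residual-plus-concatenate fusion below are illustrative assumptions, not the authors' actual architecture (which would use learned projections and multi-head attention).

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    # scaled dot-product attention: one modality's tokens attend
    # to the other modality's tokens (learned Q/K/V projections omitted)
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    return softmax(scores) @ keys_values

def bidirectional_fusion(depth_feats, event_feats):
    # depth attends to event features and vice versa (bidirectional),
    # each with a residual connection; fused by concatenation
    depth_enriched = depth_feats + cross_attention(depth_feats, event_feats)
    event_enriched = event_feats + cross_attention(event_feats, depth_feats)
    return np.concatenate([depth_enriched, event_enriched], axis=-1)

# toy features: 16 tokens per modality, 64-dim embeddings (shapes are assumptions)
rng = np.random.default_rng(0)
depth = rng.standard_normal((16, 64))
event = rng.standard_normal((16, 64))
fused = bidirectional_fusion(depth, event)  # shape (16, 128)
```

In contrast to unidirectional fusion (where only one modality queries the other), both streams here are enriched before fusion, which is the property the paper credits for the 10-20% gain over unidirectional baselines.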