Robust Lightweight Crack Classification for Real-Time UAV Bridge Inspection

arXiv cs.CV / 5/1/2026


Key Points

  • The study presents a lightweight deep learning framework for real-time UAV bridge crack classification, targeting weak crack features, degraded imaging, severe class imbalance, and limited on-board compute.
  • It combines a lightweight CNN backbone with a CBAM attention module, a directed robust augmentation strategy informed by inspection-scene priors, and Focal Loss to better learn hard samples.
  • On the SDNET2018 bridge deck dataset, the method reaches 825 FPS while using just 11.21M parameters and 1.82G FLOPs.
  • Compared with a baseline model, the full framework improves F1-score by 2.51% and recall by 3.95%, and Grad-CAM visualizations suggest the attention module shifts the model's focus from scattered regions to the crack trajectories themselves.
  • The authors provide an implementation at https://github.com/skylynf/AttXNet to support practical deployment in ground-station-assisted UAV inspections.
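
The Focal Loss term named in the key points can be illustrated with a minimal single-sample, binary sketch. The function names are hypothetical, and the values alpha=0.25 and gamma=2.0 are the defaults commonly used with Focal Loss; this summary does not state which settings the authors chose. The idea is that the factor (1 - p_t)^gamma shrinks the loss on confidently correct samples, so training is dominated by hard ones such as faint cracks.

```python
import math


def cross_entropy(p_crack, label):
    """Plain binary cross-entropy for one sample (label 1 = crack, 0 = no crack)."""
    eps = 1e-12
    p_t = p_crack if label == 1 else 1.0 - p_crack
    p_t = min(max(p_t, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p_t)


def focal_loss(p_crack, label, alpha=0.25, gamma=2.0):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    alpha and gamma are ASSUMED defaults, not values taken from the paper.
    """
    eps = 1e-12
    # p_t is the probability the model assigns to the true class.
    p_t = p_crack if label == 1 else 1.0 - p_crack
    # alpha_t weights the positive (crack) class by alpha, the negative by 1 - alpha.
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0 - eps)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.1):  # easy, ambiguous, and hard crack samples
        print(f"p={p:.1f}  CE={cross_entropy(p, 1):.4f}  FL={focal_loss(p, 1):.4f}")
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a quick sanity check for any implementation.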

Abstract

With the widespread application of Unmanned Aerial Vehicles (UAVs) in bridge structural health monitoring, deep learning-based automatic crack detection has become a major research focus. However, practical UAV inspections still face four key challenges: weak crack features, degraded imaging conditions, severe class imbalance, and limited computational resources. To address these issues, this paper proposes a unified lightweight convolutional neural network framework composed of four synergistic components: a lightweight backbone network, a Convolutional Block Attention Module (CBAM) for channel and spatial enhancement, a directed robust augmentation strategy based on inspection-scene priors, and Focal Loss for hard-sample learning under class imbalance. Experiments on the SDNET2018 bridge deck dataset show that the proposed method achieves an inference speed of 825 FPS with only 11.21M parameters and 1.82G FLOPs. Compared with the baseline model, the complete framework improves the F1-score by 2.51% and recall by 3.95%. In addition, Grad-CAM visualizations indicate that the introduced attention module shifts the model's focus from scattered regions to precise tracking along crack trajectories. Overall, this study achieves a strong balance among accuracy, speed, and robustness, providing a practical solution for ground-station-assisted real-time deployment in UAV bridge inspections. The source code is available at: https://github.com/skylynf/AttXNet.
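
To put the reported efficiency figures in perspective, the short calculation below converts them into a per-image time budget, a sustained compute rate, and an approximate weight size. It is a back-of-the-envelope sketch only: the helper names are hypothetical, the abstract does not state the hardware or batch size behind the 825 FPS figure (so the per-image time is an amortized value, not necessarily single-frame latency), and the 4 bytes per parameter assumes FP32 weights.

```python
# Figures quoted in the abstract.
FPS = 825.0               # reported inference speed, images per second
FLOPS_PER_IMAGE = 1.82e9  # reported FLOPs per forward pass
PARAMS = 11.21e6          # reported parameter count


def per_image_time_ms(fps):
    """Amortized time per image in milliseconds at a given throughput."""
    return 1000.0 / fps


def sustained_gflops(fps, flops_per_image):
    """Compute rate (GFLOP/s) implied by running at `fps` images per second."""
    return fps * flops_per_image / 1e9


def weights_megabytes(params, bytes_per_param=4):
    """Approximate weight storage in MB; 4 bytes/param ASSUMES FP32."""
    return params * bytes_per_param / 1e6


if __name__ == "__main__":
    print(f"time per image : {per_image_time_ms(FPS):.2f} ms")
    print(f"sustained rate : {sustained_gflops(FPS, FLOPS_PER_IMAGE):.1f} GFLOP/s")
    print(f"FP32 weights   : {weights_megabytes(PARAMS):.2f} MB")
```

Under these assumptions the figures work out to roughly 1.2 ms per image, about 1.5 TFLOP/s of sustained compute, and about 45 MB of FP32 weights.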