WUTDet: A 100K-Scale Ship Detection Dataset and Benchmarks with Dense Small Objects

arXiv cs.CV / 4/10/2026


Key Points

  • The paper introduces WUTDet, a large-scale ship detection dataset with 100,576 images and 381,378 annotated ship instances, built to include a higher proportion of small objects and more diverse, challenging maritime imaging conditions than existing public datasets.
  • WUTDet includes varied operational scenarios (e.g., ports, anchorages, navigation, berthing) and environmental effects such as fog, glare, low-light, and rain to support more robust detection evaluation.
  • Using WUTDet, the authors benchmark 20 baseline detectors across CNN, Transformer, and Mamba families, finding Transformers perform best overall and for small objects, while CNNs are more inference-efficient and Mamba offers a balance of accuracy and compute.
  • The authors also create Ship-GEN, a unified cross-dataset test set, showing that models trained on WUTDet generalize better across differing data distributions.
  • The dataset and benchmarks are publicly released via GitHub, enabling further research on ship detection and generalization in complex maritime scenes.
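
The scale figures above lend themselves to a quick sanity check. The sketch below is a minimal illustration, not the authors' tooling: it derives the mean number of instances per image from the reported totals, and it buckets boxes by area using the COCO size thresholds (small below 32×32 pixels, large from 96×96 upward). Whether WUTDet's small-object metric uses exactly these thresholds, and whether its annotations come as COCO-style `[x, y, width, height]` boxes, are assumptions here. `size_bucket` and `bucket_shares` are hypothetical helper names, and the sample boxes are made up.

```python
from collections import Counter

# Totals reported for WUTDet.
NUM_IMAGES = 100_576
NUM_INSTANCES = 381_378

# COCO size thresholds in square pixels (assumed to apply here).
SMALL_MAX_AREA = 32 ** 2
MEDIUM_MAX_AREA = 96 ** 2


def size_bucket(bbox):
    """Classify a COCO-style [x, y, width, height] box by its box area."""
    _, _, width, height = bbox
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def bucket_shares(bboxes):
    """Return the fraction of boxes that fall in each size bucket."""
    names = ("small", "medium", "large")
    total = len(bboxes)
    if total == 0:
        return {name: 0.0 for name in names}
    counts = Counter(size_bucket(b) for b in bboxes)
    return {name: counts[name] / total for name in names}


mean_instances_per_image = NUM_INSTANCES / NUM_IMAGES

# Made-up boxes for illustration only (not taken from WUTDet).
sample_boxes = [[10, 10, 20, 12], [50, 40, 30, 30], [0, 0, 64, 48], [5, 5, 200, 120]]
shares = bucket_shares(sample_boxes)

print(f"mean instances per image: {mean_instances_per_image:.2f}")
print(shares)
```

With the reported totals, the mean works out to roughly 3.8 annotated ships per image. Running `bucket_shares` over the real annotation file would give the actual small-object share, which this summary does not state.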

Abstract

Ship detection for navigation is a fundamental perception task in intelligent waterway transportation systems. However, existing public ship detection datasets remain limited in scale, in the proportion of small-object instances, and in scene diversity, which hinders both the systematic evaluation of detection algorithms and the study of their generalization in complex maritime environments. To this end, we construct WUTDet, a large-scale ship detection dataset. WUTDet contains 100,576 images and 381,378 annotated ship instances, covering diverse operational scenarios such as ports, anchorages, navigation, and berthing, as well as various imaging conditions including fog, glare, low light, and rain, which makes the dataset both diverse and challenging. Based on WUTDet, we systematically evaluate 20 baseline models from three mainstream detection architectures, namely CNN, Transformer, and Mamba. Experimental results show that the Transformer architecture achieves superior overall detection accuracy (AP) and small-object detection performance (APs), demonstrating stronger adaptability to complex maritime scenes; the CNN architecture maintains an advantage in inference efficiency, making it more suitable for real-time applications; and the Mamba architecture achieves a favorable balance between detection accuracy and computational efficiency. Furthermore, we construct a unified cross-dataset test set, Ship-GEN, to evaluate model generalization. Results on Ship-GEN show that models trained on WUTDet generalize better across different data distributions. These findings demonstrate that WUTDet provides effective data support for the research, evaluation, and generalization analysis of ship detection algorithms in complex maritime scenarios. The dataset is publicly available at: https://github.com/MAPGroup/WUTDet.
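
To give a feel for why small-object accuracy (APs) is reported separately from overall AP, the following sketch computes intersection over union (IoU), the overlap measure that COCO-style AP uses to match predictions to ground truth across thresholds from 0.50 to 0.95. The abstract does not spell out the evaluation protocol, so treating it as COCO-style is an assumption. `iou` and `thresholds_passed` are hypothetical helper names, and the boxes are made up. The example applies the same 4-pixel localization error to a 16×16 box and to a 160×160 box.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95 (assumed protocol).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Count how many IoU thresholds a prediction clears against a truth box."""
    value = iou(pred, truth)
    return sum(value >= t for t in IOU_THRESHOLDS)


# Same 4-pixel horizontal shift, two very different box sizes (made-up boxes).
small_iou = iou([4, 0, 16, 16], [0, 0, 16, 16])
large_iou = iou([4, 0, 160, 160], [0, 0, 160, 160])

print(f"small box IoU: {small_iou:.2f}")
print(f"large box IoU: {large_iou:.2f}")
```

The shifted small box keeps far less overlap than the shifted large box, so it counts as a correct detection at fewer IoU thresholds. This is one general reason a dataset dense with small ships is harder to score well on.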