A Comparative Study of Modern Object Detectors for Robust Apple Detection in Orchard Imagery

arXiv cs.CV / 4/14/2026


Key Points

  • The paper addresses robust single-class apple detection in orchard imagery under challenging conditions such as illumination changes, leaf clutter, dense fruit clusters, and partial occlusion.
  • It introduces a controlled and reproducible benchmark on the public AppleBBCH81 dataset with one fixed train/validation/test split and a unified evaluation protocol across six detectors (YOLOv10n, YOLO11n, RT-DETR-L, Faster R-CNN, FCOS, and SSDLite320).
  • Using COCO-style metrics (mAP@0.5 and mAP@0.5:0.95), YOLO11n performs best on strict localization on the validation split (mAP@0.5:0.95 = 0.6065; mAP@0.5 = 0.9620).
  • The study also shows that threshold-dependent behavior matters for deployment: at a low-confidence operating point (confidence >= 0.05), YOLOv10n achieves the highest F1-score, while RT-DETR-L reaches very high recall but low precision because it produces many false positives at low confidence.
  • Overall, the results suggest that detectors should be selected based not only on localization accuracy but also on threshold robustness, matched to downstream requirements such as counting, yield prediction, or robotic harvesting.
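
The fixed train/validation/test split mentioned in the key points can be illustrated with a minimal sketch. The summary does not state how the split was built or what its ratios are, so the function name `deterministic_split`, the seed, and the 70/15/15 proportions below are all hypothetical. The point is only that sorting before a seeded shuffle yields the same partition on every run, regardless of the order in which files are listed.

```python
import random
from typing import Dict, List, Sequence


def deterministic_split(
    image_ids: Sequence[str],
    seed: int = 0,            # hypothetical seed, not taken from the paper
    val_frac: float = 0.15,   # hypothetical ratio, not taken from the paper
    test_frac: float = 0.15,  # hypothetical ratio, not taken from the paper
) -> Dict[str, List[str]]:
    """Partition image ids into train/val/test reproducibly."""
    # Sort first so the result does not depend on directory listing order.
    ids = sorted(image_ids)
    # A private Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * val_frac)
    n_test = int(len(ids) * test_frac)
    return {
        "val": ids[:n_val],
        "test": ids[n_val:n_val + n_test],
        "train": ids[n_val + n_test:],
    }


if __name__ == "__main__":
    names = [f"img_{i:03d}.jpg" for i in range(100)]
    split = deterministic_split(names)
    print({k: len(v) for k, v in split.items()})
```

Writing the resulting id lists to disk once and reusing them for every detector is what keeps a comparison of this kind on identical data.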

Abstract

Accurate apple detection in orchard images is important for yield prediction, fruit counting, robotic harvesting, and crop monitoring. However, changing illumination, leaf clutter, dense fruit clusters, and partial occlusion make detection difficult. To provide a fair and reproducible comparison, this study establishes a controlled benchmark for single-class apple detection on the public AppleBBCH81 dataset using one deterministic train, validation, and test split and a unified evaluation protocol across six representative detectors: YOLOv10n, YOLO11n, RT-DETR-L, Faster R-CNN (ResNet50-FPN), FCOS (ResNet50-FPN), and SSDLite320 (MobileNetV3-Large). Performance is evaluated primarily using COCO-style mAP@0.5 and mAP@0.5:0.95, and threshold-dependent behavior is further analyzed using precision-recall curves and fixed-threshold precision, recall, and F1-score at IoU = 0.5. On the validation split, YOLO11n achieves the best strict localization performance with mAP@0.5:0.95 = 0.6065 and mAP@0.5 = 0.9620, followed closely by RT-DETR-L and YOLOv10n. At a fixed operating point with confidence >= 0.05, YOLOv10n attains the highest F1-score, whereas RT-DETR-L achieves very high recall but low precision because of many false positives at low confidence. These findings show that detector selection for orchard deployment should be guided not only by localization-aware accuracy but also by threshold robustness and the requirements of the downstream task.
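
To make the fixed-threshold analysis concrete, the following is a minimal sketch of how precision, recall, and F1-score at IoU = 0.5 could be computed for one image at a given confidence cutoff. It is not the paper's evaluation code: the greedy highest-confidence-first matching, the helper names `iou` and `prf_at_threshold`, and the toy boxes are assumptions made for illustration. A dataset-level figure would accumulate the true-positive, false-positive, and false-negative counts over all images before dividing.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def prf_at_threshold(
    preds: List[Tuple[Box, float]],  # (box, confidence) pairs
    gts: List[Box],
    conf_thr: float = 0.05,
    iou_thr: float = 0.5,
) -> Tuple[float, float, float]:
    """Precision, recall, F1 for one image at a fixed confidence threshold."""
    # Keep detections at or above the cutoff, most confident first.
    kept = sorted((p for p in preds if p[1] >= conf_thr),
                  key=lambda p: p[1], reverse=True)
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for box, _ in kept:
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            v = iou(box, gt)
            if v > best_iou:
                best_iou, best_j = v, j
        # Each ground-truth box can be matched by at most one detection.
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(kept) - tp
    fn = len(gts) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [((0, 0, 10, 10), 0.90), ((21, 21, 31, 31), 0.60),
             ((50, 50, 60, 60), 0.06), ((70, 70, 80, 80), 0.01)]
    for thr in (0.05, 0.5):
        print(thr, prf_at_threshold(preds, gts, conf_thr=thr))
```

In this toy case the 0.05 cutoff admits one spurious low-confidence box, so precision falls to 2/3 while recall stays at 1.0. This is the same kind of trade-off the abstract reports for RT-DETR-L at low confidence.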