AI Navigate

COTONET: A custom cotton detection algorithm based on YOLO11 for stage of growth cotton boll detection

arXiv cs.CV / 3/13/2026

📰 News · Tools & Practical Usage · Models & Research

Key Points

  • COTONET is a custom YOLO11-based cotton boll detector enhanced with attention mechanisms to recognize bolls across different growth stages.
  • The architecture replaces standard convolutions with Squeeze-and-Excitation blocks, introduces an attention-enabled backbone, uses CARAFE for upsampling, and incorporates SimAM and PHAM for multi-level attention in the neck path.
  • It is designed for low-resource edge computing and mobile robotics, with 7.6M parameters and 27.8 GFLOPS.
  • The model outperforms standard YOLO baselines, achieving a mAP50 of 81.1% and a mAP50-95 of 60.6% on cotton boll detection tasks.
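The article does not include the authors' code. As a rough illustrative sketch only (not COTONET's actual implementation), the Squeeze-and-Excitation idea mentioned above — gate each channel by a weight learned from its global statistics — can be written in a few lines of NumPy. The weight matrices `w1` and `w2` stand in for the two fully connected layers of an SE block and are assumed inputs here:

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Illustrative Squeeze-and-Excitation channel gate.

    x  : (C, H, W) feature map
    w1 : (C // r, C) reduction weights (r = reduction ratio)
    w2 : (C, C // r) expansion weights
    """
    z = x.mean(axis=(1, 2))            # squeeze: global average pool -> (C,)
    s = np.maximum(w1 @ z, 0)          # reduce: FC + ReLU -> (C // r,)
    s = 1 / (1 + np.exp(-(w2 @ s)))    # expand: FC + sigmoid -> (C,) gates in (0, 1)
    return x * s[:, None, None]        # excite: rescale each channel by its gate
```

In a real network the two projections are trainable layers; the point of the block is that channel importance is recalibrated from global context before features flow onward.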

Abstract

Cotton harvesting is a critical phase in which cotton capsules are physically manipulated, which can lead to fibre degradation. To maintain the highest quality, harvesting methods must emulate delicate manual grasping to preserve cotton's intrinsic properties. Automating this process requires systems capable of recognising cotton capsules across various phenological stages. To address this challenge, we propose COTONET, an enhanced custom YOLO11 model tailored with attention mechanisms to improve the detection of difficult instances. The architecture incorporates gradients in non-learnable operations to enhance shape and feature extraction. Key architectural modifications include: the replacement of convolutional blocks with Squeeze-and-Excitation blocks, a redesigned backbone integrating attention mechanisms, and the substitution of standard upsampling operations with Content-Aware ReAssembly of FEatures (CARAFE). Additionally, we integrate Simple Attention Modules (SimAM) for primary feature aggregation and Parallel Hybrid Attention Mechanisms (PHAM) for channel-wise, spatial-wise and coordinate-wise attention in the downward neck path. This configuration offers increased flexibility and robustness for interpreting the complexity of cotton crop growth. COTONET aligns with small-to-medium YOLO models, utilizing 7.6M parameters and 27.8 GFLOPS, making it suitable for low-resource edge computing and mobile robotics. COTONET outperforms the standard YOLO baselines, achieving a mAP50 of 81.1% and a mAP50-95 of 60.6%.
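SimAM, one of the attention modules named in the abstract, is notable for being parameter-free: it derives a per-pixel attention weight from an energy function over each channel's statistics, so it adds no learnable weights. A minimal NumPy sketch of the published SimAM formulation (not the COTONET code itself; `lam` is the standard regularizer, typically 1e-4) looks like:

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention over a (C, H, W) feature map."""
    c, h, w = x.shape
    n = h * w - 1                               # number of "other" pixels per channel
    mu = x.mean(axis=(1, 2), keepdims=True)     # per-channel mean -> (C, 1, 1)
    d = (x - mu) ** 2                           # squared deviation of each pixel
    v = d.sum(axis=(1, 2), keepdims=True) / n   # per-channel variance estimate
    e_inv = d / (4 * (v + lam)) + 0.5           # inverse energy: distinctive pixels score high
    return x * (1 / (1 + np.exp(-e_inv)))       # sigmoid gate in (0, 1), applied pointwise
```

Because the gate depends only on how much a pixel stands out from its channel's mean, SimAM emphasizes distinctive regions (such as a bright boll against foliage) at zero parameter cost, which fits the paper's 7.6M-parameter edge-computing budget.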