DenseSwinV2: Channel Attentive Dual Branch CNN Transformer Learning for Cassava Leaf Disease Classification
arXiv cs.AI / 3/30/2026
Key Points
- The paper introduces DenseSwinV2, a hybrid dual-branch CNN–Transformer framework for cassava leaf disease classification that pairs DenseNet-style dense local feature learning with global contextual modeling from a customized Swin Transformer V2.
- Shifted-window self-attention captures the long-range dependencies needed to distinguish visually similar lesions, addressing challenges such as occlusion, noise, and complex backgrounds.
- Independent channel-squeeze attention modules are applied to each stream to amplify discriminative disease-related responses while suppressing redundant or background activations.
- On a public cassava dataset with 31,000 images across five conditions (including normal), DenseSwinV2 reports 98.02% classification accuracy and an F1 score of 97.81%, outperforming established CNN and transformer baselines.
- The results suggest the approach is robust and practical for field-level agricultural diagnosis where image quality is variable.
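The paper does not detail the internals of its channel-squeeze attention modules, but the description (amplifying discriminative channels while suppressing background activations) matches the well-known squeeze-and-excitation pattern. Below is a minimal NumPy sketch of that pattern applied to one branch's feature map; the function name, weight shapes, and reduction ratio `r` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def channel_squeeze_attention(feat, w1, w2):
    """Squeeze-and-excitation style channel attention (illustrative sketch;
    the paper's exact module may differ).

    feat: (C, H, W) feature map from one branch
    w1:   (C // r, C) weights of the squeeze (bottleneck) FC layer
    w2:   (C, C // r) weights of the excitation FC layer
    """
    # Squeeze: global average pooling over the spatial dims -> (C,)
    s = feat.mean(axis=(1, 2))
    # Excitation: bottleneck MLP, ReLU then sigmoid, yielding per-channel gates
    z = np.maximum(w1 @ s, 0.0)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ z)))  # values in (0, 1), shape (C,)
    # Reweight channels: boost disease-relevant responses, damp the rest
    return feat * gate[:, None, None]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
out = channel_squeeze_attention(feat, w1, w2)
print(out.shape)  # (8, 4, 4)
```

In the dual-branch design described above, one such module would gate the CNN stream and a second, independent module would gate the transformer stream before their features are combined.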