Tiny-ViT: A Compact Vision Transformer for Efficient and Explainable Potato Leaf Disease Classification

arXiv cs.CV / 3/31/2026


Key Points

  • The paper proposes “Tiny-ViT,” a compact Vision Transformer designed for efficient and explainable potato leaf disease classification in resource-limited settings.
  • It targets three classes—Early Blight, Late Blight, and Healthy leaves—using preprocessing steps including resizing, CLAHE, and Gaussian blur to improve image quality.
  • Reported performance is extremely high, with test accuracy of 99.85% and mean cross-validation accuracy of 99.82%, outperforming baseline models such as DeiT Small, Swin Tiny, and MobileViT XS.
  • The authors cite a Matthews Correlation Coefficient (MCC) of 0.9990 and a narrow confidence interval of [0.9980, 0.9995] as evidence of reliability and generalization.
  • Explainability is enhanced via Grad-CAM, which highlights diseased regions, and the approach is positioned as suitable for real-time inference due to low computational cost.
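
To make the MCC figure in the key points concrete, here is a minimal sketch of the multiclass Matthews Correlation Coefficient computed from a confusion matrix. The function name `multiclass_mcc` and the example matrix are illustrative assumptions, not the paper's code or data; the formula is the standard multiclass generalization of MCC.

```python
import math


def multiclass_mcc(confusion):
    """Multiclass MCC from a square confusion matrix.

    confusion[i][j] = number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)            # all samples
    correct = sum(confusion[k][k] for k in range(n))       # diagonal
    true_counts = [sum(row) for row in confusion]          # row sums
    pred_counts = [sum(confusion[i][k] for i in range(n))  # column sums
                   for k in range(n)]

    numerator = correct * total - sum(
        p * t for p, t in zip(pred_counts, true_counts))
    denominator = math.sqrt(
        (total * total - sum(p * p for p in pred_counts))
        * (total * total - sum(t * t for t in true_counts)))
    # Convention: return 0.0 when the denominator vanishes
    # (e.g. every sample is predicted as the same class).
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical 3-class matrix (Early Blight, Late Blight, Healthy);
    # these counts are made up for illustration only.
    example = [
        [100, 0, 0],
        [1, 99, 0],
        [0, 0, 100],
    ]
    print(f"MCC = {multiclass_mcc(example):.4f}")
```

Unlike plain accuracy, MCC uses the full confusion matrix, so it stays informative when the classes are unevenly represented; a value near 1 means predictions agree with the labels across all three classes.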

Abstract

Early and precise identification of plant diseases, especially in potato crops, is important for keeping crops healthy and maximizing yield. Potato leaf diseases such as Early Blight and Late Blight pose significant challenges to farmers, often resulting in yield losses and increased pesticide use. Traditional detection methods are time-consuming and subject to human error, which is why automated and efficient methods are required. This paper introduces Tiny-ViT, a small and effective Vision Transformer (ViT) for potato leaf disease classification, developed for use in resource-limited systems. The model is tested on a dataset of three classes, namely Early Blight, Late Blight, and Healthy leaves, and the preprocessing procedure includes resizing, CLAHE, and Gaussian blur to improve image quality. Tiny-ViT achieves a test accuracy of 99.85% and a mean cross-validation (CV) accuracy of 99.82%, which is better than baseline models such as DeiT Small, Swin Tiny, and MobileViT XS. In addition, the model has a Matthews Correlation Coefficient (MCC) of 0.9990 and a narrow confidence interval (CI) of [0.9980, 0.9995], which indicates high reliability and generalization. Training time and test inference time are competitive, and the model has a low computational cost, making it suitable for real-time applications. Moreover, the interpretability of the model is improved with Grad-CAM, which identifies diseased areas. Altogether, the proposed Tiny-ViT is a robust, efficient, and explainable solution to the problem of plant disease classification.
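
The abstract reports a confidence interval but does not say how it was computed or which metric it brackets. As one common option, the sketch below estimates a percentile bootstrap interval for accuracy from per-sample correctness flags. The function name `bootstrap_accuracy_ci`, its parameters, and the example data are assumptions for illustration, not the authors' procedure.

```python
import random


def bootstrap_accuracy_ci(correct_flags, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for accuracy.

    correct_flags: sequence of 1 (correct prediction) or 0 (incorrect).
    Returns (lower, upper) bounds of the (1 - alpha) interval.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(correct_flags)
    accuracies = []
    for _ in range(n_resamples):
        # Resample the test set with replacement and recompute accuracy.
        sample = rng.choices(correct_flags, k=n)
        accuracies.append(sum(sample) / n)
    accuracies.sort()
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return accuracies[lower_idx], accuracies[upper_idx]


if __name__ == "__main__":
    # Hypothetical test set: 2000 predictions, 3 of them wrong.
    flags = [1] * 1997 + [0] * 3
    low, high = bootstrap_accuracy_ci(flags)
    print(f"95% CI for accuracy: [{low:.4f}, {high:.4f}]")
```

The width of such an interval depends on the test set size and the number of errors, so a narrow interval at very high accuracy says the estimate is stable on this dataset, not that the model will transfer to other imaging conditions.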
