An Explainable Ensemble Learning Framework for Crop Classification with Optimized Feature Pyramids and Deep Networks

arXiv cs.LG / 3/27/2026

Key Points

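  • The paper proposes an explainable ensemble learning framework that combines optimized feature pyramids, deep networks, self-attention mechanisms, and residual networks for crop classification and suitability prediction from soil properties (pH, nitrogen, potassium, etc.) and climate variables (temperature, rainfall, etc.).
  • Using Ethiopian agricultural data (3,867 instances, 29 features) together with NASA data, it applies preprocessing that includes label encoding, IQR-based outlier removal, normalization with StandardScaler, and SMOTE to correct class imbalance.
  • It compares several models, including Logistic Regression, Random Forest, and Gradient Boosting, as well as a "Relative Error Support Vector Machine", and tunes hyperparameters with Grid Search and cross-validation.
  • The proposed "Final Ensemble" (a meta-ensemble) outperforms the individual models, with a reported accuracy of 98.80%, and SHAP and permutation importance are used to visualize important features such as soil pH, nitrogen, and zinc to support decision-making.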
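
Two of the preprocessing steps named above, IQR-based outlier removal and StandardScaler-style scaling, can be illustrated with a minimal standard-library sketch. This is not the authors' code. The function names, the 1.5 x IQR fence, the choice to drop whole rows rather than clip values, and the toy values are all assumptions made for illustration. Label encoding and SMOTE are left out.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def remove_outlier_rows(rows, k=1.5):
    """Drop any row with at least one feature outside its column's fences."""
    columns = list(zip(*rows))
    bounds = [iqr_bounds(col, k) for col in columns]
    return [
        row for row in rows
        if all(lo <= x <= hi for x, (lo, hi) in zip(row, bounds))
    ]

def standardize(rows):
    """Scale each column to zero mean and unit variance (population std)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical toy data: columns are (soil pH, nitrogen); values are invented.
samples = [
    [6.1, 0.20],
    [6.4, 0.25],
    [5.9, 0.22],
    [6.3, 0.24],
    [6.0, 0.21],
    [13.5, 0.23],  # implausible pH, expected to fall outside the fences
]

cleaned = remove_outlier_rows(samples)
scaled = standardize(cleaned)
```

In a real pipeline the fences, means, and standard deviations would normally be computed on the training split only and then reused on the test split, so that test data does not leak into the preprocessing statistics.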

Abstract

Agriculture is increasingly challenged by climate change, soil degradation, and resource depletion, and hence requires advanced data-driven solutions for crop classification and recommendation. This work presents an explainable ensemble learning paradigm that fuses optimized feature pyramids, deep networks, self-attention mechanisms, and residual networks to improve crop suitability predictions based on soil characteristics (e.g., pH, nitrogen, potassium) and climatic conditions (e.g., temperature, rainfall). Using a dataset of 3,867 instances and 29 features from the Ethiopian Agricultural Transformation Agency and NASA, the paradigm applies preprocessing methods such as label encoding, outlier removal using IQR, normalization with StandardScaler, and SMOTE for class balancing. A range of machine learning models, including Logistic Regression, K-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forest, Gradient Boosting, and a new Relative Error Support Vector Machine, are compared, with hyperparameters tuned through Grid Search and cross-validation. The proposed "Final Ensemble" meta-ensemble achieves 98.80% on accuracy, precision, recall, and F1-score, outperforming individual models such as K-Nearest Neighbors (95.56% accuracy). Explainable AI methods, such as SHAP and permutation importance, offer actionable insights, highlighting critical features such as soil pH, nitrogen, and zinc. The paradigm addresses the gap between intricate ML models and actionable agricultural decision-making, fostering sustainability and trust in AI-powered recommendations.
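
Permutation importance, one of the explainability methods mentioned in the abstract, measures how much a model's score drops when a single feature column is shuffled. The sketch below shows the general idea using only the standard library. It is not the paper's implementation: `toy_model`, the class labels `crop_a` and `crop_b`, the 6.5 pH threshold, and the data values are hypothetical stand-ins.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which model(row) equals the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other features intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical stand-in classifier: it looks only at feature 0 (soil pH)
# and ignores feature 1 (nitrogen). Labels and threshold are invented.
def toy_model(row):
    return "crop_a" if row[0] < 6.5 else "crop_b"

X = [[5.8, 0.30], [6.0, 0.10], [6.2, 0.25], [6.9, 0.15], [7.1, 0.28], [7.4, 0.12]]
y = [toy_model(row) for row in X]

scores = permutation_importance(toy_model, X, y)
```

Because the toy classifier never reads the second feature, shuffling that column cannot change any prediction and its importance is exactly zero, while shuffling the pH column lowers accuracy. Rankings of features such as soil pH, nitrogen, and zinc come from applying the same comparison to a trained model.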