A Benchmark of Classical and Deep Learning Models for Agricultural Commodity Price Forecasting on A Novel Bangladeshi Market Price Dataset

arXiv cs.LG / 4/9/2026

Key Points

  • The paper introduces AgriPriceBD, a newly released Bangladesh commodity benchmark dataset containing 1,779 daily retail mid-prices for five commodities (July 2020–June 2025) digitized from government reports using an LLM-assisted pipeline.
  • It benchmarks seven short-term forecasting methods, ranging from classical approaches (persistence, SARIMA, Prophet) to deep learning models (BiLSTM, Transformer, Time2Vec-Transformer, Informer).
  • Results show commodity price predictability is highly heterogeneous: naive persistence works best for near-random-walk commodities, while Prophet underperforms due to step-like price dynamics violating its smoothness assumptions.
  • Time2Vec temporal encoding provides no significant improvement over fixed sinusoidal encodings and can catastrophically worsen performance for green chilli (+146.1% MAE, p<0.001).
  • Informer’s sparse-attention Transformer yields erratic forecasts with variance up to 50x the ground truth, suggesting such models need substantially larger training sets than small agricultural datasets provide.
  • The authors publicly release code, data, and models for replication.
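
To make the persistence baseline and the MAE comparisons in the points above concrete, here is a minimal sketch. It is not the authors' code: the function names, the synthetic step-like price series, and the one-step horizon are all assumptions chosen for illustration, and the reference model behind a figure such as "+146.1% MAE" is whatever the paper compares against.

```python
import numpy as np

def persistence_forecast(prices, horizon=1):
    """Naive persistence: the forecast for day t is the price seen `horizon` days earlier."""
    prices = np.asarray(prices, dtype=float)
    return prices[:-horizon]

def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def relative_mae_change(mae_model, mae_reference):
    """Percentage change in MAE of a model relative to a reference model."""
    return 100.0 * (mae_model - mae_reference) / mae_reference

# Synthetic, step-like series (hypothetical values, not taken from AgriPriceBD).
prices = np.array([100, 100, 100, 120, 120, 120, 120, 90, 90, 90], dtype=float)
horizon = 1

y_true = prices[horizon:]                        # targets aligned with the forecasts
y_pred = persistence_forecast(prices, horizon)   # yesterday's price as today's forecast
baseline_mae = mae(y_true, y_pred)

print(f"persistence MAE: {baseline_mae:.3f}")
```

On a series that stays flat between occasional jumps, persistence is wrong only on the days the price steps, which is one intuition for why it is hard to beat on near-random-walk commodities.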

Abstract

Accurate short-term forecasting of agricultural commodity prices is critical for food security planning and smallholder income stabilisation in developing economies, yet machine-learning-ready datasets for this purpose remain scarce in South Asia. This paper makes two contributions. First, we introduce AgriPriceBD, a benchmark dataset of 1,779 daily retail mid-prices for five Bangladeshi commodities - garlic, chickpea, green chilli, cucumber, and sweet pumpkin - spanning July 2020 to June 2025, extracted from government reports via an LLM-assisted digitisation pipeline. Second, we evaluate seven forecasting approaches spanning classical models - naïve persistence, SARIMA, and Prophet - and deep learning architectures - BiLSTM, Transformer, Time2Vec-enhanced Transformer, and Informer - with Diebold-Mariano statistical significance tests. Commodity price forecastability is fundamentally heterogeneous: naïve persistence dominates on near-random-walk commodities. Time2Vec temporal encoding provides no statistically significant advantage over fixed sinusoidal encoding and causes catastrophic degradation on green chilli (+146.1% MAE, p<0.001). Prophet fails systematically, attributable to discrete step-function price dynamics incompatible with its smooth decomposition assumptions. Informer produces erratic predictions (variance up to 50x ground-truth), confirming sparse-attention Transformers require substantially larger training sets than small agricultural datasets provide. All code, models, and data are released publicly to support replication and future forecasting research on agricultural commodity markets in Bangladesh and similar developing economies.
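
The Diebold-Mariano test mentioned in the abstract compares the forecast errors of two models on the same targets. The following is a minimal sketch of one common form of the test, assuming an absolute-error loss, a long-run variance built from autocovariances up to lag `horizon - 1`, and a two-sided normal approximation. The abstract does not state which loss or variance correction the authors use, so those choices, the function name, and the synthetic data are assumptions.

```python
import math
import numpy as np

def diebold_mariano(y_true, pred_a, pred_b, horizon=1):
    """Return (statistic, two-sided p-value) for equal predictive accuracy.

    A negative statistic means model A has the lower mean absolute error.
    Assumes the loss differential is not constant (non-zero variance).
    """
    y_true, pred_a, pred_b = (
        np.asarray(x, dtype=float) for x in (y_true, pred_a, pred_b)
    )
    # Loss differential under absolute-error loss.
    d = np.abs(y_true - pred_a) - np.abs(y_true - pred_b)
    n = d.size
    d_centered = d - d.mean()

    # Long-run variance: lag-0 autocovariance plus twice the lags 1..horizon-1.
    long_run_var = np.dot(d_centered, d_centered) / n
    for lag in range(1, horizon):
        long_run_var += 2.0 * np.dot(d_centered[lag:], d_centered[:-lag]) / n

    stat = d.mean() / math.sqrt(long_run_var / n)
    # Two-sided p-value under the standard normal approximation.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value

# Synthetic illustration: model A has much smaller errors than model B.
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=200))             # random-walk-like target
pred_a = y + rng.normal(scale=1.0, size=200)    # hypothetical accurate model
pred_b = y + rng.normal(scale=3.0, size=200)    # hypothetical noisy model

stat_ab, p_ab = diebold_mariano(y, pred_a, pred_b)
stat_ba, p_ba = diebold_mariano(y, pred_b, pred_a)
print(f"DM statistic: {stat_ab:.2f}, p-value: {p_ab:.3g}")
```

Swapping the two models flips the sign of the statistic and leaves the p-value unchanged. On short series, a small-sample correction such as Harvey-Leybourne-Newbold with a t-distribution is often preferred, and it is omitted here for brevity.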