A Comparative Study of Machine Learning Models for Hourly Forecasting of Air Temperature and Relative Humidity

arXiv cs.LG / 3/25/2026


Key Points

  • The paper compares seven machine learning approaches (XGBoost, Random Forest, SVR, MLP, Decision Tree, LSTM, and CNN-LSTM) for hourly forecasting of air temperature and relative humidity in a topographically complex urban setting (Chongqing, China).
  • It evaluates the models under a unified experimental pipeline that includes consistent preprocessing, lag-feature engineering, rolling statistical features, and time-series validation.
  • Results indicate that XGBoost delivers the best overall accuracy, achieving a test MAE of 0.302 °C for temperature and 1.271% for relative humidity, with an average R² of 0.989 across both tasks.
  • The study concludes that tree-based ensemble methods are particularly effective for structured meteorological time-series forecasting and offers guidance for building intelligent forecasting systems in mountainous cities.
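
The summary names lag features, rolling statistics, and time-series validation but does not give their settings. The sketch below shows one plausible reading of those three steps in plain Python. The lag set, the window length, the fold layout, the helper names `make_features` and `expanding_splits`, and the sample temperatures are all assumptions made for illustration, not values from the paper.

```python
from statistics import mean, pstdev


def make_features(series, lags=(1, 2, 3), window=3):
    """Build (features, target) rows from an hourly series.

    Every feature for hour t uses only values from hours before t,
    so the target never leaks into its own inputs.
    Lags and window are illustrative assumptions.
    """
    start = max(max(lags), window)
    rows, targets = [], []
    for t in range(start, len(series)):
        past = series[t - window:t]          # the `window` hours before t
        row = [series[t - k] for k in lags]  # lag features
        row.append(mean(past))               # rolling mean
        row.append(pstdev(past))             # rolling (population) std
        rows.append(row)
        targets.append(series[t])
    return rows, targets


def expanding_splits(n_samples, n_splits=3):
    """Yield (train_idx, test_idx) pairs where training always precedes testing."""
    fold = n_samples // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train_end = i * fold
        # The last fold absorbs any leftover samples.
        test_end = n_samples if i == n_splits else train_end + fold
        yield list(range(train_end)), list(range(train_end, test_end))


# Hypothetical hourly temperatures in °C, for illustration only.
temps = [20.0, 21.0, 23.0, 22.0, 24.0, 25.0, 27.0, 26.0, 28.0, 29.0, 31.0]
X, y = make_features(temps)
splits = list(expanding_splits(len(X)))
```

Unlike shuffled cross-validation, each fold here trains only on hours that come before the hours it is tested on, which is the property that makes a validation scheme suitable for forecasting.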

Abstract

Accurate short-term forecasting of air temperature and relative humidity is critical for urban management, especially in topographically complex cities such as Chongqing, China. This study compares seven machine learning models for hourly prediction using real-world open data: eXtreme Gradient Boosting (XGBoost), Random Forest, Support Vector Regression (SVR), Multi-Layer Perceptron (MLP), Decision Tree, Long Short-Term Memory (LSTM) networks, and Convolutional Neural Network-LSTM (CNN-LSTM). Based on a unified framework of data preprocessing, lag-feature construction, rolling statistical features, and time-series validation, the models are systematically evaluated in terms of predictive accuracy and robustness. The results show that XGBoost achieves the best overall performance, with a test mean absolute error (MAE) of 0.302 °C for air temperature and 1.271% for relative humidity, together with an average R² of 0.989 across the two forecasting tasks. These findings demonstrate the effectiveness of tree-based ensemble learning for structured meteorological time-series forecasting and provide practical guidance for intelligent meteorological forecasting in mountainous cities.
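
For readers who want to see how the two reported metrics are computed, the sketch below implements the standard definitions of MAE and the coefficient of determination (R²) and averages R² over two tasks, as the abstract describes. The observed and predicted values are made up for illustration and are not data from the paper, and the function names `mae` and `r2` are likewise only illustrative.

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations and predictions, for illustration only.
temp_true = [20.0, 22.0, 24.0, 26.0]   # air temperature, °C
temp_pred = [20.5, 21.5, 24.5, 25.5]
rh_true = [60.0, 70.0, 80.0, 90.0]     # relative humidity, %
rh_pred = [62.0, 68.0, 82.0, 88.0]

temp_mae = mae(temp_true, temp_pred)
rh_mae = mae(rh_true, rh_pred)
# Average R² across the two forecasting tasks.
avg_r2 = (r2(temp_true, temp_pred) + r2(rh_true, rh_pred)) / 2
```

MAE keeps the units of the target, which is why the temperature error is reported in °C and the humidity error in percentage points, while R² is unitless and can therefore be averaged across the two tasks.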