
A Grid-Based Framework for E-Scooter Demand Representation and Temporal Input Design for Deep Learning: Evidence from Austin, Texas

arXiv cs.CV / 3/17/2026


Key Points

  • The paper presents a reproducible data-processing pipeline that converts Austin e-scooter trip records into hourly grid-based demand images for next-hour and next-24-hour forecasting.
  • It proposes a statistically grounded method to design temporal input structures using correlation- and error-based selection, supported by ablation studies and Holm-corrected non-parametric tests.
  • The optimized temporal design captures short-term persistence and daily/weekly cycles. Compared with adjacent-hour and fixed-period baselines, it reduces MSE by up to 37% for next-hour and 35% for next-24-hour predictions.
  • A global activity mask and Census Tract mapping focus evaluation on historically active areas, promoting consistent spatial learning without bias from inactive regions.
  • The study underscores the importance of principled dataset construction and validated temporal inputs for spatiotemporal micromobility demand prediction, with implications for ML research and urban mobility applications.
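
The gridding and masking steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `(timestamp, row, col)` trip format, and the rule "a cell is active if it ever saw demand" are assumptions made for illustration, and the paper's filtering and Census Tract mapping steps are omitted.

```python
from datetime import datetime


def build_demand_images(trips, n_rows, n_cols):
    """Aggregate (timestamp, row, col) pickups into hourly count grids.

    Returns a dict mapping each hour (timestamp truncated to the hour)
    to an n_rows x n_cols nested list of counts. Hours without any trip
    are simply absent from the dict in this sketch.
    """
    images = {}
    for ts, row, col in trips:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        if hour not in images:
            images[hour] = [[0] * n_cols for _ in range(n_rows)]
        images[hour][row][col] += 1
    return images


def global_activity_mask(images, n_rows, n_cols):
    """Assumed rule: a cell is active if it had demand in any hour."""
    mask = [[False] * n_cols for _ in range(n_rows)]
    for grid in images.values():
        for r in range(n_rows):
            for c in range(n_cols):
                if grid[r][c] > 0:
                    mask[r][c] = True
    return mask


def masked_mse(pred, target, mask):
    """Mean squared error computed over active cells only."""
    errors = [
        (pred[r][c] - target[r][c]) ** 2
        for r in range(len(mask))
        for c in range(len(mask[0]))
        if mask[r][c]
    ]
    return sum(errors) / len(errors)


# Toy example: three pickups on a 2 x 2 grid (hypothetical data).
trips = [
    (datetime(2019, 1, 1, 8, 5), 0, 0),
    (datetime(2019, 1, 1, 8, 40), 0, 0),
    (datetime(2019, 1, 1, 9, 10), 1, 1),
]
images = build_demand_images(trips, n_rows=2, n_cols=2)
mask = global_activity_mask(images, n_rows=2, n_cols=2)
score = masked_mse([[1, 5], [5, 0]], images[datetime(2019, 1, 1, 8)], mask)
```

In the toy example, the large errors in the two never-active cells do not affect `score`, which is the point of restricting evaluation to historically active areas.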

Abstract

Despite progress in deep learning for shared micromobility demand prediction, the systematic design and statistical validation of temporal input structures remain underexplored. Temporal features are often selected heuristically, even though historical demand strongly affects model performance and generalizability. This paper introduces a reproducible data-processing pipeline and a statistically grounded method for designing temporal input structures for image-to-image demand prediction. Using large-scale e-scooter data from Austin, Texas, we build a grid-based spatiotemporal dataset by converting trip records into hourly pickup and dropoff demand images. The pipeline includes trip filtering, mapping Census Tracts to spatial locations, grid construction, demand aggregation, and creation of a global activity mask that limits evaluation to historically active areas. This representation supports consistent spatial learning while preserving demand patterns. We then introduce a combined correlation- and error-based procedure to identify informative historical inputs. Optimal temporal depth is selected through an ablation study using a baseline U-Net model with paired non-parametric tests and Holm correction. The resulting temporal structures capture short-term persistence as well as daily and weekly cycles. Compared with adjacent-hour and fixed-period baselines, the proposed design reduces mean squared error by up to 37 percent for next-hour prediction and 35 percent for next-24-hour prediction. These results highlight the value of principled dataset construction and statistically validated temporal input design for spatiotemporal micromobility demand prediction.
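
Two ideas from the abstract can be sketched in a few lines: assembling model inputs from selected historical lags, and applying the Holm correction to p-values from paired tests. The sketch below is an illustration under stated assumptions, not the authors' code. The lag set `[1, 2, 24, 168]` is a hypothetical example of short-term, daily, and weekly lags (the abstract does not list the selected lags), and the p-values are made up, standing in for the output of whatever paired non-parametric test is used.

```python
def stack_lagged_inputs(series, lags, t):
    """Return the frames at t - lag for each lag, in the order given.

    `series` is any sequence of hourly demand frames indexed by hour.
    """
    if t - max(lags) < 0:
        raise ValueError("not enough history for the largest lag")
    return [series[t - lag] for lag in lags]


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order.

    The k-th smallest p-value (k = 1..m) is multiplied by (m - k + 1),
    capped at 1, and made monotone with a running maximum.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical lags: previous two hours, same hour yesterday, same hour last week.
example_lags = [1, 2, 24, 168]
frames = list(range(200))  # stand-in for 200 hourly demand images
inputs = stack_lagged_inputs(frames, example_lags, t=170)

# Made-up raw p-values from three pairwise comparisons.
raw_p = [0.01, 0.04, 0.03]
adj_p = holm_adjust(raw_p)
```

A comparison is then declared significant when its adjusted p-value falls below the chosen level (for example 0.05), which controls the family-wise error rate across all comparisons in the ablation.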