Similarity-Based Bike Station Expansion via Hybrid Denoising Autoencoders

arXiv cs.LG / 4/20/2026


Key Points

  • The study proposes a data-driven framework for expanding urban bike-sharing stations by learning from existing, operationally “desirable” stations rather than relying on explicit demand modeling.
  • It introduces a hybrid denoising autoencoder (HDAE) that learns compact latent embeddings from multi-source grid features (socio-demographics, built environment, and transport networks), with a supervised classification head to regularize the embedding space.
  • Expansion candidates are chosen using a greedy allocation approach with spatial constraints, where candidates are prioritized by similarity in the learned latent space to existing stations.
  • Experiments on Trondheim’s bike-sharing network show that HDAE embeddings produce more spatially coherent clusters and allocation patterns than the raw features do, and sensitivity analyses across similarity methods and distance metrics support the robustness of this result.
  • To improve recommendation reliability, the authors use a consensus procedure across multiple model parametrizations, extracting 32 high-confidence extension zones where all parametrizations agree, and argue the method generalizes to other location-allocation problems.
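
The greedy, similarity-driven allocation described in the key points can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_allocate`, the use of cosine similarity to the closest existing station, and a single minimum-spacing rule as the spatial constraint are all assumptions made for illustration, and the embeddings and coordinates are toy values.

```python
import numpy as np

def greedy_allocate(cand_emb, cand_xy, station_emb, station_xy, n_new, min_dist):
    """Pick up to n_new candidate cells, most similar first, while keeping
    every pick at least min_dist away from stations and earlier picks.
    (Hypothetical helper; scoring and constraint are assumed, not from the paper.)"""
    def unit(a):
        # Row-normalise so the dot product below is cosine similarity.
        norms = np.linalg.norm(a, axis=1, keepdims=True)
        return a / np.maximum(norms, 1e-12)

    # Score each candidate by its best cosine similarity to any existing station.
    sims = unit(cand_emb) @ unit(station_emb).T
    score = sims.max(axis=1)

    occupied = np.asarray(station_xy, dtype=float).copy()
    chosen = []
    for idx in np.argsort(-score):          # highest score first
        if len(chosen) == n_new:
            break
        dists = np.linalg.norm(occupied - cand_xy[idx], axis=1)
        if dists.min() >= min_dist:         # spatial constraint
            chosen.append(int(idx))
            occupied = np.vstack([occupied, cand_xy[idx]])
    return chosen

# Toy data: one existing station and four candidate grid cells.
station_emb = np.array([[1.0, 0.0]])
station_xy = np.array([[0.0, 0.0]])
cand_emb = np.array([[1.0, 0.1], [0.9, 0.1], [0.0, 1.0], [1.0, 0.0]])
cand_xy = np.array([[1.0, 0.0], [1.2, 0.0], [5.0, 5.0], [0.1, 0.0]])

chosen = greedy_allocate(cand_emb, cand_xy, station_emb, station_xy,
                         n_new=2, min_dist=0.5)
print(chosen)  # [0, 2]
```

In the toy run, candidate 3 is the most similar cell but sits too close to the existing station, and candidate 1 is blocked by the earlier pick of candidate 0, so the spacing rule rather than similarity alone determines the final selection.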

Abstract

Urban bike-sharing systems require strategic station expansion to meet growing demand. Traditional allocation approaches rely on explicit demand modelling that may not capture the urban characteristics distinguishing successful stations. This study addresses the need to exploit patterns from existing stations to inform expansion decisions, particularly in data-constrained environments. We present a data-driven framework leveraging existing stations deemed desirable by operational metrics. A hybrid denoising autoencoder (HDAE) learns compressed latent representations from multi-source grid-level features (socio-demographic, built environment, and transport network), with a supervised classification head regularising the embedding space structure. Expansion candidates are selected via greedy allocation with spatial constraints based on latent-space similarity to existing stations. Evaluation on Trondheim's bike-sharing network demonstrates that HDAE embeddings yield more spatially coherent clusters and allocation patterns than raw features. Sensitivity analyses across similarity methods and distance metrics confirm robustness. A consensus-based procedure across multiple parametrisations distils 32 high-confidence extension zones where all parametrisations agree. The results demonstrate how representation learning captures complex patterns that raw features miss, enabling evidence-based expansion planning without explicit demand modelling. The consensus procedure strengthens recommendations by requiring agreement across parametrisations, while framework configurability allows planners to incorporate operational knowledge. The methodology generalises to any location-allocation problem where existing desirable instances inform the selection of new candidates.
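
The consensus step, keeping only zones on which all parametrisations agree, can be read as a set intersection over the per-run selections. The sketch below assumes that reading. The function name `consensus_zones`, the run labels, and the grid-cell identifiers are hypothetical and do not come from the paper.

```python
def consensus_zones(selections):
    """Return the grid-cell ids selected under every parametrisation.
    `selections` is an iterable of per-run collections of cell ids."""
    sets = [set(s) for s in selections]
    if not sets:
        return set()
    return set.intersection(*sets)

# Hypothetical selections from three runs with different similarity settings.
runs = {
    "cosine_max": [3, 7, 12, 20],
    "euclidean_max": [7, 12, 20, 31],
    "cosine_mean": [7, 9, 12, 20],
}

zones = consensus_zones(runs.values())
print(sorted(zones))  # [7, 12, 20]
```

Requiring unanimous agreement is the strictest possible rule. A planner who finds it too conservative could relax it to a vote threshold, at the cost of admitting zones that are sensitive to the chosen parametrisation.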