Uncertainty-Aware Trip Purpose Inference from GPS Trajectories via POI Semantic Zones and Pareto Calibration

arXiv cs.AI / 5/5/2026


Key Points

  • The paper addresses the difficulty of inferring trip purposes for detected stops from large-scale GPS trajectories when individual ground-truth labels are unavailable and GPS/POI data are uncertain or incomplete.
  • It introduces a weakly supervised method that uses POI semantic zones combined with distance-weighted spatial likelihoods, with different inference treatments for mandatory versus non-mandatory activities.
  • The approach includes a multi-phase Pareto calibration that trades off two objectives: minimizing distributional divergence from household travel survey statistics and maximizing inference reliability, all without annotated training labels.
  • On a dataset of more than 81 million staypoints in Los Angeles, the method aligns inferred activities more closely with expected distributions, reducing the Jensen-Shannon distance (JSD) by 23% for activity type frequency, 48% for start times, and 12% for durations relative to a comparable baseline.
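
To make the calibration idea in the points above concrete, here is a minimal sketch, not the paper's implementation. It assumes the standard definition of Jensen-Shannon distance (the square root of the Jensen-Shannon divergence, here with base-2 logarithms) and a plain non-dominated filter over two objectives. The names `js_distance` and `pareto_front`, the `jsd` and `reliability` keys, and all numbers are hypothetical and chosen only for illustration.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance (base-2 logs) between two discrete distributions.

    Assumption: p and q are aligned by category and may be raw counts;
    they are normalized to probabilities first.
    """
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute 0 by convention.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def pareto_front(candidates):
    """Return candidates not dominated on (lower jsd, higher reliability)."""
    front = []
    for c in candidates:
        dominated = any(
            o["jsd"] <= c["jsd"]
            and o["reliability"] >= c["reliability"]
            and (o["jsd"] < c["jsd"] or o["reliability"] > c["reliability"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


# Hypothetical numbers, for illustration only (not from the paper).
survey_shares = [0.40, 0.25, 0.20, 0.15]   # e.g. home, work, shopping, other
inferred_counts = [450, 200, 230, 120]     # counts of inferred labels
print(js_distance(inferred_counts, survey_shares))

# Each candidate is one hypothetical parameter setting with its two scores.
candidates = [
    {"name": "a", "jsd": 0.10, "reliability": 0.60},
    {"name": "b", "jsd": 0.12, "reliability": 0.70},
    {"name": "c", "jsd": 0.15, "reliability": 0.65},
]
print([c["name"] for c in pareto_front(candidates)])
```

In this sketch a setting survives only if no other setting is at least as good on both objectives and strictly better on one. How the paper scores reliability and how its multiple phases are organized is not specified in the summary, so those parts are left out.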

Abstract

Large-scale GPS trajectory data offer rich observations of human mobility, yet assigning trip purposes to detected stops remains challenging due to the absence of individual-level ground truth, spatial uncertainty from GPS noise and incomplete point-of-interest (POI) coverage, and fundamental behavioral differences across trip purposes. We propose a weakly supervised framework integrating neighborhood-level POI semantic zones with distance-weighted spatial likelihoods, differentiated inference strategies for mandatory and non-mandatory activities, and a multi-phase Pareto optimization that jointly minimizes distributional divergence from household travel survey statistics and maximizes inference reliability without requiring annotated labels. Evaluated on over 81 million staypoints in Los Angeles, the framework reduces activity type frequency Jensen-Shannon distance (JSD) by 23%, start time JSD by 48%, and duration JSD by 12%, relative to a comparable baseline. The proposed approach provides a scalable and uncertainty-aware path from raw GPS trajectories to semantically annotated mobility data for travel demand modeling and transportation policy analysis.
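
As a rough illustration of what a distance-weighted spatial likelihood could look like, the sketch below scores each activity type by summing kernel weights over nearby POIs and then normalizing. It is an assumption-laden sketch, not the authors' formulation: the Gaussian kernel, the `bandwidth_m` value, the `activity` field on each POI, and the function names are all hypothetical, and the paper's semantic zone construction is not reproduced here.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def activity_likelihoods(stay, pois, bandwidth_m=100.0):
    """Normalized, distance-weighted score per activity type for one staypoint.

    Assumptions (hypothetical): `stay` is a (lat, lon) tuple, each POI is a
    dict with "lat", "lon", and "activity" keys, and the weight decays with
    distance following a Gaussian kernel of width `bandwidth_m`.
    """
    weights = defaultdict(float)
    for poi in pois:
        d = haversine_m(stay[0], stay[1], poi["lat"], poi["lon"])
        weights[poi["activity"]] += math.exp(-0.5 * (d / bandwidth_m) ** 2)
    total = sum(weights.values())
    if total == 0.0:
        # No POI evidence within reach: return an empty result so the
        # caller can treat the staypoint as uncertain.
        return {}
    return {activity: w / total for activity, w in weights.items()}


# Hypothetical staypoint and POIs in Los Angeles, for illustration only.
stay = (34.0500, -118.2500)
pois = [
    {"lat": 34.0500, "lon": -118.2500, "activity": "shopping"},
    {"lat": 34.0518, "lon": -118.2500, "activity": "dining"},
]
print(activity_likelihoods(stay, pois))
```

A wider bandwidth spreads probability across more distant POIs, which is one simple way to reflect GPS noise. Returning an empty dictionary when no POI contributes keeps missing coverage visible instead of forcing a label.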