Driving Style Recognition Like an Expert Using Semantic Privileged Information from Large Language Models

arXiv cs.RO / 5/6/2026


Key Points

  • The paper argues that existing driving style recognition models rely too heavily on low-level sensor features and miss the semantic reasoning humans use when judging driving behavior.
  • It proposes a framework using Semantic Privileged Information (SPI) from large language models (LLMs) to better align algorithmic classifications with human-interpretable judgments.
  • The approach introduces DriBehavGPT, an interactive LLM-based module that generates natural-language descriptions of driving behaviors; these descriptions are then encoded as text embeddings and reduced in dimensionality for use in model training.
  • SPI is injected into a Support Vector Machine Plus (SVM+) training pipeline to approximate human-like interpretation patterns, while inference remains sensor-only for efficiency.
  • Experiments on varied real-world scenarios show improved performance, with F1-score gains of 7.6% for car-following and 7.9% for lane-changing over conventional methods.
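
The encoding step in the points above (description, then embedding, then reduced-dimensional vector) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the summary names neither the embedding model nor the reduction method, so a hashed bag-of-words vector stands in for the text embedding and a Gaussian random projection stands in for dimensionality reduction. The two example descriptions, the function names hash_embed and random_projection, and the dimensions 64 and 8 are all hypothetical.

```python
import hashlib
import math
import random


def hash_embed(text, dim=64):
    """Toy stand-in for a text-embedding model (assumption):
    hashed bag-of-words counts, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def random_projection(vectors, out_dim=8, seed=0):
    """Toy stand-in for dimensionality reduction (assumption):
    project each vector with a fixed Gaussian random matrix."""
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    matrix = [
        [rng.gauss(0.0, 1.0) / math.sqrt(out_dim) for _ in range(in_dim)]
        for _ in range(out_dim)
    ]
    return [
        [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        for vec in vectors
    ]


# Hypothetical descriptions, written for illustration only;
# they are not output from DriBehavGPT.
descriptions = [
    "The driver keeps a short gap and brakes late behind the lead vehicle.",
    "The driver keeps a long gap and slows down early and smoothly.",
]

embeddings = [hash_embed(d) for d in descriptions]   # one 64-d vector per description
privileged = random_projection(embeddings)           # one 8-d vector per description
```

Each row of privileged would play the role of the privileged-information vector paired with one training sample's sensor features; at inference time no such vector is needed.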

Abstract

Existing driving style recognition systems largely depend on low-level sensor-derived features for training, neglecting the rich semantic reasoning capability inherent to human experts. This discrepancy results in a fundamental misalignment between algorithmic classifications and expert judgments. To bridge this gap, we propose a novel framework that integrates Semantic Privileged Information (SPI) derived from large language models (LLMs) to align recognition outcomes with human-interpretable reasoning. First, we introduce DriBehavGPT, an interactive LLM-based module that generates natural-language descriptions of driving behaviors. These descriptions are then encoded into machine learning-compatible representations via text embedding and dimensionality reduction. Finally, we incorporate them as privileged information into Support Vector Machine Plus (SVM+) for training, enabling the model to approximate human-like interpretation patterns. Experiments across diverse real-world driving scenarios demonstrate that our SPI-enhanced framework outperforms conventional methods, achieving F1-score improvements of 7.6% (car-following) and 7.9% (lane-changing). Importantly, SPI is exclusively used during training, while inference relies solely on sensor data, ensuring computational efficiency without sacrificing performance. These results highlight the pivotal role of semantic behavioral representations in improving recognition accuracy while advancing interpretable, human-centric driving systems.
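
For readers unfamiliar with SVM+, the standard binary formulation from the learning-using-privileged-information literature is reproduced below as a reference sketch. The abstract does not state which variant, kernel, or multi-class scheme the authors use, so the symbols here are assumptions for illustration: x_i is the sensor feature vector, x_i^* is the SPI vector for the same sample, y_i in {-1, +1} is the label, and C and gamma are regularization hyperparameters.

```latex
\begin{aligned}
\min_{w,\,b,\,w^{*},\,b^{*}} \quad
  & \frac{1}{2}\lVert w \rVert^{2}
    + \frac{\gamma}{2}\lVert w^{*} \rVert^{2}
    + C \sum_{i=1}^{n} \bigl( \langle w^{*}, x_i^{*} \rangle + b^{*} \bigr) \\
\text{s.t.} \quad
  & y_i \bigl( \langle w, x_i \rangle + b \bigr)
    \ge 1 - \bigl( \langle w^{*}, x_i^{*} \rangle + b^{*} \bigr),
    \quad i = 1, \dots, n, \\
  & \langle w^{*}, x_i^{*} \rangle + b^{*} \ge 0,
    \quad i = 1, \dots, n.
\end{aligned}
\qquad
f(x) = \operatorname{sign}\bigl( \langle w, x \rangle + b \bigr)
```

The privileged vectors x_i^* enter only through the slack term, which models how hard each training sample is. The decision function f depends on w and b alone, which is consistent with the abstract's statement that inference relies solely on sensor data.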