FedLLM: A Privacy-Preserving Federated Large Language Model for Explainable Traffic Flow Prediction

arXiv cs.LG / 4/21/2026


Key Points

  • The paper introduces FedLLM, a privacy-preserving federated framework for explainable multi-horizon short-term traffic flow prediction (15–60 minutes) aimed at real-time ITS decision-making.
  • It addresses limits of prior spatio-temporal and LLM-based methods by moving away from centralized training and incorporating structured, context-rich representations for better explainability.
  • FedLLM contributes a Composite Selection Score (CSS) to choose freeways based on structural diversity, and a domain-adapted LLM fine-tuned on structured traffic prompts (spatial, temporal, and statistical context).
  • The federated training setup enables collaboration across heterogeneous clients by exchanging only lightweight LoRA adapter parameters, reducing communication overhead and supporting learning under non-IID traffic.
  • Experiments report improved predictive accuracy versus centralized baselines while generating structured, explainable outputs, suggesting federated learning can scale privacy-aware traffic forecasting with LLM reasoning.
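The adapter-only exchange described above can be sketched as a FedAvg-style aggregation over LoRA parameters. This is a minimal illustrative sketch, not the paper's implementation; all names, shapes, and the sample-weighted averaging rule are assumptions.

```python
# Hypothetical sketch of the FedLLM communication step: clients share only
# their LoRA adapter parameters, and the server combines them with a
# sample-weighted average (FedAvg-style). Names and shapes are illustrative.

def fedavg_lora(client_adapters, client_sizes):
    """Sample-weighted average of per-client LoRA adapter tensors.

    client_adapters: list of dicts mapping adapter name -> flat list of floats
    client_sizes:    list of local sample counts used as aggregation weights
    """
    total = sum(client_sizes)
    global_adapter = {}
    for name in client_adapters[0]:
        acc = [0.0] * len(client_adapters[0][name])
        for params, size in zip(client_adapters, client_sizes):
            weight = size / total  # larger clients contribute more
            for i, value in enumerate(params[name]):
                acc[i] += weight * value
        global_adapter[name] = acc
    return global_adapter

# Two clients with non-IID local traffic contribute different adapter updates.
clients = [
    {"lora_A": [1.0, 2.0], "lora_B": [0.0, 0.0]},
    {"lora_A": [3.0, 4.0], "lora_B": [1.0, 1.0]},
]
global_adapter = fedavg_lora(clients, client_sizes=[100, 300])
print(global_adapter["lora_A"])  # → [2.5, 3.5]
```

Because only the small adapter matrices cross the network (the frozen base LLM never leaves the client), each round's payload is a tiny fraction of full-model synchronization, which is what keeps communication overhead low.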

Abstract

Traffic prediction plays a central role in intelligent transportation systems (ITS) by supporting real-time decision-making, congestion management, and long-term planning. However, many existing approaches face practical limitations. Most spatio-temporal models are trained on centralized data, rely on purely numerical representations, and offer limited explainability. Recent Large Language Model (LLM) methods improve reasoning capabilities but typically assume centralized data availability and do not fully capture the distributed and heterogeneous nature of real-world traffic systems. To address these challenges, this study proposes FedLLM (Federated LLM), a privacy-preserving and distributed framework for explainable multi-horizon short-term traffic flow prediction (15–60 minutes). The framework introduces four key contributions: (1) a Composite Selection Score (CSS) for data-driven freeway selection that captures structural diversity across traffic regions; (2) a domain-adapted LLM fine-tuned on structured traffic prompts encoding spatial, temporal, and statistical context; (3) a federated training framework that enables collaborative training across heterogeneous clients while exchanging only lightweight LoRA adapter parameters; and (4) a structured prompt representation that supports contextual reasoning and cross-region generalization. The FedLLM design allows each client to learn from local traffic patterns while contributing to a shared global model through efficient parameter exchange, reducing communication overhead and keeping data private. This setup also supports learning under non-IID traffic distributions. Experimental results show that FedLLM achieves improved predictive performance over centralized baselines while producing structured and explainable outputs. These findings highlight the potential of combining federated learning (FL) with domain-adapted LLMs for scalable, privacy-aware, and explainable traffic prediction.
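The structured prompt representation in the abstract could take a form like the sketch below, which serializes spatial, temporal, and statistical context into labeled text fields for the fine-tuned LLM. The field names, format, and helper function are assumptions for illustration, not the paper's actual prompt template.

```python
# Illustrative sketch of a structured traffic prompt with spatial, temporal,
# and statistical context sections. The schema is assumed, not taken from
# the FedLLM paper.

def build_traffic_prompt(sensor_id, freeway, timestamp, recent_flow, horizon_min):
    """Serialize one sensor's context into a labeled, multi-section prompt."""
    mean_flow = sum(recent_flow) / len(recent_flow)  # simple statistical context
    return (
        f"[Spatial] sensor={sensor_id} freeway={freeway}\n"
        f"[Temporal] time={timestamp} horizon={horizon_min}min\n"
        f"[Statistical] recent_flow={recent_flow} mean={mean_flow:.1f}\n"
        f"Task: predict traffic flow {horizon_min} minutes ahead and explain the prediction."
    )

prompt = build_traffic_prompt(
    sensor_id="S101",
    freeway="I-405",
    timestamp="2024-05-01 08:00",
    recent_flow=[320, 340, 360, 380],
    horizon_min=15,
)
print(prompt)
```

Keeping the sections explicitly labeled is one plausible way to support the cross-region generalization the paper claims: a client in a different region emits the same field structure, so the shared global adapter sees a consistent input schema regardless of where the data originated.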