Dreaming Across Towns: Semantic Rollout and Town-Adversarial Regularization for Zero-Shot Held-Out-Town Fixed-Route Driving in CARLA

arXiv cs.RO / 5/1/2026


Key Points

  • The paper investigates zero-shot generalization for an autonomous driving agent in CARLA: a fixed-route driving policy trained on Town05/Town06 is transferred to the unseen Town03/Town04, with weather held fixed and no traffic or pedestrians.
  • It builds a Dreamer-style latent world-model agent and introduces two auxiliary training losses: multi-horizon prediction of future visual-semantic embeddings during imagined rollouts, and town-adversarial regularization on a semantic projection of the recurrent latent state.
  • A causal context feature conditions the semantic rollout predictor, while the actor and critic retain the standard control feature; the policy receives no navigation or map-related inputs (the reference route is used only by the simulator for rewards and termination).
  • Experiments show that the proposed method achieves the highest mean success rate on the held-out towns among the compared Dreamer-family approaches, though safety and lane-keeping results vary by town.
  • Overall, the authors conclude that, within this bounded CARLA setting, semantic rollout supervision combined with town-adversarial regularization improves fixed-route route completion in unseen towns.
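The first auxiliary loss described above can be illustrated with a minimal sketch: predict the visual-semantic embedding at several future steps of an imagined rollout and average the prediction errors. All names, dimensions, and the linear predictor here are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): latent dim, embedding dim, horizons.
LATENT_DIM, EMB_DIM, HORIZONS = 32, 16, (1, 3, 5)

# A single linear head standing in for the semantic rollout predictor.
W = rng.normal(scale=0.1, size=(LATENT_DIM, EMB_DIM))

def semantic_rollout_loss(imagined_latents, target_embeddings):
    """Mean MSE between predicted and target semantic embeddings,
    averaged over several imagination horizons (a training-only loss)."""
    losses = []
    for h in HORIZONS:
        pred = imagined_latents[h] @ W  # predict the embedding at step h
        losses.append(np.mean((pred - target_embeddings[h]) ** 2))
    return float(np.mean(losses))

# Toy imagined rollout: one latent per step, plus matching target embeddings.
T = max(HORIZONS) + 1
latents = {t: rng.normal(size=LATENT_DIM) for t in range(T)}
targets = {t: rng.normal(size=EMB_DIM) for t in range(T)}
loss = semantic_rollout_loss(latents, targets)
print(loss >= 0.0)
```

In a real Dreamer-style agent the latents would come from rollouts of the learned dynamics model and the targets from a pretrained visual-semantic encoder; this sketch only shows the multi-horizon averaging structure of the loss.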

Abstract

Learned driving agents often degrade when deployed in unseen environments. This paper studies a deliberately bounded instance of that problem in the CARLA simulator: zero-shot transfer of a closed-loop fixed-route driving agent from Town05 and Town06 to unseen Town03 and Town04. The study isolates structural town shift by keeping weather fixed to ClearNoon and removing traffic and pedestrians. We build on a Dreamer-style latent world-model agent and add two training-only auxiliary losses: multi-horizon prediction of future visual-semantic embeddings along imagined rollouts and town-adversarial supervision on a semantic projection of the recurrent latent state. A causal context feature conditions the semantic rollout predictor, while the actor and critic retain the standard control feature. The policy receives no navigation command, route polyline, goal pose, or map input; the reference route is used only by the environment for reward, progress, success, and termination. Across the evaluated held-out towns, the proposed model achieves the highest mean success rate among the included Dreamer-family methods. Secondary safety and lane-keeping metrics are mixed across towns. These results support a bounded conclusion: in this controlled fixed-weather CARLA setting, semantic rollout supervision combined with town-adversarial regularization improves mean held-out-town route completion.
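The town-adversarial supervision in the abstract follows the familiar gradient-reversal pattern: a classifier tries to identify the training town from the semantic projection of the latent state, while the gradient flowing back into the encoder is negated so the feature becomes town-invariant. The sketch below writes the gradients out by hand with NumPy; the function names, the linear classifier, and the scale `lam` are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def town_adversarial_grads(feature, W, town_label, lam=1.0):
    """Cross-entropy town-classifier loss on a semantic feature, with the
    gradient reversed (scaled by -lam) before it reaches the encoder.
    The classifier weights W receive the ordinary, unreversed gradient."""
    logits = W @ feature
    probs = softmax(logits)
    loss = -np.log(probs[town_label] + 1e-12)
    dlogits = probs.copy()
    dlogits[town_label] -= 1.0           # d(loss)/d(logits) for cross-entropy
    grad_W = np.outer(dlogits, feature)  # classifier learns to identify towns
    grad_feature = W.T @ dlogits
    grad_encoder = -lam * grad_feature   # reversal: encoder learns to hide town
    return loss, grad_W, grad_encoder

rng = np.random.default_rng(1)
f = rng.normal(size=8)
W = rng.normal(scale=0.1, size=(2, 8))  # two training towns: Town05, Town06
loss, gW, gEnc = town_adversarial_grads(f, W, town_label=0, lam=0.5)
print(loss > 0.0)
```

The two opposing updates are the whole trick: the classifier minimizes its loss, the encoder (via the sign flip) maximizes it, pushing town identity out of the semantic projection.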