NCO4CVRP: Neural Combinatorial Optimization for the Capacitated Vehicle Routing Problem

arXiv cs.LG / 4/21/2026

📰 NewsDeveloper Stack & InfrastructureSignals & Early TrendsModels & Research

Key Points

  • The paper introduces improvements to Neural Combinatorial Optimization (NCO) inference for the Capacitated Vehicle Routing Problem (CVRP), aiming to raise solution quality and generalization.
  • It modifies the LEHD model’s Random Re-Construct (RRC) method by integrating Simulated Annealing (SA), using probabilistic acceptance to escape local optima and diversify search.
  • It upgrades POMO by adding Beam Search to systematically explore multiple high-potential solutions while preserving diversity.
  • The study compares multiple inference strategies (Softmax Sampling, Greedy, Gumbel-Softmax, Epsilon-Greedy) and evaluates instance augmentation via flipping and rotation to improve generalization.
  • Experiments across CVRP benchmarks show that SA-based RRC and Beam Search consistently reduce the optimality gap, improving applicability of NCO to real-world routing.

Abstract

Neural Combinatorial Optimization (NCO) has emerged as a powerful framework for solving combinatorial optimization problems by integrating deep learning-based models. This work focuses on improving existing inference techniques to enhance solution quality and generalization. Specifically, we modify the Random Re-Construct (RRC) approach of the Light Encoder Heavy Decoder (LEHD) model by incorporating Simulated Annealing (SA). Unlike the conventional RRC, which greedily replaces suboptimal segments, our SA-based modification introduces a probabilistic acceptance mechanism that allows the model to escape local optima and explore a more diverse solution space. Additionally, we enhance the Policy Optimization with Multiple Optima (POMO) approach by integrating Beam Search, enabling systematic exploration of multiple promising solutions while maintaining diversity in the search space. We further investigate different inference strategies, including Softmax Sampling, Greedy, Gumbel-Softmax, and Epsilon-Greedy, analyzing their impact on solution quality. Furthermore, we explore instance augmentation techniques, such as horizontal and vertical flipping and rotation-based augmentations, to improve model generalization across different CVRP instances. Our extensive experiments demonstrate that these modifications significantly reduce the optimality gap across various Capacitated Vehicle Routing Problem (CVRP) benchmarks, with Beam Search and SA-based RRC consistently yielding superior performance. By refining inference techniques and leveraging enhanced search strategies, our work contributes to the broader applicability of NCO models in real-world combinatorial optimization tasks.