Towards Safe Learning-Based Non-Linear Model Predictive Control through Recurrent Neural Network Modeling

arXiv cs.LG / 3/26/2026

Key Points

  • The paper addresses a key bottleneck in nonlinear model predictive control (NMPC): online nonlinear program solving can be too expensive for embedded hardware at high control rates and with complex models or long horizons.
  • It proposes Sequential-AMPC, a learning-based NMPC approximation using a recurrent neural network policy that generates MPC candidate control sequences by sharing parameters across the prediction horizon.
  • For deployment, the authors introduce Safe Sequential-AMPC, which wraps the learned policy in a safety-augmented online evaluation and fallback mechanism.
  • Across several benchmarks, Sequential-AMPC needs substantially fewer expert MPC rollouts than a naive feedforward policy baseline, and its candidate sequences have higher feasibility rates and improved closed-loop safety.
  • On high-dimensional systems, the method shows better learning dynamics and reaches better performance in fewer epochs, and its validation performance keeps improving steadily where the feedforward baseline can stagnate.
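
The parameter-sharing idea in the second key point can be sketched in a few lines. This is a minimal illustration, not the authors' architecture: the cell structure, the weight names (`W_in`, `W_h`, `W_u`, `W_out`), the `tanh` saturation, and the random untrained weights are all assumptions made for the example. What it shows is that one set of weights is reused at every step, so the parameter count does not grow with the horizon.

```python
import math
import random


def make_params(state_dim, hidden_dim, control_dim, seed=0):
    """Random (untrained) weights for a toy recurrent policy. Hypothetical layout."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in": mat(hidden_dim, state_dim),     # encodes the initial state
        "W_h": mat(hidden_dim, hidden_dim),     # hidden-to-hidden recurrence
        "W_u": mat(hidden_dim, control_dim),    # feeds back the previous control
        "W_out": mat(control_dim, hidden_dim),  # decodes a control from the hidden state
    }


def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rollout_controls(params, x0, horizon, u_max=1.0):
    """Generate a candidate control sequence of length `horizon`.

    The same weights are applied at every step of the loop, which is the
    parameter sharing across the prediction horizon.
    """
    h = [math.tanh(a) for a in matvec(params["W_in"], x0)]
    u = [0.0] * len(params["W_out"])
    sequence = []
    for _ in range(horizon):
        pre = [a + b for a, b in zip(matvec(params["W_h"], h), matvec(params["W_u"], u))]
        h = [math.tanh(p) for p in pre]
        # tanh keeps each control inside [-u_max, u_max] (assumed box constraint)
        u = [u_max * math.tanh(o) for o in matvec(params["W_out"], h)]
        sequence.append(u)
    return sequence


def count_params(params):
    """Total number of scalar weights; independent of the horizon."""
    return sum(len(row) for M in params.values() for row in M)
```

Calling `rollout_controls(params, x0, 10)` and `rollout_controls(params, x0, 25)` with the same `params` yields sequences of different lengths from the same weights. A feedforward policy that emits the whole sequence at once would instead need an output layer sized to the horizon.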

Abstract

The practical deployment of nonlinear model predictive control (NMPC) is often limited by online computation: solving a nonlinear program at high control rates can be expensive on embedded hardware, especially when models are complex or horizons are long. Learning-based NMPC approximations shift this computation offline but typically demand large expert datasets and costly training. We propose Sequential-AMPC, a sequential neural policy that generates MPC candidate control sequences by sharing parameters across the prediction horizon. For deployment, we wrap the policy in a safety-augmented online evaluation and fallback mechanism, yielding Safe Sequential-AMPC. Compared to a naive feedforward policy baseline across several benchmarks, Sequential-AMPC requires substantially fewer expert MPC rollouts and yields candidate sequences with higher feasibility rates and improved closed-loop safety. On high-dimensional systems, it also exhibits better learning dynamics and performance in fewer epochs while maintaining stable validation improvement where the feedforward baseline can stagnate.
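
The abstract does not spell out how the safety-augmented evaluation and fallback work, so the following is only a rough sketch of the general pattern under stated assumptions: the learned candidate is simulated through a model, checked against constraints, and replaced by a known-safe fallback sequence if the check fails. The double-integrator model, the bounds `u_max` and `pos_max`, and every function name here are hypothetical, chosen to keep the example small.

```python
def simulate(x0, controls, dt=0.1):
    """Roll a toy double integrator forward. State is (position, velocity)."""
    pos, vel = x0
    trajectory = [(pos, vel)]
    for u in controls:
        pos = pos + dt * vel
        vel = vel + dt * u
        trajectory.append((pos, vel))
    return trajectory


def is_feasible(x0, controls, u_max=1.0, pos_max=1.0):
    """Check assumed box constraints on the controls and the predicted positions."""
    if any(abs(u) > u_max for u in controls):
        return False
    return all(abs(pos) <= pos_max for pos, _ in simulate(x0, controls))


def select_sequence(x0, candidate, fallback):
    """Accept the learned candidate only if it passes the online check.

    `fallback` is assumed to be a sequence already known to be safe, for
    example one supplied by a backup controller.
    Returns the chosen sequence and whether the candidate was accepted.
    """
    if is_feasible(x0, candidate):
        return candidate, True
    return fallback, False
```

In this pattern the online cost is one forward simulation and a constraint check per control step, rather than a nonlinear program solve, which is where the computational saving would come from.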