Mitigating Data Scarcity in Spaceflight Applications for Offline Reinforcement Learning Using Physics-Informed Deep Generative Models
arXiv cs.LG / 4/6/2026
Key Points
- The paper targets the simulation-to-reality (sim-to-real) gap in reinforcement learning (RL) controllers for spaceflight, where real-world training data are extremely scarce.
- It proposes MI-VAE, a physics-informed variational autoencoder that incorporates a physics-based inductive bias by modeling the discrepancy (residual) between observed trajectories and physics-model predictions.
- Sampling from the MI-VAE's latent space yields synthetic trajectory datasets that better respect physical constraints, which are then used for offline RL training.
- In a planetary lander benchmark with limited real-world data, augmenting offline RL datasets with MI-VAE-generated samples improves RL performance and policy success rate compared with standard VAE-based augmentation.
- Overall, the work offers a scalable approach to improving autonomous controller robustness in data-constrained, physics-dominated environments like space missions.
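The augmentation idea in the bullets above can be sketched as follows: generate a synthetic next state as the physics model's prediction plus a residual decoded from a sampled latent vector. This is a minimal illustration, not the paper's implementation; the one-dimensional lander dynamics, the decoder weights `W_dec`, and all function names are hypothetical stand-ins (a real MI-VAE decoder would be trained on observed-minus-predicted residuals).

```python
import numpy as np

rng = np.random.default_rng(0)

def physics_step(state, action, dt=0.1, g=1.62):
    """One-step point-mass lander model under lunar gravity (illustrative only).
    state = [altitude, vertical velocity]; action = thrust acceleration."""
    pos, vel = state
    acc = action - g                      # net vertical acceleration
    return np.array([pos + vel * dt, vel + acc * dt])

# Hypothetical stand-in for a trained MI-VAE decoder that maps a latent
# vector z to a dynamics residual (real-vs-physics-model discrepancy).
LATENT_DIM, STATE_DIM = 4, 2
W_dec = rng.normal(scale=0.05, size=(STATE_DIM, LATENT_DIM))  # assumed weights

def decode_residual(z):
    return W_dec @ z

def generate_synthetic_transitions(states, actions, n_latent_samples=1):
    """Augment an offline dataset: physics prediction + sampled learned residual."""
    synthetic = []
    for s, a in zip(states, actions):
        for _ in range(n_latent_samples):
            z = rng.standard_normal(LATENT_DIM)      # sample the latent space
            s_next = physics_step(s, a) + decode_residual(z)
            synthetic.append((s, a, s_next))
    return synthetic

# Usage: a few "real" lander states and thrust commands, each expanded
# into several physics-respecting synthetic transitions for offline RL.
real_states = [np.array([100.0, -5.0]), np.array([50.0, -3.0])]
real_actions = [1.5, 2.0]
aug = generate_synthetic_transitions(real_states, real_actions, n_latent_samples=3)
print(len(aug))  # → 6
```

Because every synthetic transition is anchored to the physics prediction, the generated data stay near physically plausible dynamics even when the real dataset is tiny, which is the property the paper exploits for offline RL augmentation.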
