GIFT: Global stabilisation via Intrinsic Fine Tuning

arXiv cs.LG · April 28, 2026


Key Points

  • The paper introduces GIFT (Global stabilisation via Intrinsic Fine Tuning), a general-purpose training framework that improves global stability of already strong deep reinforcement learning policies.
  • GIFT directly optimizes global stability by using a custom reward function, aiming to reduce chaotic state dynamics and the high sensitivity to initial conditions common in Deep RL.
  • Experiments show that applying GIFT increases stability of the control interaction while keeping task performance comparable to the original policies.
  • The work targets a key limitation of Deep RL for real-world control, where stability and performance guarantees are often necessary rather than only average task success.
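The paper does not publish its reward function here, but the idea it describes — fine-tuning a policy with an extra reward term that penalizes sensitivity to initial conditions — can be sketched in a minimal, hypothetical form. The names `stability_penalty`, `shaped_reward`, and the weight `beta` are illustrative assumptions, not the authors' actual formulation:

```python
import numpy as np

def stability_penalty(traj_a, traj_b):
    """Hypothetical chaos proxy: mean per-step state divergence between
    two rollouts started from slightly perturbed initial conditions."""
    a, b = np.asarray(traj_a, dtype=float), np.asarray(traj_b, dtype=float)
    return float(np.mean(np.linalg.norm(a - b, axis=-1)))

def shaped_reward(task_reward, traj_a, traj_b, beta=0.1):
    """Task reward minus a weighted divergence penalty (assumed form).

    With beta=0 this recovers the original objective; larger beta trades
    raw task performance for reduced sensitivity to initial conditions.
    """
    return task_reward - beta * stability_penalty(traj_a, traj_b)
```

A fine-tuning loop would then optimize `shaped_reward` starting from the pretrained policy's weights, so the agent keeps its task behaviour while being pushed toward trajectories that stay close under small initial perturbations.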

Abstract

Deep reinforcement learning policies achieve strong performance in complex continuous control environments with nonlinear contact forces. However, these policies often produce chaotic state dynamics, with trivially small changes to the initial conditions significantly impacting the long-term behaviour of the control system. This high sensitivity to initial conditions limits the application of deep RL to real-world control systems where performance and stability guarantees are often required. To address this issue, we propose Global stabilisation via Intrinsic Fine Tuning (GIFT), a general-purpose training framework which directly optimises the global stability of existing high-performing deep RL policies using a custom reward function. We demonstrate that GIFT increases the stability of the control interaction while maintaining comparable task performance, thereby improving the suitability of deep RL policies for real-world control systems.