A Schrödinger Eigenfunction Method for Long-Horizon Stochastic Optimal Control

arXiv cs.LG / 3/25/2026


Key Points

  • The paper addresses the scalability problem in high-dimensional stochastic optimal control over long horizons, where many existing methods degrade in performance and scale poorly with the horizon length T.
  • For a specific class of linearly-solvable SOC problems where the uncontrolled drift is the gradient of a potential, the Hamilton–Jacobi–Bellman equation can be reduced to a linear PDE whose operator is shown to be unitarily equivalent to a Schrödinger operator with a purely discrete spectrum.
  • This spectral connection enables long-horizon control to be characterized efficiently via the eigensystem of the PDE operator, avoiding linear-in-T scaling.
  • For symmetric linear-quadratic regulator (LQR) problems, the corresponding Schrödinger operator matches the Hamiltonian of a quantum harmonic oscillator, yielding a closed-form eigensystem that solves the symmetric LQR with an arbitrary terminal cost.
  • In more general settings, the authors propose learning the eigensystem with neural networks. They diagnose implicit reweighting issues in prior eigenfunction-learning losses, introduce a new loss to mitigate them, and report order-of-magnitude improvements in control accuracy on long-horizon benchmarks, along with a reduction in memory and runtime complexity from O(Td) to O(d).
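The core computational idea behind the O(Td) to O(d) reduction can be sketched numerically. In a linearly-solvable SOC problem, the value function is obtained from a linear backward PDE; once the governing operator is diagonalized, each eigencoefficient simply decays as exp(-λ·T), so the solution at any horizon T is a single matrix-vector expression rather than T time steps. The sketch below is a hypothetical 1D finite-difference illustration of this principle (grid size, potential, and terminal condition are all illustrative choices, not the paper's setup), using a harmonic potential so the operator is a discretized Schrödinger operator:

```python
import numpy as np

# Illustrative 1D sketch: diagonalize a finite-difference Schrodinger
# operator S = -d^2/dx^2 + V(x) once, then evaluate the backward linear
# PDE solution at ANY horizon T in closed form -- no time-stepping,
# so the per-query cost is independent of T.
n, width = 400, 10.0
x = np.linspace(-width / 2, width / 2, n)
dx = x[1] - x[0]

V = x**2  # harmonic potential (the symmetric-LQR analogue)
lap = (np.diag(np.full(n - 1, 1.0), -1)
       - 2.0 * np.eye(n)
       + np.diag(np.full(n - 1, 1.0), 1)) / dx**2
S = -lap + np.diag(V)

# One-off eigendecomposition; S is symmetric, so eigh applies.
evals, evecs = np.linalg.eigh(S)

def solve_terminal_value(h_T, T):
    """Propagate terminal data h_T backward over horizon T spectrally:
    expand in the eigenbasis, decay each mode by exp(-lambda * T)."""
    coeffs = evecs.T @ h_T
    return evecs @ (np.exp(-evals * T) * coeffs)

h_T = np.exp(-x**2)            # an arbitrary terminal condition
h_0 = solve_terminal_value(h_T, T=5.0)
```

The horizon T enters only through the elementwise `exp(-evals * T)`, which is why storage and per-query work stay O(d) in the state dimension rather than growing with T.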

Abstract

High-dimensional stochastic optimal control (SOC) becomes harder with longer planning horizons: existing methods scale linearly in the horizon T, and their performance often deteriorates exponentially. We overcome these limitations for a subclass of linearly-solvable SOC problems: those whose uncontrolled drift is the gradient of a potential. In this setting, the Hamilton–Jacobi–Bellman equation reduces to a linear PDE governed by an operator \mathcal{L}. We prove that, under the gradient-drift assumption, \mathcal{L} is unitarily equivalent to a Schrödinger operator \mathcal{S} = -\Delta + \mathcal{V} with a purely discrete spectrum, allowing the long-horizon control to be described efficiently via the eigensystem of \mathcal{L}. This connection yields two key results. First, for a symmetric linear-quadratic regulator (LQR), \mathcal{S} matches the Hamiltonian of a quantum harmonic oscillator, whose closed-form eigensystem yields an analytic solution to the symmetric LQR with an arbitrary terminal cost. Second, in a more general setting, we learn the eigensystem of \mathcal{L} using neural networks. We identify implicit reweighting issues in existing eigenfunction-learning losses that degrade performance on control tasks, and propose a novel loss function to mitigate them. We evaluate our method on several long-horizon benchmarks, achieving an order-of-magnitude improvement in control accuracy over state-of-the-art methods while reducing memory usage and runtime complexity from \mathcal{O}(Td) to \mathcal{O}(d).
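The harmonic-oscillator correspondence claimed for the symmetric LQR is a standard quantum-mechanical fact that can be sanity-checked numerically: in units where the oscillator frequency is 1, the Hermite functions φ_n(x) = H_n(x)·exp(-x²/2) are eigenfunctions of S = -Δ + x² with eigenvalues 2n + 1. The check below (an illustrative verification, not code from the paper) applies a central-difference approximation of S to the first few Hermite functions:

```python
import numpy as np
from numpy.polynomial.hermite import hermval

# Sanity check of the LQR <-> quantum harmonic oscillator link:
# phi_n(x) = H_n(x) exp(-x^2 / 2) should satisfy
#   -phi_n'' + x^2 phi_n = (2n + 1) phi_n.
x = np.linspace(-4.0, 4.0, 2001)
dx = x[1] - x[0]

def hermite_function(n, x):
    """phi_n(x) = H_n(x) exp(-x^2 / 2), unnormalized (physicists' H_n)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return hermval(x, coeffs) * np.exp(-x**2 / 2)

def schrodinger_apply(f):
    """Central-difference approximation of (-d^2/dx^2 + x^2) f
    at the interior grid points."""
    fpp = (f[:-2] - 2.0 * f[1:-1] + f[2:]) / dx**2
    return -fpp + x[1:-1] ** 2 * f[1:-1]

# Max residual of the eigenvalue equation for n = 0..3; all should be
# small (limited only by the finite-difference discretization error).
residuals = [
    np.max(np.abs(schrodinger_apply(hermite_function(n, x))
                  - (2 * n + 1) * hermite_function(n, x)[1:-1]))
    for n in range(4)
]
```

Because this eigensystem is available in closed form, the spectral expansion of the backward PDE can be written down analytically for the symmetric LQR, which is what lets the paper handle an arbitrary terminal cost in that case.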