Physics-Guided Dimension Reduction for Simulation-Free Operator Learning of Stiff Differential-Algebraic Systems

arXiv cs.LG / 4/23/2026

Key Points

  • The paper tackles two main issues in neural-surrogate operator learning for stiff DAEs: soft constraints leave algebraic residuals that blow up under stiffness, while hard constraints require data from costly stiff solvers.
  • It proposes an extended Newton implicit layer that enforces algebraic consistency and quasi-steady-state (fast-state) reduction within a single differentiable solve, using slow-state predictions from a physics-informed DeepONet (see the sketches after this list).
  • Gradients derived via the implicit function theorem capture a stiffness-scaled coupling term that penalty-based training misses, improving robustness against stiffness amplification.
  • The surrogate's output dimension is reduced to the slow states alone, and cascaded implicit layers extend the framework to multi-component systems with provable convergence; experiments on a 21-state grid-forming inverter DAE show large accuracy gains and lower algebraic residuals.
  • The work also demonstrates composability (two independently trained models assemble into a larger 44-state system without retraining) and uses conformal prediction for 90% in-distribution coverage plus automatic out-of-distribution detection.
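
To make the implicit layer concrete, here is a minimal sketch of how such a layer and its gradient are typically written down; the symbols (x_s for slow states, x_f for fast states, z for algebraic states, F for the stacked residual) are generic placeholders rather than the paper's exact formulation.

```latex
% Illustrative notation, not the paper's: x_s slow states, x_f fast states,
% z algebraic states. F stacks the quasi-steady-state conditions for x_f and
% the algebraic constraints, evaluated at the DeepONet's slow-state prediction.
\[
  F(x_s, x_f, z) = 0
  \quad\Longrightarrow\quad
  (x_f, z) = \Phi(x_s) \quad \text{(defined implicitly by the Newton solve)}
\]
% Differentiating through the solve with the implicit function theorem:
\[
  \frac{\partial \Phi}{\partial x_s}
  = -\left[\frac{\partial F}{\partial (x_f, z)}\right]^{-1}
    \frac{\partial F}{\partial x_s}
\]
% The inverted Jacobian couples slow and fast/algebraic sensitivities and scales
% with the time-scale separation; a penalty loss on \|F\|^2 never forms this
% inverse, which is the intuition behind the coupling term that penalty-based
% training misses.
```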

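A compact PyTorch sketch of a differentiable Newton layer in this spirit follows; the toy scalar residual, the elementwise Jacobian handling, and the iteration limits are illustrative assumptions rather than the paper's 21-state inverter model, but the backward pass applies the same implicit-function-theorem rule sketched above.

```python
# Minimal differentiable Newton implicit layer (illustrative only).
# Forward: solve F(x_s, y) = 0 for y by Newton's method.
# Backward: implicit function theorem, dy/dx_s = -(dF/dy)^{-1} (dF/dx_s).
import torch


def residual(x_s: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Toy elementwise residual; stands in for QSS + algebraic equations."""
    eps = 1e-3  # plays the role of a small (stiff) time-scale parameter
    return eps * y**3 + y - torch.sin(x_s)


class NewtonImplicitLayer(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x_s):
        y = torch.zeros_like(x_s)  # initial guess
        for _ in range(50):
            with torch.enable_grad():
                y_req = y.detach().requires_grad_(True)
                F = residual(x_s.detach(), y_req)
                # Elementwise Jacobian dF/dy (the toy residual is diagonal).
                J = torch.autograd.grad(F.sum(), y_req)[0]
            step = F.detach() / J
            y = y - step
            if step.abs().max() < 1e-10:
                break
        ctx.save_for_backward(x_s.detach(), y.detach())
        return y

    @staticmethod
    def backward(ctx, grad_out):
        x_s, y = ctx.saved_tensors
        with torch.enable_grad():
            x_s = x_s.detach().requires_grad_(True)
            y = y.detach().requires_grad_(True)
            F = residual(x_s, y)
            dF_dy, dF_dx = torch.autograd.grad(F.sum(), (y, x_s))
        # Implicit function theorem (elementwise because the residual is diagonal).
        dy_dx = -dF_dx / dF_dy
        return grad_out * dy_dx


x_s = torch.linspace(-1.0, 1.0, 5, requires_grad=True)
y = NewtonImplicitLayer.apply(x_s)
y.sum().backward()
print(y, x_s.grad)
```
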
Abstract

Neural surrogates for stiff differential-algebraic equations (DAEs) face two key challenges: soft-constraint methods leave algebraic residuals that stiffness amplifies into large errors, while hard-constraint methods require trajectory data from computationally expensive stiff integrators. We introduce an extended Newton implicit layer that enforces algebraic consistency and quasi-steady-state reduction within a single differentiable solve. Given slow-state predictions from a physics-informed DeepONet, the proposed layer recovers fast and algebraic states, eliminates the stiffness-amplification pathway within each time window, and reduces the output dimension to the slow states alone. Gradients derived via the implicit function theorem capture a stiffness-scaled coupling term that is absent in penalty-based approaches. Cascaded implicit layers further extend the framework to multi-component systems with provable convergence. On a grid-forming inverter DAE (21 states), the proposed method (7 outputs, 1.42 percent error) significantly outperforms penalty methods (39.3 percent), standard Newton approaches (57.0 percent), and augmented Lagrangian or feedback linearization baselines, which fail to converge. Two independently trained models compose into a 44-state system without retraining, achieving 0.72 to 1.16 percent error with zero algebraic residual. Conformal prediction further provides 90 percent coverage in-distribution and enables automatic out-of-distribution detection.
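
On the uncertainty-quantification side, a standard split-conformal construction illustrates how roughly 90 percent in-distribution coverage and a simple out-of-distribution flag can be obtained; the calibration residuals, thresholds, and variable names below are placeholders, not the paper's calibration procedure.

```python
# Illustrative split conformal prediction around a trained surrogate.
# Calibration residuals define a quantile q_hat; test-time intervals
# y_hat +/- q_hat target ~90% coverage in distribution, and unusually large
# nonconformity scores flag possible out-of-distribution inputs.
import numpy as np


def split_conformal_quantile(residuals_cal: np.ndarray, alpha: float = 0.1) -> float:
    """Finite-sample-corrected (1 - alpha) quantile of calibration residuals."""
    n = len(residuals_cal)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return float(np.quantile(residuals_cal, level, method="higher"))


# Hypothetical calibration data: absolute slow-state prediction errors on held-out windows.
rng = np.random.default_rng(0)
residuals_cal = np.abs(rng.normal(scale=0.02, size=500))
q_hat = split_conformal_quantile(residuals_cal, alpha=0.1)  # alpha = 0.1 -> 90% target

# Test time: prediction interval for one surrogate output, plus an OOD flag.
y_pred = 1.234                                   # placeholder surrogate prediction
interval = (y_pred - q_hat, y_pred + q_hat)
score = 0.35                                     # placeholder nonconformity score
is_ood = score > np.quantile(residuals_cal, 0.999)
print(interval, bool(is_ood))
```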