PI-JEPA: Label-Free Surrogate Pretraining for Coupled Multiphysics Simulation via Operator-Split Latent Prediction

arXiv cs.LG / 4/3/2026


Key Points

  • The paper introduces PI-JEPA, a label-free surrogate pretraining method for coupled multiphysics reservoir simulation that leverages unlabeled parameter fields instead of requiring large labeled simulation datasets.
  • PI-JEPA trains by masked latent prediction without completing PDE solves, using per-sub-operator PDE residual regularization to enforce physics during pretraining.
  • The architecture uses a predictor bank aligned with Lie–Trotter operator-splitting, allocating separate physics-constrained latent modules for sub-processes such as pressure, saturation transport, and reaction.
  • Experiments on single-phase Darcy flow show PI-JEPA achieves 1.9× lower error than FNO and 2.4× lower error than DeepONet with only 100 labeled runs, and a 24% improvement over supervised-only training even at 500 labels.
  • The results suggest physics-informed, operator-split latent pretraining can materially reduce the simulation budget needed to deploy neural operator surrogates for multiphysics problems.
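The Lie–Trotter structure the predictor bank mirrors is simply sequential application of the sub-process operators within each timestep. The sketch below illustrates that control flow with toy stand-in dynamics; the sub-operator definitions here are hypothetical illustrations, not the paper's actual solvers or modules.

```python
import numpy as np

def pressure_step(state, dt):
    """Hypothetical 'pressure' sub-operator: relax pressure toward equilibrium."""
    p, s = state
    return p + dt * (1.0 - p), s

def transport_step(state, dt):
    """Hypothetical 'saturation transport' sub-operator: periodic shift by one cell."""
    p, s = state
    return p, np.roll(s, 1)

def reaction_step(state, dt):
    """Hypothetical 'reaction' sub-operator: first-order decay of saturation."""
    p, s = state
    return p, s * np.exp(-0.5 * dt)

def lie_trotter_step(state, dt, sub_ops):
    """One Lie-Trotter step: apply each sub-operator sequentially over the full dt.
    PI-JEPA dedicates one physics-constrained latent module per link in this chain."""
    for op in sub_ops:
        state = op(state, dt)
    return state

# advance a toy coupled (pressure, saturation) state through several split steps
state = (np.zeros(8), np.ones(8))
for _ in range(10):
    state = lie_trotter_step(state, 0.1,
                             [pressure_step, transport_step, reaction_step])
```

The point of the decomposition is that each sub-process gets its own update rule, which is exactly the granularity at which PI-JEPA allocates predictor modules and residual penalties.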

Abstract

Reservoir simulation workflows face a fundamental data asymmetry: input parameter fields (geostatistical permeability realizations, porosity distributions) are free to generate in arbitrary quantities, yet existing neural operator surrogates require large corpora of expensive labeled simulation trajectories and cannot exploit this unlabeled structure. We introduce **PI-JEPA** (Physics-Informed Joint Embedding Predictive Architecture), a surrogate pretraining framework that trains *without any completed PDE solves*, using masked latent prediction on unlabeled parameter fields under per-sub-operator PDE residual regularization. The predictor bank is structurally aligned with the Lie–Trotter operator-splitting decomposition of the governing equations, dedicating a separate physics-constrained latent module to each sub-process (pressure, saturation transport, reaction), enabling fine-tuning with as few as 100 labeled simulation runs. On single-phase Darcy flow, PI-JEPA achieves 1.9× lower error than FNO and 2.4× lower error than DeepONet at N_ℓ = 100, with 24% improvement over supervised-only training at N_ℓ = 500, demonstrating that label-free surrogate pretraining substantially reduces the simulation budget required for multiphysics surrogate deployment.
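To make the "PDE residual regularization" concrete for the Darcy case: the single-phase pressure equation is -∇·(k∇p) = f, and a residual penalty only needs a differentiable pointwise evaluation of that equation on the predicted field. Below is a minimal finite-difference version; the function name, discretization, and grid setup are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def darcy_residual(k, p, f, h):
    """Pointwise residual of -div(k * grad p) = f on a uniform 2D grid.

    Illustrative second-order central differences (edge_order=2 keeps the
    boundary stencils second-order too); not the paper's discretization.
    """
    dpdx, dpdy = np.gradient(p, h, h, edge_order=2)
    div_kgrad = (np.gradient(k * dpdx, h, axis=0, edge_order=2)
                 + np.gradient(k * dpdy, h, axis=1, edge_order=2))
    return -div_kgrad - f

# sanity check with a manufactured solution: k = 1, p = x^2 + y^2  =>  f = -4
n, h = 33, 1.0 / 32
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
r = darcy_residual(np.ones((n, n)), X**2 + Y**2, -4.0 * np.ones((n, n)), h)
```

In a pretraining loss, the mean squared value of such a residual (one per sub-operator) would be added to the masked latent prediction term, which is what lets physics constrain the latent modules without any labeled solver output.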