Physics-Guided Transformer (PGT): Physics-Aware Attention Mechanism for PINNs

arXiv cs.LG / 3/31/2026


Key Points

  • The paper proposes Physics-Guided Transformer (PGT), a physics-aware neural architecture that injects PDE structure directly into the self-attention mechanism rather than using soft penalty terms as in many physics-informed methods.
  • PGT adds a heat-kernel-derived bias to attention logits to encode diffusion dynamics and temporal causality, enabling query coordinates to attend to physics-conditioned context tokens.
  • For decoding, PGT passes the attended features through a FiLM-modulated sinusoidal implicit network that adaptively controls its spectral response, targeting more stable and physically consistent reconstructions.
  • Experiments on the 1D heat equation and 2D incompressible Navier–Stokes show markedly improved performance in data-scarce settings, including a reported 1D relative L2 error of 5.9e-3 with 100 observations.
  • On the 2D cylinder wake problem, PGT is reported to achieve both low PDE residual and competitive reconstruction error, outperforming approaches that optimize only one objective.
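The paper's summary above describes the core mechanism: an additive, heat-kernel-derived bias on the attention logits plus a temporal causality mask. A minimal NumPy sketch of that idea follows; the function name, the exact bias form (the log of the 1D heat kernel with a diffusivity `nu`), and all shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def physics_biased_attention(Q, K, V, x, t, nu=0.1):
    """Single-head attention with a heat-kernel bias and a causal time mask.

    Q, K, V: (n, d) query/key/value matrices for n tokens.
    x, t: (n,) spatial coordinate and time of each token.
    The additive bias -(x_i - x_j)^2 / (4*nu*(t_i - t_j)) is the log of the
    1D heat kernel (up to a constant), so attention favours keys inside the
    query's diffusion cone; keys in the future are masked out entirely.
    """
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)
    dt = t[:, None] - t[None, :]            # positive where key precedes query
    dx2 = (x[:, None] - x[None, :]) ** 2
    bias = np.where(dt > 0.0,
                    -dx2 / (4.0 * nu * np.maximum(dt, 1e-9)),
                    0.0)                     # no spatial bias at equal times
    logits = logits + bias
    logits = np.where(dt >= 0.0, logits, -np.inf)  # temporal causality
    return softmax(logits, axis=-1) @ V
```

Because the earliest token can only attend to itself under the causal mask, its output equals its own value vector, which gives a quick sanity check on the masking logic.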

Abstract

Reconstructing continuous physical fields from sparse, irregular observations is a central challenge in scientific machine learning, particularly for systems governed by partial differential equations (PDEs). Existing physics-informed methods typically enforce governing equations as soft penalty terms during optimization, often leading to gradient imbalance, instability, and degraded physical consistency under limited data. We introduce the Physics-Guided Transformer (PGT), a neural architecture that embeds physical structure directly into the self-attention mechanism. Specifically, PGT incorporates a heat-kernel-derived additive bias into attention logits, encoding diffusion dynamics and temporal causality within the representation. Query coordinates attend to these physics-conditioned context tokens, and the resulting features are decoded using a FiLM-modulated sinusoidal implicit network that adaptively controls spectral response. We evaluate PGT on the one-dimensional heat equation and two-dimensional incompressible Navier–Stokes systems. In sparse 1D reconstruction with 100 observations, PGT achieves a relative L2 error of 5.9e-3, significantly outperforming both PINNs and sinusoidal representations. In the 2D cylinder wake problem, PGT uniquely achieves both low PDE residual (8.3e-4) and competitive relative error (0.034), outperforming methods that optimize only one objective. These results demonstrate that embedding physics within attention improves stability, generalization, and physical fidelity under data-scarce conditions.
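The abstract's decoder combines two known ideas: a sinusoidal implicit network (SIREN-style layers) whose pre-activations are scaled and shifted by FiLM parameters derived from the attention features, so the conditioning signal can shift each layer's effective frequency content. Below is a minimal NumPy sketch under stated assumptions; the function names, layer widths, the frequency constant `omega0`, and the choice to produce FiLM parameters via per-layer linear maps of the features are all illustrative, not taken from the paper.

```python
import numpy as np

def film_siren_layer(h, W, b, gamma, beta, omega0=30.0):
    """One FiLM-modulated sinusoidal layer: sin(omega0*(gamma*(W h + b) + beta)).

    gamma and beta are per-sample scale/shift vectors; scaling the
    pre-activation changes the sine's effective frequency, which is how
    the conditioning controls the layer's spectral response.
    """
    return np.sin(omega0 * (gamma * (h @ W.T + b) + beta))

def film_siren_decode(coords, feats, weights, omega0=30.0):
    """Decode query coordinates into field values, conditioned on features.

    coords:  (n, c) query coordinates.
    feats:   (n, f) physics-conditioned features (e.g. from attention).
    weights: list of hidden-layer tuples (W, b, Wg, bg, Wb, bb) followed by
             a final linear (W, b); gamma = feats @ Wg.T + bg, likewise beta.
    """
    h = coords
    for (W, b, Wg, bg, Wb, bb) in weights[:-1]:
        gamma = feats @ Wg.T + bg    # per-sample FiLM scale
        beta = feats @ Wb.T + bb     # per-sample FiLM shift
        h = film_siren_layer(h, W, b, gamma, beta, omega0)
    W, b = weights[-1]
    return h @ W.T + b               # linear head: predicted field value
```

In a trained model the weights would come from optimization against the sparse observations (and, in PGT, jointly with the attention encoder); here they would simply be initialized arrays of the matching shapes.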