Sparse Weak-Form Discovery of Stochastic Generators

arXiv stat.ML / 3/24/2026

Key Points

The paper proposes a unified framework for identifying stochastic differential equations (SDEs) by combining weak-form integration-by-parts (Weak SINDy) with stochastic system identification (stochastic SINDy).
Its key innovation is using spatial Gaussian test functions instead of temporal ones. This keeps the projected response unbiased, because each noise term has zero conditional mean given the current state.
The method reformulates SDE discovery as two sparse linear systems, one for the drift and one for the diffusion tensor, which share a single design matrix and are solved jointly via ℓ1-regularized regression with grouped cross-validation.
It includes a two-step bias-correction procedure to handle state-dependent diffusion, improving robustness when the diffusion varies with the state.
Experiments on benchmarks (Ornstein–Uhlenbeck, double-well Langevin, and multiplicative diffusion) report accurate recovery of generators with small coefficient errors (<4%), low stationary-density divergence (<0.01 TV distance), and correct relaxation timescales in autocorrelations.
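
The two linear systems described in the key points can be illustrated with a small one-dimensional sketch. It is not the paper's implementation. The Ornstein–Uhlenbeck parameters, the kernel centres, the bandwidth h, the library {1, x, x^2}, the fixed penalty lam, and all function names are assumptions made for illustration. A plain coordinate-descent lasso with a hand-picked penalty stands in for the grouped cross-validation, and the two-step bias correction is omitted. The diffusion response is taken to be (ΔX)^2/Δt, which targets σ^2; the paper's normalisation of a(x) may differ.

```python
import math
import random

def simulate_ou(n_steps, dt, theta=1.0, sigma=1.0, seed=0):
    """Euler-Maruyama path of dX = -theta*X dt + sigma dW (assumed test case)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)
        x.append(x[-1] - theta * x[-1] * dt + sigma * math.sqrt(dt) * xi)
    return x

def weak_form_system(x, dt, centers, h, library):
    """Project increments onto spatial Gaussian test functions K_j.

    Returns the shared design matrix G (one row per kernel, one column per
    library term) and the projected drift and diffusion responses.
    """
    n = len(x) - 1
    J, K = len(centers), len(library)
    G = [[0.0] * K for _ in range(J)]
    y_drift = [0.0] * J
    y_diff = [0.0] * J
    for i in range(n):
        dx = x[i + 1] - x[i]
        feats = [phi(x[i]) for phi in library]
        for j, c in enumerate(centers):
            # The kernel weight depends only on the current state x[i].
            w = math.exp(-(x[i] - c) ** 2 / (2.0 * h * h)) / n
            for k in range(K):
                G[j][k] += w * feats[k]
            y_drift[j] += w * dx / dt
            y_diff[j] += w * dx * dx / dt
    return G, y_drift, y_diff

def lasso_cd(G, y, lam, sweeps=500):
    """Coordinate descent for 0.5*||G c - y||^2 + lam*||c||_1."""
    J, K = len(G), len(G[0])
    c = [0.0] * K
    col_sq = [sum(G[j][k] ** 2 for j in range(J)) for k in range(K)]
    for _ in range(sweeps):
        for k in range(K):
            # Correlation of column k with the residual that excludes term k.
            rho = sum(
                G[j][k] * (y[j] - sum(G[j][m] * c[m] for m in range(K) if m != k))
                for j in range(J)
            )
            # Soft-thresholding step of the lasso update.
            shrunk = math.copysign(max(abs(rho) - lam, 0.0), rho)
            c[k] = shrunk / col_sq[k]
    return c

# Hypothetical setup: 7 kernel centres, bandwidth 0.5, polynomial library.
library = [lambda v: 1.0, lambda v: v, lambda v: v * v]
centers = [-1.5 + 0.5 * j for j in range(7)]
dt = 0.02
path = simulate_ou(n_steps=50_000, dt=dt)
G, y_drift, y_diff = weak_form_system(path, dt, centers, h=0.5, library=library)

# Both regressions reuse the same design matrix G.
drift_coef = lasso_cd(G, y_drift, lam=0.005)
diff_coef = lasso_cd(G, y_diff, lam=0.005)
print("drift coefficients (1, x, x^2):", drift_coef)
print("diffusion coefficients (1, x, x^2):", diff_coef)
```

With these settings the x coefficient of the drift should land near -1 and the constant term of the diffusion near 1. Because the penalty is fixed by hand rather than cross-validated, small spurious drift terms may survive the thresholding.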

Abstract

We introduce a framework for the data-driven discovery of stochastic differential equations (SDEs) that unifies, for the first time, the weak-form integration-by-parts approach of Weak SINDy with the stochastic system identification goal of stochastic SINDy. The central novelty is the adoption of spatial Gaussian test functions $K_j(x)=\exp(-|x-x_j|^2/2h^2)$ in place of temporal test functions. Because the kernel weight $K_j(X_{t_n})$ is $\mathcal{F}_{t_n}$-measurable and the Brownian innovation $\xi_n$ is independent of $\mathcal{F}_{t_n}$, every noise term in the projected response has zero conditional mean given the current state -- a property that guarantees unbiasedness in expectation and prevents the structural regression bias that afflicts temporal test functions in the stochastic setting. This design choice converts the SDE identification problem into two sparse linear systems -- one for the drift $b(x)$ and one for the diffusion tensor $a(x)$ -- that share a single design matrix and are solved jointly via $\ell_1$-regularised regression with grouped cross-validation. A two-step bias-correction procedure handles state-dependent diffusion. Validated on the Ornstein--Uhlenbeck process, the double-well Langevin system, and a multiplicative diffusion process, the method recovers all active polynomial generators with coefficient errors below 4%, stationary-density total-variation distances below 0.01, and autocorrelation functions that faithfully reproduce true relaxation timescales across all three benchmarks.
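
The zero-conditional-mean argument in the abstract can be spelled out in a few lines. The sketch below assumes a scalar Euler-Maruyama discretisation with step $\Delta t$ and a diffusion coefficient $\sigma(x)$; neither the discretisation nor the symbol $\sigma$ appears in the abstract, so both are illustrative assumptions rather than the paper's notation.

```latex
\begin{aligned}
\Delta X_n &= b(X_{t_n})\,\Delta t + \sigma(X_{t_n})\sqrt{\Delta t}\,\xi_n, \\
\sum_n K_j(X_{t_n})\,\frac{\Delta X_n}{\Delta t}
  &= \sum_n K_j(X_{t_n})\,b(X_{t_n})
   + \frac{1}{\sqrt{\Delta t}}\sum_n K_j(X_{t_n})\,\sigma(X_{t_n})\,\xi_n, \\
\mathbb{E}\!\left[K_j(X_{t_n})\,\sigma(X_{t_n})\,\xi_n \,\middle|\, \mathcal{F}_{t_n}\right]
  &= K_j(X_{t_n})\,\sigma(X_{t_n})\,\mathbb{E}[\xi_n] = 0 .
\end{aligned}
```

The last line uses only the two facts stated in the abstract: $K_j(X_{t_n})$ is $\mathcal{F}_{t_n}$-measurable, so it factors out of the conditional expectation, and $\xi_n$ is independent of $\mathcal{F}_{t_n}$ with zero mean. A temporal test function evaluated over a window of times would not in general be $\mathcal{F}_{t_n}$-measurable, which is where the factorisation can fail.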
