Abstract
The computational cost of stiff chemical kinetics remains a dominant bottleneck in reacting-flow simulation, yet hybrid integration strategies are typically driven by hand-tuned heuristics or supervised predictors that make myopic decisions from the instantaneous local state. We introduce a constrained reinforcement learning (RL) framework that autonomously selects between an implicit BDF integrator (CVODE) and a quasi-steady-state (QSS) solver during chemistry integration. Solver selection is cast as a Markov decision process in which the agent learns trajectory-aware policies that account for how present solver choices influence downstream error accumulation, minimizing computational cost under a user-prescribed accuracy tolerance enforced through a Lagrangian reward with online multiplier adaptation. Across sampled 0D homogeneous reactor conditions, the RL-adaptive policy achieves a mean speedup of approximately $3\times$ (ranging from $1.11\times$ to $10.58\times$) while maintaining accurate ignition delays and species profiles for a 106-species \textit{n}-dodecane mechanism, at an inference overhead of approximately 1\%. Without retraining, the 0D-trained policy transfers to 1D counterflow diffusion flames over strain rates of 10--2000~$\mathrm{s}^{-1}$, delivering a consistent speedup of approximately $2.2\times$ relative to CVODE while preserving near-reference temperature accuracy and selecting CVODE at only 12--15\% of space-time points. Overall, the results demonstrate the potential of the proposed framework to learn problem-specific integration strategies while respecting accuracy constraints, opening a pathway toward adaptive, self-optimizing workflows for multiphysics systems with spatially heterogeneous stiffness.
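For concreteness, a minimal sketch of the kind of Lagrangian reward referred to above, assuming a per-step solver cost $c(a_t)$, a local error estimate $e_t$, a user-prescribed tolerance $\epsilon$, a multiplier $\lambda$, and a step size $\eta$ (all introduced here for illustration and not taken from the abstract), is
\begin{equation*}
  r_t = -\,c(a_t) \;-\; \lambda\,\max\!\bigl(0,\, e_t - \epsilon\bigr),
  \qquad
  \lambda \leftarrow \max\!\bigl(0,\, \lambda + \eta\,(\bar{e} - \epsilon)\bigr),
\end{equation*}
where $\bar{e}$ denotes an episode-level error statistic; the second expression is one plausible form of the online multiplier adaptation that tightens the penalty when the accuracy constraint is violated and relaxes it otherwise.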