
Multimodal Protein Language Models for Enzyme Kinetic Parameters: From Substrate Recognition to Conformational Adaptation

arXiv cs.CV / 3/16/2026

📰 News · Models & Research

Key Points

  • The paper introduces the Enzyme-Reaction Bridging Adapter (ERBA), a two-stage multimodal conditioning framework that injects substrate information into Protein Language Models (PLMs) to improve kinetic-parameter prediction.
  • In the Molecular Recognition Cross-Attention (MRCA) stage, substrate information is injected into the enzyme representation to capture substrate specificity.
  • In the Geometry-aware Mixture-of-Experts (G-MoE) stage, active-site structure is integrated and samples are routed to pocket-specialized experts to reflect induced fit (a sketch of both stages follows this list).
  • The Enzyme-Substrate Distribution Alignment (ESDA) objective enforces distributional consistency within the PLM manifold via a reproducing kernel Hilbert space constraint, maintaining semantic fidelity.
  • Across three kinetic endpoints and multiple backbones, ERBA delivers consistent gains and stronger out-of-distribution performance than sequence-only and shallow-fusion baselines, and it lays a foundation for incorporating cofactors, mutations, and time-resolved structural cues.
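
The summary does not include code, but the two stages map onto familiar building blocks. Below is a minimal PyTorch sketch of how they could be wired; all names, shapes, and hyperparameters here (MRCA, GMoE, d_model, n_heads, n_experts, top_k, the pooled pocket descriptor) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MRCA(nn.Module):
    """Stage 1 sketch: enzyme residue embeddings (queries) attend to
    substrate atom/fragment embeddings (keys/values), injecting substrate
    chemistry into the PLM representation; the residual connection keeps
    the pretrained biochemical prior intact."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, enzyme: torch.Tensor, substrate: torch.Tensor) -> torch.Tensor:
        # enzyme: (B, L_res, d_model); substrate: (B, L_atoms, d_model)
        ctx, _ = self.attn(query=enzyme, key=substrate, value=substrate)
        return self.norm(enzyme + ctx)


class GMoE(nn.Module):
    """Stage 2 sketch: a pooled active-site (pocket) descriptor gates a
    small set of expert MLPs, so samples with similar pocket geometry are
    routed to the same pocket-specialized experts (induced fit)."""

    def __init__(self, d_model: int, d_pocket: int,
                 n_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_pocket, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_model), nn.GELU(),
                          nn.Linear(d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, h: torch.Tensor, pocket: torch.Tensor) -> torch.Tensor:
        # h: (B, d_model) pooled enzyme-substrate state; pocket: (B, d_pocket)
        weights, idx = self.router(pocket).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)          # gate over top-k experts
        out = torch.zeros_like(h)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                 # samples routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(h[mask])
        return out
```

A regression head on the G-MoE output would then predict the (typically log-scale) kinetic parameter; routing on pocket features rather than on the fused state is one plausible reading of "geometry-aware" routing.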

Abstract

Predicting enzyme kinetic parameters quantifies how efficiently an enzyme catalyzes a specific substrate under defined biochemical conditions. Canonical parameters such as the turnover number ($k_\text{cat}$), Michaelis constant ($K_\text{m}$), and inhibition constant ($K_\text{i}$) depend jointly on the enzyme sequence, the substrate chemistry, and the conformational adaptation of the active site during binding. Many learning pipelines reduce this process to a static compatibility problem between enzyme and substrate, fusing their representations through shallow operations and regressing a single value. Such formulations overlook the staged nature of catalysis, which involves both substrate recognition and conformational adaptation. We therefore reformulate kinetic prediction as a staged multimodal conditional modeling problem and introduce the Enzyme-Reaction Bridging Adapter (ERBA), which injects cross-modal information into Protein Language Models (PLMs) via fine-tuning while preserving their biochemical priors. ERBA performs conditioning in two stages: Molecular Recognition Cross-Attention (MRCA) first injects substrate information into the enzyme representation to capture specificity; Geometry-aware Mixture-of-Experts (G-MoE) then integrates active-site structure and routes samples to pocket-specialized experts to reflect induced fit. To maintain semantic fidelity, Enzyme-Substrate Distribution Alignment (ESDA) enforces distributional consistency within the PLM manifold in a reproducing kernel Hilbert space. In experiments across three kinetic endpoints and multiple PLM backbones, ERBA delivers consistent gains and stronger out-of-distribution performance than sequence-only and shallow-fusion baselines, offering a biologically grounded route to scalable kinetic prediction and a foundation for adding cofactors, mutations, and time-resolved structural cues.
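
For ESDA, the abstract specifies only a distributional-consistency constraint in a reproducing kernel Hilbert space. A standard instantiation of such a constraint is the squared maximum mean discrepancy (MMD) with an RBF kernel; the sketch below assumes that reading, and sigma and lam are hypothetical hyperparameters rather than values from the paper.

```python
import torch


def rbf_mmd2(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Biased estimator of squared MMD between samples x (conditioned
    embeddings) and y (original PLM embeddings), both of shape (N, d).
    MMD compares the two distributions via their mean embeddings in the
    RKHS induced by the RBF kernel."""
    def kernel(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # Pairwise squared Euclidean distances -> Gaussian kernel values.
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()


# Hypothetical training objective: log-scale kinetic regression plus the
# alignment penalty keeping conditioned states on the PLM manifold.
# loss = F.mse_loss(pred, log_kcat) + lam * rbf_mmd2(h_conditioned, h_plm)
```

Penalizing this term pulls the conditioned embedding distribution back toward the pretrained PLM's, which is one concrete way a fine-tuned adapter can "maintain semantic fidelity" while still absorbing substrate and structure information.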