Boltzmann Machine Learning with a Parallel, Persistent Markov chain Monte Carlo method for Estimating Evolutionary Fields and Couplings from a Protein Multiple Sequence Alignment

arXiv stat.ML / 4/21/2026

💬 OpinionDeveloper Stack & InfrastructureModels & Research

共有:

Key Points

The paper tackles the inverse Potts problem by using a Boltzmann machine approach to estimate evolutionary single-site fields and pairwise couplings from amino-acid frequency statistics in protein multiple sequence alignments.
To address the method’s high computational cost, the authors introduce a parallel, persistent Markov chain Monte Carlo (MCMC) scheme for estimating marginal distributions at each learning step, alongside stochastic gradient descent to speed up training.
Hyperparameter tuning is improved by specifically adjusting two regularization parameters (for fields and couplings) using a condition intended to better match protein conformational requirements, rather than relying on contact-pair prediction precision alone.
The proposed workflow is evaluated on eight protein families, supporting the practicality of the approach for deriving reproducible evolutionary parameters relevant to protein structure and evolution studies.

Abstract

The inverse Potts problem for estimating evolutionary single-site fields and pairwise couplings in homologous protein sequences from their single-site and pairwise amino acid frequencies observed in their multiple sequence alignment would be still one of useful methods in the studies of protein structure and evolution. Since the reproducibility of fields and couplings are the most important, the Boltzmann machine method is employed here, although it is computationally intensive. In order to reduce computational time required for the Boltzmann machine, parallel, persistent Markov chain Monte Carlo method is employed to estimate the single-site and pairwise marginal distributions in each learning step. Also, stochastic gradient descent methods are used to reduce computational time for each learning. Another problem is how to adjust the values of hyperparameters; there are two regularization parameters for evolutionary fields and couplings. The precision of contact residue pair prediction is often used to adjust the hyperparameters. However, it is not sensitive to these regularization parameters. Here, they are adjusted for the fields and couplings to satisfy a specific condition that is appropriate for protein conformations. This method has been applied to eight protein families.

A practical guide to getting comfortable with AI coding tools

Dev.to

We built it during the NVIDIA DGX Spark Full-Stack AI Hackathon — and it ended up winning 1st place overall 🏆

Dev.to

Stop Losing Progress: Setting Up a Pro Jupyter Workflow in VS Code (No More Colab Timeouts!)

Dev.to

🚀 Major BrowserAct CLI Update

Dev.to

Building AgentOS: Why I’m Building the AWS Lambda for Insurance Claims

Dev.to

Boltzmann Machine Learning with a Parallel, Persistent Markov chain Monte Carlo method for Estimating Evolutionary Fields and Couplings from a Protein Multiple Sequence Alignment

Key Points

Abstract

Related Articles

A practical guide to getting comfortable with AI coding tools

We built it during the NVIDIA DGX Spark Full-Stack AI Hackathon — and it ended up winning 1st place overall 🏆

Stop Losing Progress: Setting Up a Pro Jupyter Workflow in VS Code (No More Colab Timeouts!)

🚀 Major BrowserAct CLI Update

Building AgentOS: Why I’m Building the AWS Lambda for Insurance Claims

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer