AI Navigate

Binary Latent Protein Fitness Landscapes for Quantum Annealing Optimization

arXiv cs.LG / 3/19/2026


Key Points

  • Q-BIOLAT proposes a framework for modeling protein fitness landscapes in binary latent spaces derived from pretrained protein language models and then optimizing with a QUBO formulation.
  • It demonstrates that the binary representation can be optimized using classical heuristics (e.g., simulated annealing and genetic algorithms) to identify high-fitness variants on the ProteinGym benchmark.
  • The work highlights a natural bridge between representation learning and combinatorial/quantum optimization, enabling potential use with quantum annealing hardware for protein engineering.
  • An open-source implementation is provided on GitHub for reproducibility and adoption.
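To make the first key point concrete, here is a minimal sketch of how continuous embeddings might be turned into binary codes and scored with a QUBO-style surrogate. It is an illustration under stated assumptions, not the Q-BIOLAT implementation: the per-dimension median threshold is one plausible "simple binarization scheme", and the names `binarize`, `qubo_score`, `h`, and `J`, along with all toy values, are hypothetical. In practice the coefficients would be fit to measured fitness data rather than written by hand.

```python
from statistics import median

def binarize(embeddings):
    """Threshold each embedding dimension at its dataset median (assumed scheme)."""
    dims = range(len(embeddings[0]))
    thresholds = [median(row[d] for row in embeddings) for d in dims]
    codes = [[1 if row[d] > thresholds[d] else 0 for d in dims] for row in embeddings]
    return codes, thresholds

def qubo_score(x, h, J):
    """Surrogate fitness: sum_i h[i]*x[i] + sum_{i<j} J[(i, j)]*x[i]*x[j]."""
    linear = sum(h[i] * x[i] for i in range(len(x)))
    pairwise = sum(w * x[i] * x[j] for (i, j), w in J.items())
    return linear + pairwise

# Toy stand-ins for language-model embeddings (4 sequences, 3 dimensions).
embeddings = [
    [0.9, -0.2, 0.1],
    [0.1, 0.4, -0.3],
    [-0.5, 0.8, 0.7],
    [0.3, -0.6, 0.2],
]
codes, thresholds = binarize(embeddings)

# Hypothetical QUBO coefficients; a real model would learn these from data.
h = [0.5, -0.2, 0.1]
J = {(0, 1): -0.4, (0, 2): 0.3, (1, 2): 0.6}

scores = [qubo_score(x, h, J) for x in codes]
for code, score in zip(codes, scores):
    print(code, round(score, 3))
```

Because every variable is 0 or 1, the surrogate is fully described by the linear terms `h` and the pairwise terms `J`, which is the form that annealing-style solvers accept.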

Abstract

We propose Q-BIOLAT, a framework for modeling and optimizing protein fitness landscapes in binary latent spaces. Starting from protein sequences, we leverage pretrained protein language models to obtain continuous embeddings, which are then transformed into compact binary latent representations. In this space, protein fitness is approximated using a quadratic unconstrained binary optimization (QUBO) model, enabling efficient combinatorial search via classical heuristics such as simulated annealing and genetic algorithms. On the ProteinGym benchmark, we demonstrate that Q-BIOLAT captures meaningful structure in protein fitness landscapes and enables the identification of high-fitness variants. Despite using a simple binarization scheme, our method consistently retrieves sequences whose nearest neighbors lie within the top fraction of the training fitness distribution, particularly under the strongest configurations. We further show that different optimization strategies exhibit distinct behaviors, with evolutionary search performing better in higher-dimensional latent spaces and local search remaining competitive in preserving realistic sequences. Beyond its empirical performance, Q-BIOLAT provides a natural bridge between protein representation learning and combinatorial optimization. By formulating protein fitness as a QUBO problem, our framework is directly compatible with emerging quantum annealing hardware, opening new directions for quantum-assisted protein engineering. Our implementation is publicly available at: https://github.com/HySonLab/Q-BIOLAT
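The combinatorial search step described in the abstract can be sketched as single-bit-flip simulated annealing over a QUBO objective. This is a generic textbook variant offered as an assumption-laden illustration, not the authors' code: the geometric cooling schedule, the step count, the function names, and the toy coefficients `h` and `J` are all hypothetical choices.

```python
import math
import random

def qubo_score(x, h, J):
    """Surrogate fitness: sum_i h[i]*x[i] + sum_{i<j} J[(i, j)]*x[i]*x[j]."""
    linear = sum(h[i] * x[i] for i in range(len(x)))
    pairwise = sum(w * x[i] * x[j] for (i, j), w in J.items())
    return linear + pairwise

def simulated_annealing(h, J, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximize the QUBO score by proposing one bit flip per step."""
    rng = random.Random(seed)
    n = len(h)
    x = [rng.randint(0, 1) for _ in range(n)]
    current = qubo_score(x, h, J)
    best_x, best = x[:], current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(n)
        x[i] ^= 1  # propose flipping bit i
        candidate = qubo_score(x, h, J)
        delta = candidate - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current = candidate
            if current > best:
                best_x, best = x[:], current
        else:
            x[i] ^= 1  # reject the move and restore the bit
    return best_x, best

# Hypothetical 4-bit problem, small enough to check by enumeration.
h = [0.5, -0.2, 0.1, -0.3]
J = {(0, 1): -0.4, (0, 2): 0.3, (1, 2): 0.6, (2, 3): 0.5, (0, 3): -0.2}
best_x, best = simulated_annealing(h, J)
print(best_x, round(best, 3))
```

The returned bit vector is a point in latent space rather than a protein sequence. The abstract's reference to nearest neighbors suggests that optimized codes are mapped back to known sequences by similarity, a decoding step this sketch does not cover.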