General Multimodal Protein Design Enables DNA-Encoding of Chemistry

arXiv cs.LG / 4/8/2026


Key Points

  • The paper presents DISCO, a multimodal diffusion-based model that co-designs protein sequence and 3D structure around arbitrary biomolecules, so that new chemistry can be encoded in DNA as designed enzymes.
  • Unlike prior deep generative protein design approaches, DISCO does not require pre-specifying catalytic residues; it is conditioned only on reactive intermediates to produce diverse heme enzymes with novel active-site geometries.
  • The designed enzymes perform new-to-nature carbene-transfer reactions (e.g., alkene cyclopropanation and spirocyclopropanation) with reported activities that exceed those of existing engineered enzymes.
  • The authors show further gains through random mutagenesis of a selected design, demonstrating that DISCO outputs can be improved by directed evolution.
  • The work also introduces inference-time scaling methods that optimize objectives across both the sequence and structure modalities, and the authors release code on GitHub to support reproducibility and further research.

Abstract

Evolution is an extraordinary engine for enzymatic diversity, yet the chemistry it has explored remains a narrow slice of what DNA can encode. Deep generative models can design new proteins that bind ligands, but none have created enzymes without pre-specifying catalytic residues. We introduce DISCO (DIffusion for Sequence-structure CO-design), a multimodal model that co-designs protein sequence and 3D structure around arbitrary biomolecules, as well as inference-time scaling methods that optimize objectives across both modalities. Conditioned solely on reactive intermediates, DISCO designs diverse heme enzymes with novel active-site geometries. These enzymes catalyze new-to-nature carbene-transfer reactions, including alkene cyclopropanation, spirocyclopropanation, B-H, and C(sp^3)-H insertions, with high activities exceeding those of engineered enzymes. Random mutagenesis of a selected design further confirmed that enzyme activity can be improved through directed evolution. By providing a scalable route to evolvable enzymes, DISCO broadens the potential scope of genetically encodable transformations. Code is available at https://github.com/DISCO-design/DISCO.
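The abstract mentions inference-time scaling methods that optimize objectives across both modalities, but it does not describe them. As a rough illustration of the general idea only, the sketch below uses plain best-of-N selection: draw several candidate designs, score each on a sequence objective and a structure objective, and keep the best. This is not DISCO's method or API. The sampler, both scoring functions, and every name here (`sample_design`, `sequence_score`, `structure_score`, `best_of_n`) are hypothetical toy placeholders.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sample_design(rng, length=12):
    """Hypothetical stand-in for one sample from a generative model.

    Returns a random sequence and a fake 1D "structure" (one number per
    residue). A real co-design model would return 3D coordinates.
    """
    sequence = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
    structure = [rng.uniform(-1.0, 1.0) for _ in range(length)]
    return sequence, structure


def sequence_score(sequence):
    # Toy sequence objective (arbitrary placeholder): fraction of histidines.
    return sequence.count("H") / len(sequence)


def structure_score(structure):
    # Toy structure objective (arbitrary placeholder): higher when the fake
    # coordinates sit closer to zero.
    return -sum(x * x for x in structure) / len(structure)


def best_of_n(n, seed=0, weight=0.5):
    """Draw n candidates and keep the one with the best combined score."""
    rng = random.Random(seed)
    candidates = [sample_design(rng) for _ in range(n)]

    def combined(design):
        seq, struct = design
        return weight * sequence_score(seq) + (1 - weight) * structure_score(struct)

    best = max(candidates, key=combined)
    return best, combined(best)


if __name__ == "__main__":
    for n in (1, 8, 64):
        (seq, _), score = best_of_n(n, seed=0)
        print(n, seq, round(score, 3))
```

With a fixed seed, a larger `n` reuses the same first candidates and adds more, so the best combined score cannot decrease as the sampling budget grows. That is the sense in which extra inference-time compute buys better designs in this toy setting.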