Structure-guided molecular design with contrastive 3D protein-ligand learning

arXiv cs.LG / 4/22/2026

Key Points

  • The paper targets two coupled challenges in structure-based drug discovery: modeling 3D protein–ligand interactions accurately, and searching ultra-large chemical spaces while keeping candidates synthetically accessible.
  • It proposes an SE(3)-equivariant transformer that embeds ligand and pocket structures in a shared space via contrastive learning, and reports competitive zero-shot virtual screening results.
  • It extends the approach with a multimodal Chemical Language Model (MCLM) that generates target-specific molecules conditioned on pocket or ligand structure inputs.
  • A learned dataset token is used to steer generation toward targeted chemical spaces, producing candidates with favorable predicted binding properties across a range of targets.
  • Overall, the framework unifies contrastive 3D structure encoding with autoregressive molecular generation conditioned on commercial compound spaces.
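
To make the contrastive step in the key points more concrete, here is a minimal sketch of a symmetric InfoNCE loss over matched pocket and ligand embeddings. The summary does not state which contrastive objective the authors use, so the loss form, the temperature value, and the function names (`symmetric_info_nce`, `l2_normalize`, `log_softmax`) are all assumptions chosen for illustration. The embeddings are plain Python lists standing in for the output of an encoder.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def symmetric_info_nce(pocket_embs, ligand_embs, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of matched (pocket, ligand) pairs.

    Row i of pocket_embs and row i of ligand_embs form the positive pair;
    every other pairing in the batch serves as a negative.
    NOTE: illustrative assumption, not the paper's confirmed objective.
    """
    pockets = [l2_normalize(p) for p in pocket_embs]
    ligands = [l2_normalize(l) for l in ligand_embs]
    n = len(pockets)

    # Cosine-similarity logits, sharpened by the temperature.
    logits = [
        [sum(a * b for a, b in zip(p, l)) / temperature for l in ligands]
        for p in pockets
    ]

    # Pocket -> ligand: the correct ligand for pocket i is column i.
    pocket_to_ligand = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Ligand -> pocket: same computation on the transposed logits.
    logits_t = [list(col) for col in zip(*logits)]
    ligand_to_pocket = -sum(log_softmax(logits_t[i])[i] for i in range(n)) / n

    return 0.5 * (pocket_to_ligand + ligand_to_pocket)


# Toy embeddings: matched pairs point the same way, mismatched pairs do not.
aligned_loss = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]],
                                  [[1.0, 0.0], [0.0, 1.0]])
mismatched_loss = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the aligned pairs give a much lower loss than the swapped pairs. Minimizing such a loss pulls each pocket toward its own ligand and away from the others, which is what a shared embedding space needs.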

Abstract

Structure-based drug discovery faces the dual challenge of accurately capturing 3D protein-ligand interactions while navigating ultra-large chemical spaces to identify synthetically accessible candidates. In this work, we present a unified framework that addresses these challenges by combining contrastive 3D structure encoding with autoregressive molecular generation conditioned on commercial compound spaces. First, we introduce an SE(3)-equivariant transformer that encodes ligand and pocket structures into a shared embedding space via contrastive learning, achieving competitive results in zero-shot virtual screening. Second, we integrate these embeddings into a multimodal Chemical Language Model (MCLM). The model generates target-specific molecules conditioned on either pocket or ligand structures, with a learned dataset token that steers the output toward targeted chemical spaces, yielding candidates with favorable predicted binding properties across diverse targets.
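
As a rough illustration of how a shared embedding space supports zero-shot virtual screening, the sketch below ranks a small ligand library by cosine similarity to a pocket embedding. The abstract does not specify the scoring function, so cosine similarity is an assumption. The function names (`cosine_similarity`, `rank_library`), the ligand identifiers, and the 3-dimensional vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def rank_library(pocket_emb, ligand_library, top_k=None):
    """Rank ligands by similarity to a pocket embedding, best first.

    ligand_library maps a ligand identifier to its embedding.
    NOTE: hypothetical helper; the scoring rule is an assumption.
    """
    scored = [
        (ligand_id, cosine_similarity(pocket_emb, emb))
        for ligand_id, emb in ligand_library.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Made-up embeddings standing in for encoder outputs.
pocket = [0.9, 0.1, 0.0]
library = {
    "ligand_a": [1.0, 0.0, 0.0],
    "ligand_b": [0.0, 1.0, 0.0],
    "ligand_c": [-1.0, 0.0, 0.0],
}

ranking = rank_library(pocket, library, top_k=2)
for ligand_id, score in ranking:
    print(f"{ligand_id}: {score:.3f}")
```

No target-specific training happens at screening time: the pocket and every library ligand are encoded once, and ranking reduces to a similarity lookup. That is why this style of screening can scale to very large compound collections.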