h-MINT: Modeling Pocket-Ligand Binding with Hierarchical Molecular Interaction Network

arXiv cs.LG / 4/28/2026

Key Points

  • The paper introduces h-MINT, a hierarchical molecular interaction network aimed at better modeling pocket–ligand binding by capturing the local chemical environments where interactions like H-bonds and π-stacking occur.
  • It proposes OverlapBPE, a data-driven molecule tokenization approach that allows overlapping fragments to better reflect the fuzzy boundaries of small-molecule substructures while preserving richer chemical context.
  • h-MINT leverages the many-to-many atom–fragment mappings produced by OverlapBPE through a hierarchical architecture that jointly models interactions at atom and fragment levels.
  • Evaluation against state-of-the-art methods shows a 2–4% improvement in Pearson/Spearman correlation for binding affinity prediction on PDBBind and LBA, a 1–3% improvement in key virtual screening metrics on DUD-E and LIT-PCBA, and the best overall HTS performance on PubChem assays, while maintaining good generalization.
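
To make the many-to-many atom–fragment mapping concrete, here is a minimal Python sketch. It is not the paper's OverlapBPE algorithm or the h-MINT architecture: the 7-atom molecule, the hand-picked fragments, the scalar atom features, and the mean-pooling helpers (`atom_to_fragments`, `pool_atoms_to_fragments`, `broadcast_fragments_to_atoms`) are all hypothetical and chosen only to show how overlapping fragments let one atom belong to several fragments, and how information could in principle flow between the atom and fragment levels.

```python
from collections import defaultdict

# Toy ligand with atom indices 0..6. The fragments below are hand-picked
# for illustration; they are NOT the output of OverlapBPE.
fragments = {
    "ring": [0, 1, 2, 3],
    "linker": [3, 4],      # shares atom 3 with "ring"
    "amide": [4, 5, 6],    # shares atom 4 with "linker"
}


def atom_to_fragments(fragments):
    """Invert fragment -> atoms into atom -> fragments (many-to-many)."""
    mapping = defaultdict(list)
    for name, atoms in fragments.items():
        for atom in atoms:
            mapping[atom].append(name)
    return dict(mapping)


def pool_atoms_to_fragments(atom_feats, fragments):
    """Mean-pool scalar atom features into one value per fragment.

    Mean pooling is an assumption made for this sketch only.
    """
    return {
        name: sum(atom_feats[a] for a in atoms) / len(atoms)
        for name, atoms in fragments.items()
    }


def broadcast_fragments_to_atoms(frag_feats, a2f):
    """Give each atom the mean feature of every fragment it belongs to."""
    return {
        atom: sum(frag_feats[f] for f in names) / len(names)
        for atom, names in a2f.items()
    }


a2f = atom_to_fragments(fragments)

# Atoms that sit in more than one fragment, i.e. on a "fuzzy" boundary.
shared = sorted(atom for atom, names in a2f.items() if len(names) > 1)

# Made-up scalar features, one per atom.
atom_feats = [1.0, 1.0, 1.0, 3.0, 5.0, 2.0, 2.0]
frag_feats = pool_atoms_to_fragments(atom_feats, fragments)
atom_ctx = broadcast_fragments_to_atoms(frag_feats, a2f)

print("shared atoms:", shared)
print("fragment features:", frag_feats)
print("atom context:", atom_ctx)
```

With a non-overlapping tokenization every atom would map to exactly one fragment, so `shared` would be empty and each atom would see only a single fragment's context. Allowing overlaps is what turns the mapping into a many-to-many relation that the model has to handle.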

Abstract

Accurate molecular representations are critical for drug discovery. A central challenge is capturing the chemical environment of molecular fragments, because key interactions such as H-bonds and π-stacking occur only under specific local conditions. Most existing approaches represent molecules as atom-level graphs; however, atom-level representations struggle to express higher-order chemical context (e.g., stereochemistry, lone pairs, conjugation). Fragment-based methods (e.g., principal subgraph, predefined functional groups) fail to preserve essential information such as chirality, aromaticity, and ionic states. This work addresses these limitations in two ways. (i) OverlapBPE tokenization. We propose a novel data-driven molecule tokenization method. Unlike existing approaches, our method allows overlapping fragments, reflecting the inherently fuzzy boundaries of small-molecule substructures; combined with enriched chemical information at the token level, this preserves a more complete chemical context. (ii) h-MINT model. OverlapBPE induces many-to-many atom-fragment mappings, which necessitate a new hierarchical architecture. We therefore develop a hierarchical molecular interaction network that jointly models interactions at both the atom and fragment levels and, by supporting fragment overlaps, naturally accommodates these mappings. Extensive evaluation against state-of-the-art methods shows that our method improves binding affinity prediction by 2-4% in Pearson/Spearman correlation on PDBBind and LBA, enhances virtual screening by 1-3% in key metrics on DUD-E and LIT-PCBA, and achieves the best overall HTS performance on PubChem assays. Further analysis demonstrates that our method effectively captures interactive information while maintaining good generalization.
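
The binding affinity results are reported as Pearson and Spearman correlations between predicted and measured affinities. As a reminder of what those two metrics measure, the following is a small, dependency-free Python sketch. The affinity values are made up for illustration and are not taken from the paper, and the summary does not say whether the reported 2-4% gains are absolute or relative, so nothing here should be read as reproducing them.

```python
import math


def pearson(x, y):
    """Pearson correlation: linear agreement between two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical measured vs. predicted affinities (e.g. in pK units).
y_true = [4.2, 5.1, 6.3, 7.0, 8.4]
y_pred = [4.0, 5.5, 6.0, 7.8, 8.1]

print("Pearson:", round(pearson(y_true, y_pred), 3))
print("Spearman:", round(spearman(y_true, y_pred), 3))
```

Pearson rewards predictions that track the measured values linearly, while Spearman only asks that the ordering of compounds be preserved, which is often the more relevant property when ranking candidates.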