DPD-Cancer: Explainable Graph-based Deep Learning for Small Molecule Anti-Cancer Activity Prediction

arXiv cs.AI / 3/30/2026

Key Points

  • The paper introduces DPD-Cancer, a graph attention transformer–based deep learning framework for predicting small-molecule anti-cancer activity, including both classification and cell-line–specific growth inhibition (pGI50).
  • It targets limitations of prior drug-response models that struggle with non-linear relationships between molecular structure and cellular context across heterogeneous cancer cell lines.
  • Benchmarked against pdCSM-cancer, ACLPred, and MLASM, DPD-Cancer reports an AUC of up to 0.87 on strictly partitioned NCI60 data and up to 0.98 on the ACLPred/MLASM datasets.
  • For pGI50 prediction across 10 cancer types and 73 cell lines, it achieves Pearson correlations up to 0.72 on independent test sets.
  • The work emphasizes interpretability by using the attention mechanism to identify and visualize relevant molecular substructures, and it provides a freely accessible web server for lead optimization.
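
The key points above quote three quantities: pGI50, ROC AUC, and Pearson correlation. The sketch below shows how each is commonly computed, using only the Python standard library. It is illustrative, not the authors' evaluation code: the function names are hypothetical, the toy numbers are made up for the example, and the definition pGI50 = -log10(GI50 in mol/L) is the usual convention, assumed here to match the paper's.

```python
import math

def pgi50(gi50_molar):
    """Convert a GI50 concentration in mol/L to pGI50 = -log10(GI50).

    Assumption: this is the conventional definition; the paper may
    preprocess its growth-inhibition values differently.
    """
    if gi50_molar <= 0:
        raise ValueError("GI50 must be a positive concentration")
    return -math.log10(gi50_molar)

def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen active compound
    is scored above a randomly chosen inactive one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pearson_r(x, y):
    """Pearson correlation between measured and predicted values."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy data, invented purely for illustration.
print(pgi50(1e-6))                                      # 1 micromolar GI50
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))      # classification
print(pearson_r([5.1, 6.0, 7.2], [5.4, 5.9, 6.8]))      # pGI50 regression
```

The pairwise form of AUC used here is quadratic in the number of compounds, which is fine for a demonstration. For datasets the size of NCI60, a rank-based implementation from an established library would be the more practical choice.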

Abstract

Accurate drug response prediction is a critical bottleneck in computational biochemistry, limited by the challenge of modelling the interplay between molecular structure and cellular context. In cancer research, this is acute due to tumour heterogeneity and genomic variability, which hinder the identification of effective therapies. Conventional approaches often fail to capture non-linear relationships between chemical features and biological outcomes across diverse cell lines. To address this, we introduce DPD-Cancer, a deep learning method based on a Graph Attention Transformer (GAT) framework. It is designed for small molecule anti-cancer activity classification and the quantitative prediction of cell-line specific responses, specifically growth inhibition concentration (pGI50). Benchmarked against state-of-the-art methods (pdCSM-cancer, ACLPred, and MLASM), DPD-Cancer demonstrated superior performance, achieving an Area Under ROC Curve (AUC) of up to 0.87 on strictly partitioned NCI60 data and up to 0.98 on ACLPred/MLASM datasets. For pGI50 prediction across 10 cancer types and 73 cell lines, the model achieved Pearson's correlation coefficients of up to 0.72 on independent test sets. These findings confirm that attention-based mechanisms offer significant advantages in extracting meaningful molecular representations, establishing DPD-Cancer as a competitive tool for prioritising drug candidates. Furthermore, DPD-Cancer provides explainability by leveraging the attention mechanism to identify and visualise specific molecular substructures, offering actionable insights for lead optimisation. DPD-Cancer is freely available as a web server at: https://biosig.lab.uq.edu.au/dpd_cancer/.
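
To make the explainability claim concrete, the following is a minimal sketch of how attention weights on a molecular graph can be turned into per-atom importance scores. It implements a generic single-head graph attention layer in the style of the original GAT formulation (LeakyReLU-scored, softmax-normalised over each atom's neighbourhood), not the DPD-Cancer architecture, whose details are not given in the abstract. The function names, the three-atom toy graph, and the weights `W` and `a` are all assumptions chosen for illustration.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * v for w, v in zip(row, h)) for row in W]

def gat_attention(features, neighbors, W, a):
    """Return alpha, where alpha[i][j] is the attention atom i pays to atom j.

    Each atom attends to its bonded neighbours and to itself; the weights
    for each atom are softmax-normalised, so every row sums to 1.
    """
    z = [matvec(W, h) for h in features]           # projected atom features
    alpha = []
    for i, nbrs in enumerate(neighbors):
        targets = sorted(set(nbrs) | {i})          # neighbourhood plus self-loop
        # Score each pair from the concatenated projections [z_i || z_j].
        logits = [leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
                  for j in targets]
        m = max(logits)                            # subtract max for stability
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        alpha.append({j: e / total for j, e in zip(targets, exps)})
    return alpha

def atom_importance(alpha):
    """Sum the attention each atom receives from all atoms (itself included)."""
    scores = [0.0] * len(alpha)
    for row in alpha:
        for j, w in row.items():
            scores[j] += w
    return scores

# Hypothetical toy molecule: a C-C-O chain with one-hot element features.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
neighbors = [[1], [0, 2], [1]]                     # bond adjacency lists
W = [[0.5, -0.2], [0.1, 0.8]]                      # arbitrary, untrained weights
a = [0.3, -0.1, 0.2, 0.4]                          # arbitrary attention vector

alpha = gat_attention(features, neighbors, W, a)
print(atom_importance(alpha))
```

In a trained model, atoms with high received attention would be the candidates to highlight as relevant substructures. With the untrained weights above, the scores carry no chemical meaning and only demonstrate the mechanics.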