SynLeaF: A Dual-Stage Multimodal Fusion Framework for Synthetic Lethality Prediction Across Pan- and Single-Cancer Contexts

arXiv cs.AI / 3/25/2026

Key Points

  • The study introduces SynLeaF, a dual-stage multimodal fusion framework designed to improve synthetic lethality (SL) prediction by effectively combining heterogeneous omics data across both pan-cancer and single-cancer settings.
  • SynLeaF uses a VAE-based cross-encoder with a product-of-experts approach to fuse four omics modalities (gene expression, mutation, methylation, and CNV) while also leveraging a relational graph convolutional network over biomedical knowledge graphs.
  • To address “modality laziness,” in which modalities that converge at different speeds leave complementary information underused, the method applies dual-stage training with feature-level knowledge distillation, using adaptive uni-modal teachers and an ensemble strategy to balance convergence across modalities.
  • Experiments spanning eight cancer types plus a pan-cancer dataset show SynLeaF outperforming prior approaches in 17 out of 19 scenarios, with ablation studies and gradient analyses supporting the contribution of the fusion and distillation components to robustness and generalization.
  • The authors provide a community-accessible web server for using SynLeaF for SL prediction (https://synleaf.bioinformatics-lilab.cn).
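
The product-of-experts fusion mentioned in the key points can be illustrated with a minimal sketch. It assumes each modality encoder outputs a diagonal Gaussian (a mean and a log-variance per latent dimension) and that a standard normal prior is included as an extra expert. Both are common conventions in multimodal VAEs, not details confirmed by the summary. The function name `product_of_experts`, the `include_prior` flag, and the toy values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def product_of_experts(
    mus: Sequence[Sequence[float]],
    logvars: Sequence[Sequence[float]],
    include_prior: bool = True,
) -> Tuple[List[float], List[float]]:
    """Fuse diagonal Gaussian experts into one joint Gaussian.

    The product of Gaussians is again Gaussian: precisions (1 / variance)
    add up, and the joint mean is the precision-weighted average of means.
    Returns (joint_mu, joint_logvar).
    """
    dim = len(mus[0])
    # Assumption: a standard normal prior expert adds precision 1 and mean 0.
    precision_sum = [1.0 if include_prior else 0.0] * dim
    weighted_mean_sum = [0.0] * dim

    for mu, logvar in zip(mus, logvars):
        for d in range(dim):
            precision = math.exp(-logvar[d])  # 1 / variance
            precision_sum[d] += precision
            weighted_mean_sum[d] += mu[d] * precision

    joint_mu = [m / p for m, p in zip(weighted_mean_sum, precision_sum)]
    joint_logvar = [-math.log(p) for p in precision_sum]
    return joint_mu, joint_logvar


# Hypothetical 2-D posteriors (mean, log-variance) for the four omics modalities.
modalities = {
    "expression": ([0.5, -1.0], [0.0, 0.0]),
    "mutation": ([0.2, -0.8], [1.0, 1.0]),
    "methylation": ([0.7, -1.2], [0.5, 0.5]),
    "cnv": ([0.4, -0.9], [2.0, 2.0]),
}
joint_mu, joint_logvar = product_of_experts(
    [mu for mu, _ in modalities.values()],
    [logvar for _, logvar in modalities.values()],
)
print(joint_mu, joint_logvar)
```

In this formulation, a modality with low variance (high confidence) pulls the joint mean toward its own estimate, while a high-variance modality contributes little.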

Abstract

Accurate prediction of synthetic lethality (SL) is important for guiding the development of cancer drugs and therapies. SL prediction faces significant challenges in the effective fusion of heterogeneous multi-source data. Existing multimodal methods often suffer from "modality laziness" due to disparate convergence speeds, which hinders the exploitation of complementary information. This is also one reason why most existing SL prediction models cannot perform well on both pan-cancer and single-cancer SL pair prediction. In this study, we propose SynLeaF, a dual-stage multimodal fusion framework for SL prediction across pan- and single-cancer contexts. The framework employs a VAE-based cross-encoder with a product-of-experts mechanism to fuse four omics data types (gene expression, mutation, methylation, and CNV), while simultaneously utilizing a relational graph convolutional network to capture structured gene representations from biomedical knowledge graphs. To mitigate modality laziness, SynLeaF introduces a dual-stage training mechanism employing feature-level knowledge distillation with adaptive uni-modal teacher and ensemble strategies. In extensive experiments across eight specific cancer types and a pan-cancer dataset, SynLeaF achieves superior performance in 17 out of 19 scenarios. Ablation studies and gradient analyses further validate the critical contributions of the proposed fusion and distillation mechanisms to model robustness and generalization. To facilitate community use, a web server is available at https://synleaf.bioinformatics-lilab.cn.
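
The idea behind feature-level knowledge distillation with uni-modal teachers can be sketched as follows. The abstract does not specify how the teachers are made adaptive, so the weighting below is purely a hypothetical illustration: each modality's student feature is pulled toward the feature of a frozen uni-modal teacher, and modalities with a larger student-teacher gap receive a larger weight through a softmax. The names `feature_distillation_loss` and `temperature` are assumptions, not the authors' API.

```python
import math
from typing import Dict, List, Tuple


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def feature_distillation_loss(
    student_feats: Dict[str, List[float]],
    teacher_feats: Dict[str, List[float]],
    temperature: float = 1.0,
) -> Tuple[float, Dict[str, float]]:
    """Weighted feature-matching loss across modalities.

    Hypothetical weighting scheme (not taken from the paper): a softmax over
    per-modality gaps, so lagging modalities get more of the training signal.
    Returns (loss, weights).
    """
    gaps = {m: mse(student_feats[m], teacher_feats[m]) for m in student_feats}

    # Numerically stable softmax over the gaps.
    max_gap = max(gaps.values())
    exps = {m: math.exp((g - max_gap) / temperature) for m, g in gaps.items()}
    total = sum(exps.values())
    weights = {m: e / total for m, e in exps.items()}

    loss = sum(weights[m] * gaps[m] for m in gaps)
    return loss, weights


# Toy features: the "mutation" branch of the student lags behind its teacher.
student = {"expression": [0.9, 0.1], "mutation": [0.0, 0.0]}
teacher = {"expression": [1.0, 0.0], "mutation": [1.0, 1.0]}
loss, weights = feature_distillation_loss(student, teacher)
print(loss, weights)
```

In an actual training loop, a term like this would be added to the SL prediction loss during the second training stage, with the teacher features held fixed.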