Multi-objective Genetic Programming with Multi-view Multi-level Feature for Enhanced Protein Secondary Structure Prediction

arXiv cs.LG / 3/16/2026

Key Points

  • The paper introduces MOGP-MMF, a multi-objective genetic programming framework that reframes protein secondary structure prediction (PSSP) as an automated feature selection and fusion optimization problem.
  • It employs a multi-view, multi-level representation (evolutionary, semantic, and structural views) and an enhanced operator set to evolve both linear and nonlinear feature fusion functions, capturing high-order interactions while managing fusion complexity.
  • A knowledge transfer mechanism leverages prior evolutionary experience to guide the population toward global optima, addressing the accuracy–complexity trade-off.
  • Experiments on seven benchmark datasets show that MOGP-MMF outperforms state-of-the-art methods, particularly in Q8 accuracy and structural integrity, and yields a diverse set of non-dominated solutions; the authors also provide source code on GitHub for reproducibility.
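
To make "non-dominated solutions" concrete, here is a minimal sketch of Pareto filtering over two objectives: accuracy (higher is better) and fusion complexity (lower is better). The `(accuracy, complexity)` encoding, the function names, and the numbers are illustrative assumptions, not taken from the paper or its repository.

```python
from typing import List, Tuple

# Hypothetical encoding of one evolved solution: (accuracy, complexity).
# Accuracy is maximized; complexity (e.g. expression size) is minimized.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    acc_a, size_a = a
    acc_b, size_b = b
    no_worse = acc_a >= acc_b and size_a <= size_b
    strictly_better = acc_a > acc_b or size_a < size_b
    return no_worse and strictly_better


def non_dominated(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates (the Pareto front)."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up values, for illustration only.
candidates = [(0.70, 12), (0.72, 30), (0.68, 12), (0.72, 45), (0.65, 5)]
front = non_dominated(candidates)
print(front)
```

Each point left on the front is a different compromise: a practitioner who needs a compact fusion function can pick the low-complexity end, while one who wants the highest accuracy can accept a larger expression.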

Abstract

Predicting protein secondary structure is essential for understanding protein function and advancing drug discovery. However, the intricate sequence-structure relationship poses significant challenges for accurate modeling. To address these challenges, we propose MOGP-MMF, a multi-objective genetic programming framework that reformulates PSSP as an automated optimization task focused on feature selection and fusion. Specifically, MOGP-MMF introduces a multi-view multi-level representation strategy that integrates evolutionary, semantic, and newly introduced structural views to capture the comprehensive protein folding logic. Leveraging an enriched operator set, the framework evolves both linear and nonlinear fusion functions, effectively capturing high-order feature interactions while reducing fusion complexity. To resolve the accuracy-complexity trade-off, an improved multi-objective GP algorithm is developed, incorporating a knowledge transfer mechanism that utilizes prior evolutionary experience to guide the population toward global optima. Extensive experiments across seven benchmark datasets demonstrate that MOGP-MMF surpasses state-of-the-art methods, particularly in Q8 accuracy and structural integrity. Furthermore, MOGP-MMF generates a diverse set of non-dominated solutions, offering flexible model selection schemes for various practical application scenarios. The source code is available on GitHub: https://github.com/qian-ann/MOGP-MMF/tree/main.
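
For readers unfamiliar with the Q8 metric named above, it is conventionally the fraction of residues whose eight-state secondary structure label is predicted correctly. The sketch below assumes labels are given as equal-length strings over the eight DSSP classes, with coil written as "C"; the function name and the example sequences are hypothetical, and this is not the evaluation code from the paper's repository.

```python
# Eight-state alphabet; writing coil as "C" is an assumption of this sketch
# (datasets also use "L" or "-" for the same class).
Q8_STATES = set("HGIEBTSC")


def q8_accuracy(predicted: str, observed: str) -> float:
    """Per-residue accuracy over eight-state secondary structure labels."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    unknown = (set(predicted) | set(observed)) - Q8_STATES
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    # Count positions where the predicted class matches the observed class.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Made-up labels for an eight-residue fragment.
observed = "CHHHHTEE"
predicted = "CHHHGTEC"
score = q8_accuracy(predicted, observed)
print(score)  # 6 of 8 residues match -> 0.75
```

Benchmark results are usually reported by pooling residues across all proteins in a dataset; whether the paper pools or averages per protein is not stated in the abstract.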