Multi-RF Fusion with Multi-GNN Blending for Molecular Property Prediction
arXiv cs.AI / 2026/3/24
Key points
- The arXiv paper proposes Multi-RF Fusion with Multi-GNN Blending for molecular property prediction, reporting a test ROC-AUC of 0.8476 ± 0.0002 on the ogbg-molhiv benchmark (10 seeds) and claiming #1 on the OGB leaderboard over HyperFusion.
- The approach rank-averages an ensemble of 12 Random Forests trained on a 4,263-dimensional concatenated fingerprint vector (FCFP, ECFP, MACCS, and atom pairs), then blends in deep-ensembled GNN predictions at 12% weight.
- Two key improvements are highlighted. First, setting max_features = 0.20 for the Random Forests (instead of the default sqrt(d)) raises AUC by about 0.008 on a scaffold split.
- Second, averaging GNN outputs across 10 seeds before blending effectively eliminates GNN seed variance and reduces the standard deviation of the final score from 0.0008 to 0.0002.
- The results are achieved without external data or any pre-training, emphasizing the effectiveness of the ensemble/blending and tuning strategy.
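
The rank-averaging and blending steps described in the key points can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (`rank_normalize`, `rank_average`, `blend`) are hypothetical, and it assumes that the GNN scores are averaged across seeds first and then blended on the same normalized-rank scale as the Random Forest scores. The summary does not say whether the blend is done on ranks or on raw probabilities.

```python
import numpy as np

def rank_normalize(scores):
    """Map scores to ranks scaled to [0, 1].

    Ties are broken by position, which is a simplification; a tie-aware
    ranking would assign tied scores their average rank.
    """
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="stable")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(len(scores))
    return ranks / max(len(scores) - 1, 1)

def rank_average(model_scores):
    """Average normalized ranks over models.

    model_scores has shape (n_models, n_samples).
    """
    return np.mean([rank_normalize(s) for s in model_scores], axis=0)

def blend(rf_scores, gnn_scores_by_seed, gnn_weight=0.12):
    """Blend a rank-averaged RF ensemble with seed-averaged GNN scores.

    Assumption: the GNN component is also converted to normalized ranks
    before the weighted sum.
    """
    rf_part = rank_average(rf_scores)
    # Average across GNN seeds first, so seed noise is reduced before blending.
    gnn_mean = np.mean(gnn_scores_by_seed, axis=0)
    gnn_part = rank_normalize(gnn_mean)
    return (1.0 - gnn_weight) * rf_part + gnn_weight * gnn_part

# Synthetic stand-ins for 12 RF models and 10 GNN seeds on 5 molecules.
rng = np.random.default_rng(0)
rf_scores = rng.random((12, 5))
gnn_scores_by_seed = rng.random((10, 5))
final_scores = blend(rf_scores, gnn_scores_by_seed)
```

Because ROC-AUC depends only on the ordering of scores, working with ranks keeps models with differently calibrated outputs on a common scale. If the forests were built with scikit-learn (the summary does not name a library), `RandomForestClassifier(max_features=0.20)` would interpret the float as the fraction of features considered at each split.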
