A Large-Scale Comparative Analysis of Imputation Methods for Single-Cell RNA Sequencing Data

arXiv cs.LG / 3/27/2026

Key Points

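  • The paper notes that zero-inflated sparsity caused by dropout in scRNA-seq distorts expression distributions and compromises downstream analyses, and that comparing the imputation methods designed to correct it has remained an open problem.
  • The authors benchmark 15 methods (7 categories), ranging from traditional to DL-based, across 30 datasets (10 protocols) and 6 downstream tasks, presenting a comprehensive performance evaluation.
  • They report that traditional methods (model-based, smoothing-based, and low-rank matrix-based) generally outperform DL methods such as diffusion-, GAN-, GNN-, and autoencoder-based approaches.
  • They show that high accuracy in numerical gene expression recovery does not necessarily translate into improved biological interpretability, and that performance varies substantially by dataset, protocol, and task, so no single method is consistently best.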

Abstract

Single-cell RNA sequencing (scRNA-seq) is inherently affected by sparsity caused by dropout events, in which expressed genes are recorded as zeros due to technical limitations. These artifacts distort gene expression distributions and can compromise downstream analyses. Numerous imputation methods have been proposed to address this, and these methods encompass a wide range of approaches from traditional statistical models to recently developed deep learning (DL)-based methods. However, their comparative performance remains unclear, as existing benchmarking studies typically evaluate only a limited subset of methods, datasets, and downstream analytical tasks. Here, we present a comprehensive benchmark of 15 scRNA-seq imputation methods spanning 7 methodological categories, including traditional and modern DL-based methods. These methods are evaluated across 30 datasets sourced from 10 experimental protocols and assessed in terms of 6 downstream analytical tasks. Our results show that traditional imputation methods, such as model-based, smoothing-based, and low-rank matrix-based methods, generally outperform DL-based methods, such as diffusion-based, GAN-based, GNN-based, and autoencoder-based methods. In addition, strong performance in numerical gene expression recovery does not necessarily translate into improved biological interpretability in downstream analyses. Furthermore, the performance of imputation methods varies substantially across datasets, protocols, and downstream analytical tasks, and no single method consistently outperforms others across all evaluation scenarios. Together, our results provide practical guidance for selecting imputation methods tailored to specific analytical objectives and highlight the importance of task-specific evaluation when assessing imputation performance in scRNA-seq data analysis.
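To make the notions of dropout and "numerical gene expression recovery" concrete, the following is a minimal sketch and not the paper's benchmark pipeline. It assumes a synthetic Poisson count matrix, a uniform-at-random dropout model (real dropout typically depends on expression level), and a plain truncated SVD as a stand-in for the low-rank matrix-based category. It is not one of the 15 benchmarked methods, and the function names `simulate_dropout`, `low_rank_impute`, and `masked_rmse` are hypothetical.

```python
import numpy as np


def simulate_dropout(counts, rate, rng):
    """Zero out a random fraction of the nonzero entries (toy dropout model)."""
    mask = (counts > 0) & (rng.random(counts.shape) < rate)
    observed = counts.copy()
    observed[mask] = 0
    return observed, mask


def low_rank_impute(observed, rank):
    """Rank-k truncated SVD reconstruction of log1p counts, clipped at zero."""
    logged = np.log1p(observed)
    u, s, vt = np.linalg.svd(logged, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return np.clip(approx, 0.0, None)


def masked_rmse(truth, estimate_log, mask):
    """RMSE on the log1p scale, restricted to the entries that were dropped."""
    diff = np.log1p(truth[mask]) - estimate_log[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


rng = np.random.default_rng(0)

# Assumption: 200 cells x 50 genes driven by 3 latent expression programs.
cell_factors = rng.gamma(2.0, 1.0, size=(200, 3))
gene_factors = rng.gamma(2.0, 1.0, size=(3, 50))
truth = rng.poisson(cell_factors @ gene_factors).astype(float)

observed, mask = simulate_dropout(truth, rate=0.3, rng=rng)
imputed = low_rank_impute(observed, rank=3)

# Compare recovery error on the dropped entries before and after imputation.
rmse_before = masked_rmse(truth, np.log1p(observed), mask)
rmse_after = masked_rmse(truth, imputed, mask)
print(f"masked RMSE without imputation: {rmse_before:.3f}")
print(f"masked RMSE with rank-3 SVD:    {rmse_after:.3f}")
```

A score like this covers only numerical recovery. As the abstract stresses, a lower masked error does not by itself indicate better clustering, differential expression, or other downstream results, which would need their own task-specific metrics.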