
Regulatory network-based imputation of dropouts in single-cell RNA sequencing data

By Ana Carolina Leote, Xiaohui Wu, Andreas Beyer

Posted 22 Apr 2019
bioRxiv DOI: 10.1101/611517

Single-cell RNA sequencing (scRNA-seq) methods typically cannot quantify the expression levels of all genes in a cell, creating a need for computational prediction of missing values ('dropout imputation'). Most existing dropout imputation methods are limited in that they use only the scRNA-seq dataset at hand and do not exploit external gene-gene relationship information. Here, we show that a transcriptional regulatory network learned from external, independent gene expression data improves dropout imputation. Using a variety of human scRNA-seq datasets, we demonstrate that our network-based approach outperforms published state-of-the-art methods. The network-based approach performs particularly well for lowly expressed genes, including cell-type-specific transcriptional regulators. Additionally, we tested a baseline approach in which we imputed missing values using the sample-wide average expression of a gene. Unexpectedly, up to 48% of the genes were better predicted using this baseline approach, suggesting negligible cell-to-cell variation of expression levels for many genes. Our work shows that there is no single best imputation method; rather, the best method depends on gene-specific features, such as expression level and expression variation across cells. We thus implemented an R package called ADImpute (available from https://github.com/anacarolinaleote/ADImpute) that automatically determines the best imputation method for each gene in a dataset.
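
The idea of choosing an imputation method per gene can be illustrated with a minimal Python sketch. It is not the ADImpute implementation, and everything in it is an assumption made for illustration: the function name `pick_method_per_gene`, the hold-out scheme (hiding a fraction of each gene's observed values and comparing mean squared errors), the `weights` matrix standing in for coefficients of a regulatory network learned from external data, and the toy data itself.

```python
import random
from statistics import mean


def mse(truth, pred):
    """Mean squared error between two equally long sequences."""
    return mean((t - p) ** 2 for t, p in zip(truth, pred))


def pick_method_per_gene(expr, weights, mask_frac=0.2, seed=0):
    """Pick 'baseline' or 'network' for each gene (hypothetical helper).

    expr    -- genes x cells expression values; 0 is treated as a dropout
    weights -- weights[g][h] is an assumed coefficient of gene h when
               predicting gene g (stand-in for an externally learned network)
    """
    rng = random.Random(seed)
    n_genes, n_cells = len(expr), len(expr[0])
    choices = []
    for g in range(n_genes):
        observed = [c for c in range(n_cells) if expr[g][c] > 0]
        if len(observed) < 2:
            # Too few observed values to evaluate; fall back to the baseline.
            choices.append("baseline")
            continue
        # Hide part of the observed values and try to recover them.
        n_hide = max(1, int(mask_frac * len(observed)))
        hidden = rng.sample(observed, n_hide)
        hidden_set = set(hidden)
        kept = [c for c in observed if c not in hidden_set]
        truth = [expr[g][c] for c in hidden]
        # Baseline: average expression of the gene across the remaining cells.
        gene_mean = mean(expr[g][c] for c in kept)
        baseline_pred = [gene_mean] * len(hidden)
        # Network: weighted sum of the other genes in the same cell.
        network_pred = [
            sum(weights[g][h] * expr[h][c] for h in range(n_genes) if h != g)
            for c in hidden
        ]
        better_network = mse(truth, network_pred) < mse(truth, baseline_pred)
        choices.append("network" if better_network else "baseline")
    return choices


# Toy data (invented): gene 0 is a variable regulator, gene 1 follows it,
# gene 2 barely varies from cell to cell.
data_rng = random.Random(1)
regulator = [data_rng.uniform(1, 10) for _ in range(200)]
target = [2 * r + data_rng.gauss(0, 0.1) for r in regulator]
stable = [5 + data_rng.gauss(0, 0.1) for _ in range(200)]
expr = [regulator, target, stable]
weights = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

choices = pick_method_per_gene(expr, weights)
print(choices)
```

In this toy setting the gene driven by the regulator is assigned the network predictor, while the nearly constant gene and the regulator itself (which has no predictors in the assumed network) are assigned the baseline. This mirrors the abstract's point that the best method depends on gene-specific features.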

Download data

  • Downloaded 1,303 times
  • Download rankings, all-time:
    • Site-wide: 19,894
    • In bioinformatics: 2,218
  • Year to date:
    • Site-wide: 20,003
  • Since beginning of last month:
    • Site-wide: 46,995
