Rxivist logo

Noise regularization removes correlation artifacts in single-cell RNA-seq data preprocessing

By Ruoyu Zhang, Gurinder S. Atwal, Wei Keat Lim

Posted 30 Jul 2020
bioRxiv DOI: 10.1101/2020.07.29.227546

With the rapid advancement of single-cell RNA-seq (scRNA-seq) technology, many data preprocessing methods have been proposed to address numerous systematic errors and technical variabilities inherent in this technology. While these methods have been demonstrated to be effective in recovering individual gene expression, the suitability to the inference of gene-gene associations and subsequent gene networks reconstruction have not been systemically investigated. In this study, we benchmarked five representative scRNA-seq normalization/imputation methods on human cell atlas bone marrow data with respect to their impact on inferred gene-gene associations. Our results suggested that a considerable amount of spurious correlations was introduced during the data preprocessing steps due to over-smoothing of the raw data. We proposed a model-agnostic noise regularization method that can effectively eliminate the correlation artifacts. The noise regularized gene-gene correlations were further used to reconstruct gene co-expression network and successfully revealed several known immune cell modules. ### Competing Interest Statement All authors are employees of Regeneron Pharmaceuticals, and may hold stock and/or stock options in the company as part of their compensation

Download data

  • Downloaded 115 times
  • Download rankings, all-time:
    • Site-wide: 91,223 out of 100,838
    • In bioinformatics: 8,671 out of 9,255
  • Year to date:
    • Site-wide: 54,562 out of 100,838
  • Since beginning of last month:
    • Site-wide: 29,567 out of 100,838

Altmetric data

Downloads over time

Distribution of downloads per paper, site-wide


Sign up for the Rxivist weekly newsletter! (Click here for more details.)


  • 20 Oct 2020: Support for sorting preprints using Twitter activity has been removed, at least temporarily, until a new source of social media activity data becomes available.
  • 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
  • 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
  • 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
  • 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
  • 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
  • 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
  • 22 Jan 2019: Nature just published an article about Rxivist and our data.
  • 13 Jan 2019: The Rxivist preprint is live!