

Gene Expression Distribution Deconvolution in Single Cell RNA Sequencing

By Jingshu Wang, Mo Huang, Eduardo Torre, Hannah Dueck, Sydney Shaffer, John Murray, Arjun Raj, Mingyao Li, Nancy R Zhang

Posted 30 Nov 2017
bioRxiv DOI: 10.1101/227033 (published DOI: 10.1073/pnas.1721085115)

Single-cell RNA sequencing (scRNA-seq) enables the quantification of each gene's expression distribution across cells, thus allowing the assessment of the dispersion, burstiness, and other aspects of its distribution beyond the mean. These statistical characterizations of the gene expression distribution are critical for understanding expression variation and for selecting marker genes for population heterogeneity. However, scRNA-seq data is noisy, with each cell typically sequenced at low coverage, thus making it difficult to infer properties of the gene expression distribution from raw counts. Based on a re-examination of 9 public data sets, we propose a simple technical noise model for scRNA-seq data with Unique Molecular Identifiers (UMI). We develop DESCEND, a method that deconvolves the true cross-cell gene expression distribution from observed scRNA-seq counts, leading to improved estimates of properties of the distribution such as dispersion and burstiness. DESCEND can adjust for cell-level covariates such as cell size, cell cycle and batch effects. DESCEND's noise model and estimation accuracy are further evaluated through comparisons to RNA FISH data, through data splitting and simulations, and through its effectiveness in removing known batch effects. We demonstrate how DESCEND can clarify and improve downstream analyses such as finding differentially bursty genes, identifying cell types, and selecting differentiation markers.
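To illustrate why raw counts overstate dispersion, here is a minimal sketch of a moment-based correction. It is not DESCEND itself, and the abstract does not spell out the noise model. The sketch assumes, purely for illustration, that the observed UMI count in each cell is Poisson with mean equal to a fixed capture efficiency times the true expression, and that true expression follows a Gamma distribution. All function names and parameter values are hypothetical. Under that assumption Var(Y) = E[Y] + efficiency² · Var(true), so subtracting the mean from the observed variance recovers the squared coefficient of variation of the true distribution without knowing the efficiency.

```python
import math
import random
from statistics import fmean, pvariance


def sample_poisson(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for the small means used in this sketch.
    """
    threshold = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_counts(rng, n_cells, shape, scale, efficiency):
    """Simulate true expression (Gamma) and observed UMI counts (Poisson).

    Both distributions are illustrative assumptions, not taken from the paper.
    """
    true_expr = [rng.gammavariate(shape, scale) for _ in range(n_cells)]
    observed = [sample_poisson(rng, efficiency * x) for x in true_expr]
    return true_expr, observed


def cv2(values):
    """Squared coefficient of variation: variance / mean^2."""
    m = fmean(values)
    return pvariance(values) / (m * m)


def deconvolved_cv2(counts):
    """Moment-based CV^2 of the underlying expression.

    Removes the Poisson sampling variance (equal to the mean) from the
    observed variance; clipped at zero because the estimate can go negative.
    """
    m = fmean(counts)
    return max(pvariance(counts) - m, 0.0) / (m * m)


if __name__ == "__main__":
    rng = random.Random(0)
    # Gamma with shape 2 has a theoretical CV^2 of 1/shape = 0.5.
    true_expr, observed = simulate_counts(
        rng, n_cells=20000, shape=2.0, scale=10.0, efficiency=0.1
    )
    print("true CV^2:       ", cv2(true_expr))
    print("raw count CV^2:  ", cv2(observed))
    print("corrected CV^2:  ", deconvolved_cv2(observed))
```

With these hypothetical settings the raw-count CV² should land well above the true value, while the corrected estimate should sit close to it. A full deconvolution method estimates the whole distribution rather than a single moment, which this sketch does not attempt.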

Download data

  • Downloaded 1,793 times
  • Download rankings, all-time:
    • Site-wide: 3,229 out of 72,919
    • In genetics: 273 out of 4,008
  • Year to date:
    • Site-wide: 27,174 out of 72,919
  • Since beginning of last month:
    • Site-wide: 27,174 out of 72,919


