Feature Selection and Dimension Reduction for Single Cell RNA-Seq based on a Multinomial Model

By F. William Townes, Stephanie C Hicks, Martin Aryee, Rafael A. Irizarry

Posted 11 Mar 2019
bioRxiv DOI: 10.1101/574574

Single cell RNA-Seq (scRNA-Seq) profiles gene expression of individual cells. Recent scRNA-Seq datasets have incorporated unique molecular identifiers (UMIs). Using negative controls, we show UMI counts follow multinomial sampling with no zero-inflation. Current normalization procedures such as log of counts per million and feature selection by highly variable genes produce false variability in dimension reduction. We propose simple multinomial methods, including generalized principal component analysis (GLM-PCA) for non-normal distributions, and feature selection using deviance. These methods outperform current practice in a downstream clustering assessment using ground-truth datasets.
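
The deviance-based feature selection mentioned above can be illustrated with a minimal sketch. The abstract does not state the formula, so the version below is an assumption: it scores each gene by a binomial deviance against a null model in which the gene has the same relative abundance in every cell, and ranks genes by that score. The function names (binomial_deviance, top_genes_by_deviance) and the toy count matrix are hypothetical and are not taken from the authors' software.

```python
import math


def _xlogy(x, y):
    """Return x * log(x / y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x / y)


def binomial_deviance(counts):
    """Per-gene binomial deviance against a constant-abundance null.

    counts: list of rows, one row per cell, each row a list of UMI counts
    per gene. Returns one deviance value per gene. (Assumed formulation.)
    """
    n_genes = len(counts[0])
    cell_totals = [sum(row) for row in counts]
    grand_total = sum(cell_totals)
    deviances = []
    for j in range(n_genes):
        gene_total = sum(row[j] for row in counts)
        # Pooled relative abundance of gene j under the null model.
        pi_j = gene_total / grand_total
        if pi_j == 0.0 or pi_j == 1.0:
            # A gene that is absent everywhere (or is the only gene) is uninformative.
            deviances.append(0.0)
            continue
        d = 0.0
        for row, n_i in zip(counts, cell_totals):
            y = row[j]
            d += _xlogy(y, n_i * pi_j) + _xlogy(n_i - y, n_i * (1.0 - pi_j))
        deviances.append(2.0 * d)
    return deviances


def top_genes_by_deviance(counts, k):
    """Indices of the k genes with the largest deviance, highest first."""
    dev = binomial_deviance(counts)
    return sorted(range(len(dev)), key=lambda j: dev[j], reverse=True)[:k]


# Toy matrix: 4 cells x 3 genes; gene 2 switches on and off between cells.
toy_counts = [
    [5, 5, 0],
    [5, 5, 10],
    [5, 5, 0],
    [5, 5, 10],
]
toy_deviances = binomial_deviance(toy_counts)
toy_top = top_genes_by_deviance(toy_counts, 1)
```

In this sketch a gene whose share of each cell's total is constant gets a deviance of zero, so genes that deviate most from constant expression rank first. No log transform or pseudocount is applied to the raw counts.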
