Optimal gene selection for cell type discrimination in single cell analyses

By Bianca Dumitrascu, Soledad Villar, Dustin G. Mixon, Barbara E. Engelhardt

Posted 04 Apr 2019
bioRxiv DOI: 10.1101/599654

Single-cell technologies characterize complex cell populations across multiple data modalities at unprecedented scale and resolution. Multiomic data for single-cell gene expression, in situ hybridization, or single-cell chromatin states are increasingly available across diverse tissue types. When isolating specific cell types from a sample of dissociated cells, or performing in situ sequencing in collections of heterogeneous cells, one challenging task is to select a small set of informative markers that identify and differentiate specific cell types or cell states as precisely as possible. Given single-cell RNA-seq data and a set of cellular labels to discriminate, scGeneFit selects gene transcript markers that jointly optimize cell label recovery using label-aware compressive classification methods, resulting in a substantially more robust and less redundant set of markers than existing methods. When applied to a data set with a given hierarchy of cell type labels, the markers found by our method enable the recovery of the label hierarchy through a computationally efficient and principled optimization.
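
To make the idea of jointly selecting markers concrete, the following is a minimal sketch of one plausible way to cast label-aware marker selection as a linear program. It is not the authors' implementation: the function name `select_markers`, the `margin` parameter, and the specific constraint form are assumptions made for illustration. Each gene gets a weight between 0 and 1, the weights sum to at most the marker budget, and slack variables penalize pairs of differently labeled cells that the weighted genes fail to separate.

```python
import numpy as np
from scipy.optimize import linprog


def select_markers(X, labels, num_markers, margin=1.0):
    """Pick `num_markers` gene indices that separate differently labeled cells.

    Illustrative sketch only (hypothetical helper, not the scGeneFit API).
    X      : (n_cells, n_genes) expression matrix
    labels : (n_cells,) cell type labels
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n_cells, n_genes = X.shape

    # All pairs of cells that carry different labels.
    pairs = [(i, j)
             for i in range(n_cells)
             for j in range(i + 1, n_cells)
             if labels[i] != labels[j]]
    if not pairs:
        raise ValueError("need at least two cells with different labels")

    # Per-gene squared differences for each such pair: (n_pairs, n_genes).
    D = np.array([(X[i] - X[j]) ** 2 for i, j in pairs])
    n_pairs = D.shape[0]

    # Variables: gene weights alpha (n_genes), then slacks beta (n_pairs).
    # Objective: minimize the total slack.
    c = np.concatenate([np.zeros(n_genes), np.ones(n_pairs)])

    # Separation: D @ alpha + beta >= margin, written as <= for linprog.
    A_sep = np.hstack([-D, -np.eye(n_pairs)])
    b_sep = -margin * np.ones(n_pairs)

    # Budget: sum(alpha) <= num_markers.
    A_budget = np.concatenate([np.ones(n_genes), np.zeros(n_pairs)])[None, :]

    A_ub = np.vstack([A_sep, A_budget])
    b_ub = np.concatenate([b_sep, [num_markers]])
    bounds = [(0, 1)] * n_genes + [(0, None)] * n_pairs

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    # Keep the genes with the largest weights.
    alpha = res.x[:n_genes]
    return np.argsort(-alpha)[:num_markers]
```

Because the weights are chosen together under a shared budget, a gene that separates only pairs already separated by another gene adds little, which is one way to read the abstract's claim of a less redundant marker set. The sketch enumerates every differently labeled pair, so the number of constraints grows quadratically with the number of cells; a practical version would presumably subsample pairs or work with class summaries.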

Download data

  • Downloaded 2,259 times
  • Download rankings, all-time:
    • Site-wide: 2,982 out of 89,211
    • In genomics: 561 out of 5,693
  • Year to date:
    • Site-wide: 5,251 out of 89,211
  • Since beginning of last month:
    • Site-wide: 4,604 out of 89,211
