dropClust: Efficient clustering of ultra-large scRNA-seq data

By Debajyoti Sinha, Akhilesh Kumar, Himanshu Kumar, Sanghamitra Bandyopadhyay, Debarka Sengupta

Posted 31 Jul 2017
bioRxiv DOI: 10.1101/170308 (published DOI: 10.1093/nar/gky007)

Droplet-based single-cell transcriptomics has recently enabled parallel screening of tens of thousands of single cells. Clustering methods that scale to such high-dimensional data without compromising accuracy are scarce. We exploit Locality Sensitive Hashing, an approximate nearest neighbor search technique, to develop a de novo clustering algorithm for large-scale single-cell data. On a number of real datasets, dropClust outperformed the existing best-practice methods in terms of execution time, clustering accuracy and detectability of minor cell sub-types.
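To illustrate the general idea behind Locality Sensitive Hashing mentioned in the abstract, here is a minimal sketch of one common LSH family, random-hyperplane hashing for cosine similarity. It is not dropClust's implementation; the abstract does not say which LSH variant or parameters the authors use. All function names (`make_hyperplanes`, `signature`, `build_index`, `query`) and the parameters `n_bits` and `seed` are hypothetical and chosen only for this example. Vectors that point in similar directions tend to receive the same bit signature, so a nearest-neighbor query only needs to compare against the vectors in its own bucket instead of the whole dataset.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals with Gaussian components (illustrative)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def build_index(vectors, hyperplanes):
    """Group vector indices into buckets keyed by their bit signature."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, hyperplanes)].append(idx)
    return buckets


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query(vec, vectors, buckets, hyperplanes, k=1):
    """Approximate k nearest neighbors: rank only the candidates sharing vec's bucket."""
    candidates = buckets.get(signature(vec, hyperplanes), [])
    ranked = sorted(candidates, key=lambda i: cosine(vec, vectors[i]), reverse=True)
    return ranked[:k]
```

Because only one bucket is searched, true neighbors that fall on the other side of a hyperplane are missed. Practical LSH schemes usually reduce this risk by building several independent hash tables and merging their candidate sets, at the cost of more memory.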

Download data

  • Downloaded 1,240 times
  • Download rankings, all-time:
    • Site-wide: 26,423
    • In genomics: 2,255
  • Year to date:
    • Site-wide: 194,493
  • Since beginning of last month:
    • Site-wide: 83,773
