Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 54,968 bioRxiv papers from 253,677 authors.

Alevin efficiently estimates accurate gene abundances from dscRNA-seq data

By Avi Srivastava, Laraib Malik, Tom Sean Smith, Ian Sudbery, Rob Patro

Posted 01 Jun 2018
bioRxiv DOI: 10.1101/335000 (published DOI: 10.1186/s13059-019-1670-y)

We introduce alevin, an efficient pipeline for gene quantification from dscRNA-seq (droplet-based single-cell RNA-seq) data. Alevin is an end-to-end quantification pipeline that starts from sample-demultiplexed FASTQ files and generates gene-level counts for two popular droplet-based sequencing protocols (Drop-seq [1] and 10x Chromium [2]). Importantly, alevin handles all processing internally, avoiding both reliance on external pipeline programs and the need to write large intermediate files to disk. Alevin adopts efficient algorithms for cellular-barcode whitelist generation, cellular-barcode correction, and lightweight per-cell UMI deduplication and quantification. This integrated solution allows alevin to process data much faster (typically ~10 times faster) than other approaches, while also working within a reasonable memory budget. This enables a full, end-to-end analysis of a single-cell human experiment consisting of ~4500 cells and 335 million reads, using 13 GB of RAM and 8 threads (of an Intel Xeon E5-2699 v4 CPU), in 27 minutes.
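To make the stages named in the abstract more concrete, here is a minimal Python sketch of barcode correction followed by per-cell UMI deduplication on toy data. It is a deliberately naive illustration, not alevin's actual algorithms: the one-mismatch correction rule, the exact-match UMI collapsing, and all function and variable names (`correct_barcode`, `count_genes`, `toy_reads`, and so on) are assumptions introduced here.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(barcode, whitelist):
    """Map a barcode to a whitelisted one, or None if absent or ambiguous."""
    if barcode in whitelist:
        return barcode
    # Assumed rule: accept a correction only if exactly one whitelisted
    # barcode lies a single mismatch away.
    near = [w for w in whitelist
            if len(w) == len(barcode) and hamming(w, barcode) == 1]
    return near[0] if len(near) == 1 else None


def count_genes(reads, whitelist):
    """Count distinct UMIs per (cell, gene); repeated UMIs collapse to one."""
    umis = defaultdict(set)
    for barcode, umi, gene in reads:
        cell = correct_barcode(barcode, whitelist)
        if cell is not None:
            umis[(cell, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


# Toy input: (cellular barcode, UMI, gene) triples, all hypothetical.
toy_whitelist = {"AAAA", "CCCC"}
toy_reads = [
    ("AAAA", "GT", "geneA"),
    ("AAAT", "GT", "geneA"),  # one mismatch from AAAA, same UMI: a duplicate
    ("AAAA", "TC", "geneA"),
    ("CCCC", "GT", "geneB"),
    ("GGGG", "GT", "geneB"),  # no whitelisted barcode within one mismatch
]
counts = count_genes(toy_reads, toy_whitelist)
print(counts)
```

In this toy run the read with barcode `AAAT` is reassigned to cell `AAAA` and its repeated UMI is counted once, while the `GGGG` read is discarded because no whitelisted barcode is close enough.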

Download data

  • Downloaded 1,986 times
  • Download rankings, all-time:
    • Site-wide: 1,906 out of 54,968
    • In bioinformatics: 398 out of 5,689
  • Year to date:
    • Site-wide: 1,190 out of 54,968
  • Since beginning of last month:
    • Site-wide: 7,172 out of 54,968

