Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 62,303 bioRxiv papers from 276,577 authors.

Alevin efficiently estimates accurate gene abundances from dscRNA-seq data

By Avi Srivastava, Laraib Malik, Tom Sean Smith, Ian Sudbery, Rob Patro

Posted 01 Jun 2018
bioRxiv DOI: 10.1101/335000 (published DOI: 10.1186/s13059-019-1670-y)

We introduce alevin, an efficient pipeline for gene quantification from dscRNA-seq (droplet-based single-cell RNA-seq) data. Alevin is an end-to-end quantification pipeline that starts from sample-demultiplexed FASTQ files and generates gene-level counts for two popular droplet-based sequencing protocols (Drop-seq [1] and 10x Chromium [2]). Importantly, alevin handles all processing internally, avoiding both reliance on external pipeline programs and the need to write large intermediate files to disk. Alevin adopts efficient algorithms for cellular-barcode whitelist generation, cellular-barcode correction, and lightweight per-cell UMI deduplication and quantification. This integrated solution allows alevin to process data much faster (typically ~10 times faster) than other approaches, while also working within a reasonable memory budget. This enables full, end-to-end analysis of a single-cell human experiment consisting of ~4,500 cells and 335 million reads, using 13 GB of RAM and 8 threads (of an Intel Xeon E5-2699 v4 CPU), in 27 minutes.
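To make the stages named in the abstract concrete, the following is a minimal sketch of cellular-barcode correction against a whitelist followed by per-cell UMI deduplication. It is a naive illustration, not alevin's actual algorithms, which the abstract does not describe in detail. The function names (`correct_barcode`, `count_genes`), the Hamming-distance-1 correction rule, and the assumption that each read has already been assigned to a gene are all hypothetical choices made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple


def correct_barcode(barcode: str, whitelist: Set[str]) -> Optional[str]:
    """Map a barcode to a whitelisted barcode, or return None.

    Assumed rule (not taken from the paper): accept an exact match,
    otherwise accept a whitelist entry at Hamming distance 1 only if
    it is the unique such entry.
    """
    if barcode in whitelist:
        return barcode
    candidates = set()
    for i, original in enumerate(barcode):
        for base in "ACGT":
            if base == original:
                continue
            candidate = barcode[:i] + base + barcode[i + 1:]
            if candidate in whitelist:
                candidates.add(candidate)
    # Ambiguous or unmatched barcodes are discarded.
    return candidates.pop() if len(candidates) == 1 else None


def count_genes(
    reads: Iterable[Tuple[str, str, str]], whitelist: Set[str]
) -> Dict[str, Dict[str, int]]:
    """Count distinct UMIs per (cell, gene).

    Each read is a (barcode, umi, gene) tuple; the gene assignment is
    assumed to be given. Reads sharing a cell, gene, and UMI are
    collapsed into one count (exact-match deduplication only).
    """
    umis_seen = defaultdict(set)  # (cell, gene) -> set of UMIs
    for barcode, umi, gene in reads:
        cell = correct_barcode(barcode, whitelist)
        if cell is None:
            continue
        umis_seen[(cell, gene)].add(umi)

    counts: Dict[str, Dict[str, int]] = defaultdict(dict)
    for (cell, gene), umis in umis_seen.items():
        counts[cell][gene] = len(umis)
    return dict(counts)
```

In this sketch a read with barcode `AAAT` is credited to cell `AAAA` when `AAAA` is the only whitelisted barcode one mismatch away, and two reads from the same cell and gene that carry the same UMI count once. A production tool would also need to handle sequencing errors within UMIs and reads that map to several genes, which this example deliberately ignores.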

Download data

  • Downloaded 2,187 times
  • Download rankings, all-time:
    • Site-wide: 1,877 out of 62,303
    • In bioinformatics: 384 out of 6,225
  • Year to date:
    • Site-wide: 1,536 out of 62,303
  • Since beginning of last month:
    • Site-wide: 4,742 out of 62,303
