Polyester: simulating RNA-seq datasets with differential transcript expression

By Alyssa C. Frazee, Andrew E. Jaffe, Ben Langmead, Jeffrey T. Leek

Posted 06 Jun 2014
bioRxiv DOI: 10.1101/006015 (published DOI: 10.1093/bioinformatics/btv272)

Motivation: Developing statistical methods for differential expression analysis of RNA sequencing (RNA-seq) data requires software tools to assess accuracy and error rate control. Since true differential expression status is often unknown in experimental datasets, artificially constructed datasets must be used, either by generating costly spike-in experiments or by simulating RNA-seq data.

Results: Polyester is an R package designed to simulate RNA-seq data, beginning with an experimental design and ending with collections of RNA-seq reads. Its main advantage is the ability to simulate reads indicating isoform-level differential expression across biological replicates for a variety of experimental designs. Data generated by Polyester are a reasonable approximation to real RNA-seq data, and standard differential expression workflows can recover the differential expression that the user sets in the simulation.

Availability and Implementation: Polyester is freely available from Bioconductor (http://bioconductor.org/).
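
To make the Results summary concrete, the following is a minimal Python sketch of one step in a simulation of this kind: drawing per-replicate read counts for two groups, where the user sets a fold change for each transcript. It is not Polyester's code or API (Polyester is an R package). The negative binomial count model, the function names `sample_negative_binomial` and `simulate_counts`, and the parameters `baseline_means`, `fold_changes`, `reps_per_group`, and `size` are all assumptions made for illustration; the abstract does not specify them. The sketch stops at counts and does not generate read sequences.

```python
import math
import random


def sample_negative_binomial(rng, mean, size):
    """Draw one count with the given mean and integer dispersion `size`.

    Assumed model (not stated in the abstract): a negative binomial,
    sampled here as the sum of `size` geometric variates.
    """
    if mean <= 0:
        return 0
    # Probability of a "failure" on each trial; the expected count is `mean`.
    log_q = math.log(mean / (size + mean))
    total = 0
    for _ in range(size):
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is defined
        total += int(math.log(u) / log_q)  # failures before one success
    return total


def simulate_counts(baseline_means, fold_changes, reps_per_group=3, size=5, seed=0):
    """Return one row of counts per transcript: group 1 replicates, then group 2.

    `fold_changes[i]` multiplies the mean of transcript i in group 2 only,
    so a value of 1.0 means "not differentially expressed".
    """
    rng = random.Random(seed)
    counts = []
    for mean, fold_change in zip(baseline_means, fold_changes):
        row = []
        for group in (0, 1):
            group_mean = mean * fold_change if group == 1 else mean
            for _ in range(reps_per_group):
                row.append(sample_negative_binomial(rng, group_mean, size))
        counts.append(row)
    return counts


if __name__ == "__main__":
    # Hypothetical design: two transcripts, the second up-regulated 4-fold.
    for row in simulate_counts([100.0, 100.0], [1.0, 4.0]):
        print(row)
```

Because the fold changes are chosen by the user, the true differential expression status of every transcript is known, which is what allows a downstream workflow's accuracy and error rate to be checked against the simulation.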

Download data

  • Downloaded 2,279 times
  • Download rankings, all-time:
    • Site-wide: 3,131 out of 94,912
    • In bioinformatics: 555 out of 8,837
  • Year to date:
    • Site-wide: 33,111 out of 94,912
  • Since beginning of last month:
    • Site-wide: 19,117 out of 94,912

