Background: We developed super-delta2, a differential gene expression analysis pipeline designed for multi-group comparisons for RNA-seq data. It includes a customized one-way ANOVA F-test and a post-hoc test for pairwise group comparisons; both are designed to work with a multivariate normalization procedure to reduce technical noise. It also includes a trimming procedure with bias-correction to obtain robust and approximately unbiased summary statistics used in these tests. We demonstrated the asymptotic applicability of super-delta2 to log-transformed read counts in RNA-seq data by large sample theory based on Negative Binomial Poisson (NBP) distribution. Results: We compared super-delta2 with three commonly used RNA-seq data analysis methods: limma/voom, edgeR, and DESeq2 using both simulated and real datasets. In all three simulation settings, super-delta2 not only achieved the best overall statistical power, but also was the only method that controlled type I error at the nominal level. When applied to a breast cancer dataset to identify differential expression pattern associated with multiple pathologic stages, super-delta2 selected more enriched pathways than other methods, which are directly linked to the underlying biological condition (breast cancer). Conclusions: By incorporating trimming and bias-correction in the normalization step, super-delta2 was able to achieve tight control of type I error. Because the hypothesis tests are based on asymptotic normal approximation of the NBP distribution, super-delta2 does not require computationally expensive iterative optimization procedures used by methods such as edgeR and DESeq2, which occasionally have convergence issues.
- Downloaded 145 times
- Download rankings, all-time:
- Site-wide: 146,156
- In bioinformatics: 11,203
- Year to date:
- Site-wide: 63,483
- Since beginning of last month:
- Site-wide: 31,315
Downloads over time
Distribution of downloads per paper, site-wide
- 27 Nov 2020: The website and API now include results pulled from medRxiv as well as bioRxiv.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!