Super-delta2: An Enhanced Differential Expression Analysis Procedure for Multi-Group Comparisons of RNA-seq Data

By Zihan Cui, Yuhang Liu, Jinfeng Zhang, Xing Qiu

Posted 01 Feb 2021
bioRxiv DOI: 10.1101/2021.01.30.428977

Background: We developed super-delta2, a differential gene expression analysis pipeline designed for multi-group comparisons of RNA-seq data. It includes a customized one-way ANOVA F-test and a post-hoc test for pairwise group comparisons; both are designed to work with a multivariate normalization procedure to reduce technical noise. It also includes a trimming procedure with bias-correction to obtain robust and approximately unbiased summary statistics for use in these tests. We demonstrated the asymptotic applicability of super-delta2 to log-transformed read counts in RNA-seq data using large-sample theory based on the Negative Binomial Poisson (NBP) distribution.

Results: We compared super-delta2 with three commonly used RNA-seq data analysis methods (limma/voom, edgeR, and DESeq2) using both simulated and real datasets. In all three simulation settings, super-delta2 not only achieved the best overall statistical power but was also the only method that controlled type I error at the nominal level. When applied to a breast cancer dataset to identify differential expression patterns associated with multiple pathologic stages, super-delta2 selected more enriched pathways than the other methods, and these pathways are directly linked to the underlying biological condition (breast cancer).

Conclusions: By incorporating trimming and bias-correction in the normalization step, super-delta2 was able to achieve tight control of type I error. Because the hypothesis tests are based on an asymptotic normal approximation of the NBP distribution, super-delta2 does not require the computationally expensive iterative optimization procedures used by methods such as edgeR and DESeq2, which occasionally have convergence issues.
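To make the building blocks named in the abstract concrete, the following is a minimal Python sketch of three generic ingredients: a log transform of read counts, a textbook one-way ANOVA F statistic for a single gene across several groups, and a symmetric trimmed mean. It is an illustration under stated assumptions, not the super-delta2 implementation: the customized F-test, the multivariate normalization, and the bias-correction described by the authors are not reproduced here, and the function names, the pseudocount of 1, and the trimming fraction are all hypothetical choices made for this example.

```python
import math


def log_transform(counts, pseudocount=1.0):
    """Log2-transform raw read counts.

    The pseudocount (an assumption of this sketch) avoids log2(0).
    """
    return [math.log2(c + pseudocount) for c in counts]


def one_way_anova_f(groups):
    """Textbook one-way ANOVA F statistic for one gene.

    `groups` is a list of lists, one list of (log-transformed)
    expression values per group. This is the standard F-test,
    not the customized version used by super-delta2.
    Raises ZeroDivisionError if all within-group variances are zero.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


def trimmed_mean(values, trim_frac=0.2):
    """Symmetric trimmed mean: drop the lowest and highest
    `trim_frac` fraction of values, then average the rest.

    No bias-correction is applied in this sketch.
    """
    ordered = sorted(values)
    cut = int(len(ordered) * trim_frac)
    kept = ordered[cut:len(ordered) - cut] if cut > 0 else ordered
    return sum(kept) / len(kept)
```

For example, `one_way_anova_f` could be applied to the log-transformed counts of one gene split by pathologic stage, and `trimmed_mean` could summarize a set of statistics while discounting extreme values. How super-delta2 actually combines these steps is described in the paper itself.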

Download data

  • Downloaded 145 times
  • Download rankings, all-time:
    • Site-wide: 146,156
    • In bioinformatics: 11,203
  • Year to date:
    • Site-wide: 63,483
  • Since beginning of last month:
    • Site-wide: 31,315
