Limitation of alignment-free tools in total RNA-seq quantification

By Douglas C Wu, Jun Yao, Kevin S Ho, Alan M Lambowitz, Claus O Wilke

Posted 11 Jan 2018
bioRxiv DOI: 10.1101/246967 (published DOI: 10.1186/s12864-018-4869-5)

Background: Alignment-free RNA quantification tools have significantly increased the speed of RNA-seq analysis. However, it is unclear whether these state-of-the-art RNA-seq analysis pipelines can quantify small RNAs as accurately as they quantify long RNAs in the context of total RNA quantification.

Results: We comprehensively tested and compared four RNA-seq pipelines on the accuracy of gene quantification and fold-change estimation using a novel total RNA benchmarking dataset, in which small non-coding RNAs are highly represented along with other long RNAs. The four pipelines comprised two commonly used alignment-free pipelines and two variants of alignment-based pipelines. We found that all pipelines showed high accuracy in quantifying the expression of long and highly abundant genes. However, alignment-free pipelines showed systematically poorer performance in quantifying low-abundance and small RNAs.

Conclusion: We have shown that alignment-free and traditional alignment-based quantification methods perform similarly for common gene targets, such as protein-coding genes. However, we identified a potential pitfall in analyzing and quantifying lowly expressed genes and small RNAs with alignment-free pipelines, especially when these small RNAs contain mutations.

Download data

  • Downloaded 2,996 times
  • Download rankings, all-time:
    • Site-wide: 5,755
    • In bioinformatics: 546
  • Year to date:
    • Site-wide: None
  • Since beginning of last month:
    • Site-wide: 60,718
