Motivation: One of the main benefits of modern RNA-sequencing (RNA-seq) technology is more accurate gene expression estimation. However, numerous issues can cause an RNA-seq read to map to multiple locations on the reference genome with the same alignment score, which occurs in plant, animal, and metagenome samples. Such a read is called a multiple mapping read (MMR). MMRs affect gene expression estimation and all downstream analyses, including differential gene expression and functional enrichment. Current analysis pipelines lack tools to test the reliability of gene expression estimates and therefore cannot ensure the validity of downstream analyses.

Results: Our investigation of 95 RNA-seq datasets from seven species (totaling 1,951 GB) indicates that, on average, roughly 22% of all reads are MMRs for plant and animal species. Here we present GeneQC (Gene expression Quality Control), a tool that can accurately estimate the reliability of each gene's expression level. The underlying algorithm is built on extracted genomic and transcriptomic features through mathematical and statistical modeling. Using big data-driven mathematical modeling, GeneQC allows researchers to identify reliable expression estimates and to conduct further analysis on the genes whose expression estimates are of sufficient quality. The tool also enables researchers to pursue further analysis to obtain more accurate expression estimates for genes with low reliability.
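To make the MMR definition concrete, the following is a minimal sketch of how the share of multiple mapping reads could be counted from a list of alignment records. It is not GeneQC's algorithm. The record layout `(read_id, location, score)`, the function names, and the example data are all assumptions made for illustration.

```python
from collections import defaultdict


def find_multi_mapped_reads(alignments):
    """Return IDs of reads whose best alignment score occurs at 2+ locations.

    `alignments` is an iterable of (read_id, location, score) tuples.
    This record layout is hypothetical, not a format used by GeneQC.
    """
    hits_by_read = defaultdict(list)
    for read_id, location, score in alignments:
        hits_by_read[read_id].append((location, score))

    multi_mapped = set()
    for read_id, hits in hits_by_read.items():
        best_score = max(score for _, score in hits)
        # Distinct locations that tie for the best score.
        top_locations = {loc for loc, score in hits if score == best_score}
        if len(top_locations) > 1:
            multi_mapped.add(read_id)
    return multi_mapped


def mmr_fraction(alignments):
    """Fraction of distinct reads that are multiple mapping reads."""
    alignments = list(alignments)
    reads = {read_id for read_id, _, _ in alignments}
    if not reads:
        return 0.0
    return len(find_multi_mapped_reads(alignments)) / len(reads)


# Made-up example: r2 ties at two locations; r3 has a clear best hit.
example_alignments = [
    ("r1", "chr1:100", 60),
    ("r2", "chr1:500", 60),
    ("r2", "chr3:900", 60),
    ("r3", "chr2:40", 60),
    ("r3", "chr5:77", 42),
    ("r4", "chr4:10", 55),
]

example_mmrs = find_multi_mapped_reads(example_alignments)
example_fraction = mmr_fraction(example_alignments)
```

In this toy input only `r2` counts as an MMR, because `r3` has a second hit with a strictly lower score. A real pipeline would read these records from an aligner's output rather than from a hand-written list.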