Nearly all new protein-coding predictions in the CHESS database are not protein-coding
Michael L. Tress,
The GENCODE Consortium
Posted 02 Jul 2018
bioRxiv DOI: 10.1101/360602
In a 2018 paper posted to bioRxiv, Pertea et al. presented the CHESS database, a new catalog of human gene annotations that includes 1,178 new protein-coding predictions. These predictions are based on evidence of transcription in human tissues and on homology to earlier annotations in human and other mammals. Here, we reanalyze the evidence used by CHESS and find that nearly all of the new protein-coding predictions are false positives. We find that 86% overlap transposons marked by RepeatMasker that are known to frequently result in false-positive protein-coding predictions. More than half are homologous to only nine Alu-derived primate sequences corresponding to an erroneous and previously withdrawn Pfam protein domain. The entire set shows poor evolutionary conservation, and its PhyloCSF protein-coding evolutionary signatures are indistinguishable from those of noncoding RNAs, indicating a lack of protein-coding constraint. Only four predictions are supported by mass spectrometry evidence, and even those matches are inconclusive. Overall, the new protein-coding predictions are unsupported by any credible experimental or evolutionary evidence of function, result primarily from homology to genes incorrectly classified as protein-coding, and are unlikely to encode functional proteins.