Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 57,789 bioRxiv papers from 265,997 authors.
Computational proteogenomic identification and functional interpretation of translated fusions and micro structural variations in cancer
Yen Yi Lin,
S. Cenk Sahinalp
Posted 25 Jul 2017
bioRxiv DOI: 10.1101/168377 (published DOI: 10.1093/bioinformatics/btx807)
Posted 25 Jul 2017
Rapid advancement in high throughput genome and transcriptome sequencing (HTS) and mass spectrometry (MS) technologies has enabled the acquisition of the genomic, transcriptomic and proteomic data from the same tissue sample.In this paper we introduce a novel computational framework which can integratively analyze all three types of omics data to obtain a complete molecular profile of a tissue sample, in normal and disease conditions.Our framework includes MiStrVar, an algorithmic method we developed to identify micro structural variants (microSVs) on genomic HTS data. Coupled with deFuse, a popular gene fusion detection method we developed earlier, MiStrVar can provide an accurate profile of structurally aberrant transcripts in cancer samples.Given the breakpoints obtained by MiStrVar and deFuse, our framework can then identify all relevant peptides that span the breakpoint junctions and match them with unique proteomic signatures in the respective proteomics data sets. Our framework's ability to observe structural aberrations at three levels of omics data provides means of validating their presence. We have applied our framework to all The Cancer Genome Atlas (TCGA) breast cancer Whole Genome Sequencing (WGS) and/or RNA-Seq data sets, spanning all four major subtypes, for which proteomics data from Clinical Proteomic Tumor Analysis Consortium (CPTAC) have been released.A recent study on this dataset focusing on SNVs has reported many that lead to novel peptides.Complementing and significantly broadening this study, we detected 244 novel peptides from 432 candidate genomic or transcriptomic sequence aberrations. Many of the fusions and microSVs we discovered have not been reported in the literature.Interestingly, the vast majority of these translated aberrations (in particular, fusions) were private, demonstrating the extensive inter-genomic heterogeneity present in breast cancer.Many of these aberrations also have matching out-of-frame downstream peptides, potentially indicating novel protein sequence and structure. Moreover, the most significantly enriched genes involved in translated fusions are cancer-related.Furthermore a number of the somatic, translated microSVs are observed in tumor suppressor genes.
- Downloaded 372 times
- Download rankings, all-time:
- Site-wide: 24,504 out of 57,789
- In bioinformatics: 3,400 out of 5,897
- Year to date:
- Site-wide: 33,815 out of 57,789
- Since beginning of last month:
- Site-wide: 24,917 out of 57,789
Downloads over time
Distribution of downloads per paper, site-wide
- Top preprints of 2018
- Paper search
- Author leaderboards
- Overall metrics
- The API
- Email newsletter
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!