Detection and benchmarking of somatic mutations in cancer genomes using RNA-seq data

By Alexandre Coudray, Anna M Battenhouse, Philipp Bucher, Vishwanath R. Iyer

Posted 17 Jan 2018
bioRxiv DOI: 10.1101/249219 (published DOI: 10.7717/peerj.5362)

To detect functional somatic mutations in tumor samples, whole-exome sequencing (WES) is often used for its reliability and relatively low cost. RNA-seq, while generally used to measure gene expression, can potentially also be used to identify somatic mutations. However, there has been little systematic evaluation of the utility of RNA-seq for this purpose. Here, we develop and evaluate a pipeline for processing RNA-seq data from glioblastoma multiforme (GBM) tumors to identify somatic mutations. The pipeline uses the 2-pass procedure of the STAR aligner together with MuTect2 from GATK to detect somatic variants. Variants identified from RNA-seq data were evaluated by comparison against the COSMIC and dbSNP databases, and were also compared to somatic variants identified by exome sequencing. We also estimated the putative functional impact of coding variants in the most frequently mutated genes in GBM. Interestingly, variants identified by RNA-seq alone showed better representation of GBM-related mutations cataloged by COSMIC. RNA-seq-only data substantially outperformed WES at revealing potentially new somatic mutations in known GBM-related pathways, and allowed us to build a high-quality set of somatic mutations common to exome and RNA-seq calls. Using RNA-seq data in parallel with WES data to detect somatic mutations in cancer genomes can thus broaden the scope of discoveries and lend additional support to somatic variants identified by exome sequencing alone.
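
The comparison step described above (RNA-seq calls versus exome calls, and calls versus a catalog such as COSMIC or dbSNP) can be illustrated with a minimal Python sketch. This is not the authors' code: the names `Variant`, `parse_vcf_lines`, `partition_calls`, and `fraction_in_catalog` are hypothetical, and the sketch assumes variants are matched exactly on chromosome, position, reference allele, and alternate allele, with no normalization or filtering.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical variant key: exact match on site and alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect variant keys from tab-separated VCF-style lines.

    Header lines (starting with '#') and blank lines are skipped.
    Multi-allelic records are split into one key per ALT allele.
    """
    variants: Set[Variant] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alts = fields[:5]
        for alt in alts.split(","):
            variants.add(Variant(chrom, int(pos), ref, alt))
    return variants


def partition_calls(rna: Set[Variant], wes: Set[Variant]) -> Dict[str, Set[Variant]]:
    """Split calls into those common to both assays and those unique to each."""
    return {
        "common": rna & wes,
        "rna_only": rna - wes,
        "wes_only": wes - rna,
    }


def fraction_in_catalog(calls: Set[Variant], catalog: Set[Variant]) -> float:
    """Fraction of calls that also appear in a reference catalog."""
    if not calls:
        return 0.0
    return len(calls & catalog) / len(calls)
```

Under these assumptions, the `common` set corresponds to variants supported by both assays, and `fraction_in_catalog` gives a rough measure of how well a call set is represented in a catalog. A real comparison would also need to normalize indel representation and restrict to regions covered by both assays.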

Download data

  • Downloaded 3,095 times
  • Download rankings, all-time:
    • Site-wide: 5,211
    • In cancer biology: 75
  • Year to date:
    • Site-wide: 38,444
  • Since beginning of last month:
    • Site-wide: 113,513
