Germline testing data validate inferences of mutational status for variants detected from tumor-only sequencing
Samantha M Stokes,
Bruce E. Johnson,
Judy E Garber,
Posted 15 Apr 2021
bioRxiv DOI: 10.1101/2021.04.14.439855
Background: Pathogenic germline variants (PGV) in cancer susceptibility genes are usually identified in cancer patients through germline testing of DNA from blood or saliva; their detection can affect patient treatment options and potential risk-reduction strategies for relatives. PGV can also be identified in tumor sequencing assays, which are often performed without matched normal specimens. It is then critical to determine whether detected variants are somatic or germline. Here, we evaluate the clinical utility of computational inference of mutational status in tumor-only sequencing compared to germline testing results.
Patients and Methods: Tumor-only sequencing data from 1,608 patients were retrospectively analyzed to infer germline-versus-somatic status of variants using an information-theoretic, gene-independent approach. Loss of heterozygosity (LOH) was also determined. The predicted mutational models were compared to clinical germline testing results. Statistical measures were computed to evaluate performance.
Results: Tumor-only sequencing detected 3,988 variants across 70 cancer susceptibility genes for which germline testing data were available. Our analysis imputed germline-versus-somatic status for >75% of all detected variants, with a sensitivity of 65%, specificity of 88%, and overall accuracy of 86% for pathogenic variants. The false omission rate was 3%, signifying minimal error in misclassifying true PGV. A higher proportion of PGV in known hereditary tumor suppressors was found to be retained with LOH in the tumor specimens (72%) compared to variants of uncertain significance (58%).
Conclusions: Tumor-only sequencing provides sufficient power to distinguish germline and somatic variants and infer LOH. Although accurate detection of PGV from tumor-only data is possible, analyzing sequencing data in the context of specimens' tumor cell content allows systematic exclusion of somatic variants, and suggests a balance between type I and type II errors for identification of patients with candidate PGV for standard germline testing. Our approach, implemented in a user-friendly bioinformatics application, facilitates objective analysis of tumor-only data in clinical settings.
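The performance measures named in the abstract (sensitivity, specificity, accuracy, and false omission rate) all follow from a 2x2 comparison of predicted status against germline testing results. The sketch below shows the standard definitions only; it is not the authors' implementation. The `ConfusionCounts` class, the `performance` function, and the counts passed to it are hypothetical and chosen for illustration, not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical 2x2 counts: predicted germline status vs. germline test result."""
    tp: int  # predicted germline, confirmed germline
    fp: int  # predicted germline, not confirmed
    tn: int  # predicted not germline, not confirmed
    fn: int  # predicted not germline, but confirmed germline


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a measure is undefined.
    return numerator / denominator if denominator else float("nan")


def performance(c: ConfusionCounts) -> dict[str, float]:
    """Standard definitions of the four measures named in the abstract."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "accuracy": _ratio(c.tp + c.tn, total),
        # Share of negative calls that are actually germline.
        "false_omission_rate": _ratio(c.fn, c.fn + c.tn),
    }


# Toy counts for illustration only; these are NOT the study's data.
toy = ConfusionCounts(tp=30, fp=10, tn=90, fn=10)
metrics = performance(toy)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the false omission rate is conditioned on negative calls, so it can be low even when sensitivity is moderate, provided true germline variants are rare among the variants tested.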