Artificial intelligence enables comprehensive genome interpretation and nomination of candidate diagnoses for rare genetic diseases
Francisco M. De La Vega,
Edgar Javier Hernandez,
Pankaj B. Agrawal,
Casie A Genetti,
Catherine A Brownstein,
Alan H Beggs,
Shawn E. Levy,
Martin G. Reese,
Stephen F. Kingsmore
Posted 15 Feb 2021
medRxiv DOI: 10.1101/2021.02.09.21251456
Background: Clinical interpretation of genetic variants in the context of the patient's phenotype is becoming the largest component of cost and time expenditure for genome-based diagnosis of rare genetic diseases. Artificial intelligence (AI) holds promise to greatly simplify and speed interpretation by comprehensively evaluating genetic variants for pathogenicity in the context of the growing knowledge of genetic disease. We assess the diagnostic performance of GEM, a new AI-based clinical decision support tool, compared with expert manual interpretation.
Methods: We benchmarked GEM in a retrospective cohort of 119 probands, mostly NICU infants, diagnosed with rare genetic diseases, who received whole genome sequencing (WGS) at Rady Children's Hospital. We also performed a replication study in a separate cohort of 60 cases diagnosed at five additional academic medical centers. For comparison, we analyzed these cases with commonly used variant prioritization tools (Phevor, Exomiser, and VAAST). The comparisons included WGS and whole exome sequencing (WES) as trios, duos, and singletons. Variants underpinning diagnoses spanned diverse modes of inheritance and types, including structural variants (SVs). Patient phenotypes were extracted either manually or by automated clinical natural language processing (CNLP) from clinical notes. Finally, 14 previously unsolved cases were re-analyzed.
Results: GEM ranked >90% of causal genes as the top or second candidate, using manually curated or CNLP-derived phenotypes, and prioritized a median of 3 genes for review per case. Ranking of trios and duos was unchanged when analyzed as singletons. In 17 of 20 cases with diagnostic SVs, GEM identified the causal SVs as the top or second candidate, irrespective of whether SV calls were provided or inferred ab initio by GEM when absent. Analysis of 14 previously unsolved cases provided novel findings in one, candidates ultimately not advanced in 3, and no new findings in 10, demonstrating the utility of GEM for reanalysis.
Conclusions: GEM enables automated diagnostic interpretation of WES and WGS for all types of variants, including SVs, nominating a very short list of candidate genes and disorders for final review and reporting. In combination with deep phenotyping by CNLP, GEM enables substantial automation of genetic disease diagnosis, potentially decreasing the cost and speeding case review.
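The headline result above is a top-k ranking rate: the share of cases in which the causal gene appears at or above a given position in the candidate list. The sketch below shows how such a rate could be computed from per-case ranks. It is only an illustration of the metric, not the authors' evaluation code; the function name `top_k_rate` and the ranks in `toy_ranks` are made up for this example and are not data from the study.

```python
def top_k_rate(causal_ranks, k):
    """Fraction of cases whose causal gene is ranked at position k or better.

    causal_ranks: one entry per case; 1 means top candidate, None means the
    causal gene did not appear in the candidate list at all.
    """
    if not causal_ranks:
        raise ValueError("need at least one case")
    hits = sum(1 for rank in causal_ranks if rank is not None and rank <= k)
    return hits / len(causal_ranks)


# Hypothetical ranks for ten imaginary cases (NOT results from the paper).
toy_ranks = [1, 1, 2, 1, 5, 1, 2, 1, None, 1]

print(top_k_rate(toy_ranks, 1))  # share of cases with the causal gene ranked first
print(top_k_rate(toy_ranks, 2))  # share ranked first or second
```

Counting a missing causal gene (`None`) as a miss rather than dropping the case keeps the denominator equal to the full cohort, which is the more conservative way to report such a rate.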