Identification of misclassified ClinVar variants using disease population prevalence

By Naisha Shah, Ying-Chen Claire Hou, Hung-Chun Yu, Rachana Sainger, Eric Dec, Brad Perkins, C. Thomas Caskey, J. Craig Venter, Amalio Telenti

Posted 15 Sep 2016
bioRxiv DOI: 10.1101/075416 (published DOI: 10.1016/j.ajhg.2018.02.019)

There is significant interest in the standardized classification of human genetic variants. The availability of new large datasets generated through genome sequencing initiatives provides a basis for the computational evaluation of the supporting evidence. We used whole genome sequence data from 8,102 unrelated individuals to assess whether the rates of disease estimated from genetic risk are consistent with the expected population prevalence of each disease. Analyses included the 56 gene-condition sets recommended by the ACMG for incidental findings and 631 genes associated with 348 OrphaNet conditions. A total of 21,004 variants were used to identify patterns of inflation (i.e., excess genetic risk). Inflation, which signals misclassification, increases as the level of evidence in ClinVar supporting the pathogenic nature of the variant decreases. The burden of rare variants was a main contributing factor to the observed inflation, indicating that benign private mutations have been misclassified as pathogenic. We also analyzed the dynamics of reclassification of variant pathogenicity in ClinVar over time. The study strongly suggests that ClinVar includes a significant proportion of wrongly ascertained variants, and it underscores the critical role of ClinVar in contrasting claims and fostering validation across submitters.
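The comparison at the heart of the abstract, genetic risk in a cohort versus known disease prevalence, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy cohort, the prevalence value, and the simplifying assumption of a fully penetrant dominant condition (every carrier counts as affected) are all hypothetical and chosen only to show the arithmetic.

```python
def carrier_rate(flagged_variants_per_person):
    """Fraction of individuals carrying at least one variant flagged as pathogenic.

    flagged_variants_per_person: list with one set of variant IDs per individual
    (hypothetical input format; an empty set means no flagged variant).
    """
    if not flagged_variants_per_person:
        raise ValueError("cohort is empty")
    carriers = sum(1 for variants in flagged_variants_per_person if variants)
    return carriers / len(flagged_variants_per_person)


def inflation_ratio(genetic_rate, prevalence):
    """Ratio of the genetically implied disease rate to the known prevalence.

    A ratio well above 1 means more people carry 'pathogenic' variants than
    the disease prevalence can explain, hinting at misclassified variants.
    """
    if prevalence <= 0:
        raise ValueError("prevalence must be positive")
    return genetic_rate / prevalence


# Toy cohort of 8 people; two carry a flagged variant (made-up IDs).
cohort = [set(), {"var_A"}, set(), set(), {"var_B"}, set(), set(), set()]
rate = carrier_rate(cohort)            # 2 carriers out of 8
ratio = inflation_ratio(rate, 0.01)    # assumed prevalence of 1 in 100
print(f"carrier rate = {rate:.3f}, inflation ratio = {ratio:.1f}")
```

In this toy case the implied disease rate is far higher than the assumed prevalence, which is the kind of excess the abstract calls inflation. A real analysis would also need to account for inheritance mode, penetrance, and allele frequency, none of which this sketch models.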

Download data

  • Downloaded 1,793 times
  • Download rankings, all-time:
    • Site-wide: 12,282
    • In genomics: 1,193
  • Year to date:
    • Site-wide: 61,955
  • Since beginning of last month:
    • Site-wide: 47,659
