Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 65,016 bioRxiv papers from 288,163 authors.

Assessing the pathogenicity, penetrance and expressivity of putative disease-causing variants in a population setting

By Caroline F Wright, Ben West, Marcus Tuke, Samuel E Jones, Kashyap Patel, Thomas W Laver, R. N. Beaumont, Jessica Tyrrell, Andrew R Wood, Timothy M Frayling, Andrew T Hattersley, Michael N Weedon

Posted 04 Sep 2018
bioRxiv DOI: 10.1101/407981 (published DOI: 10.1016/j.ajhg.2018.12.015)

Over 100,000 genetic variants are classified as disease-causing in public databases. However, the true penetrance of many of these rare alleles is uncertain and may be over-estimated by clinical ascertainment. As more people undergo genome sequencing there is an increasing need to assess the true penetrance of alleles. Until recently, this was not possible in a population-based setting. Here, we use data from 388,714 UK Biobank (UKB) participants of European ancestry to assess the pathogenicity and penetrance of putatively clinically important rare variants. Although rare variants are harder to genotype accurately than common variants, we were able to classify 1,244 of 4,585 (27%) putatively clinically relevant rare variants genotyped on the UKB microarray as high-quality. We defined 'rare' as variants with a minor allele frequency of <0.01, and 'clinically relevant' as variants that were either classified as pathogenic/likely pathogenic in ClinVar or are in genes known to cause two specific monogenic diseases in which we have some expertise: Maturity-Onset Diabetes of the Young (MODY) and severe developmental disorders (DD). We assessed the penetrance and pathogenicity of these high-quality variants by testing their association with 401 clinically-relevant traits available in UKB. We identified 27 putatively clinically relevant rare variants associated with a UKB trait but that exhibited reduced penetrance or variable expressivity compared with their associated disease. For example, the P415A PER3 variant that has been reported to cause familial advanced sleep phase syndrome is present at 0.5% frequency in the population and associated with an odds ratio of 1.38 for being a morning person (P=2×10⁻¹⁸). We also observed novel associations with relevant traits for heterozygous carriers of some rare recessive conditions, e.g. heterozygous carriers of the R799W ERCC4 variant that causes Xeroderma pigmentosum were more susceptible to sunburn (one extra sunburn episode reported, P=2×10⁻⁸). Within our two disease subsets, we were able to refine the penetrance estimate for the R114W HNF4A variant in diabetes (only ~10% by age 40yrs) and refute the previous disease-association of RNF135 in developmental disorders. In conclusion, this study shows that very large population-based studies will help refine the penetrance estimates of rare variants. This information will be important for anyone receiving information about their health based on putatively pathogenic variants.
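
The abstract reports effect sizes as odds ratios (for example, 1.38 for being a morning person among PER3 P415A carriers). As a rough illustration of what such a figure summarises, the sketch below computes an unadjusted odds ratio and a 95% Wald confidence interval from a 2×2 carrier-by-trait table. It is not the authors' analysis pipeline, which the abstract does not describe. The function name `odds_ratio` and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def odds_ratio(carrier_with_trait, carrier_without_trait,
               noncarrier_with_trait, noncarrier_without_trait):
    """Unadjusted odds ratio and 95% Wald (Woolf) confidence interval.

    Illustrative helper only; a real association test would normally
    adjust for covariates such as age, sex and ancestry.
    """
    a, b = carrier_with_trait, carrier_without_trait
    c, d = noncarrier_with_trait, noncarrier_without_trait
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")

    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
estimate, lower, upper = odds_ratio(600, 400, 5000, 5000)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")

# The one ratio that can be checked directly against the abstract:
# 1,244 of 4,585 variants were classified as high-quality.
print(f"high-quality fraction = {1244 / 4585:.0%}")
```

An odds ratio above 1 means the trait is more common among carriers than non-carriers. Penetrance is a different quantity: the proportion of carriers who show the phenotype at all.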

Download data

  • Downloaded 1,378 times
  • Download rankings, all-time:
    • Site-wide: 4,383 out of 65,016
    • In genetics: 367 out of 3,688
  • Year to date:
    • Site-wide: 3,899 out of 65,016
  • Since beginning of last month:
    • Site-wide: 8,360 out of 65,016

Downloads over time

Distribution of downloads per paper, site-wide

