Assessing the pathogenicity, penetrance and expressivity of putative disease-causing variants in a population setting
Caroline F. Wright,
Samuel E. Jones,
Thomas W Laver,
R. N. Beaumont,
Andrew R. Wood,
Timothy M. Frayling,
Andrew T Hattersley
Posted 04 Sep 2018
bioRxiv DOI: 10.1101/407981 (published DOI: 10.1016/j.ajhg.2018.12.015)
Over 100,000 genetic variants are classified as disease-causing in public databases. However, the true penetrance of many of these rare alleles is uncertain and may be over-estimated by clinical ascertainment. As more people undergo genome sequencing there is an increasing need to assess the true penetrance of alleles. Until recently, this was not possible in a population-based setting. Here, we use data from 388,714 UK Biobank (UKB) participants of European ancestry to assess the pathogenicity and penetrance of putatively clinically important rare variants. Although rare variants are harder to genotype accurately than common variants, we were able to classify 1,244 of 4,585 (27%) putatively clinically relevant rare variants genotyped on the UKB microarray as high-quality. We defined 'rare' as variants with a minor allele frequency of <0.01, and 'clinically relevant' as variants that were either classified as pathogenic/likely pathogenic in ClinVar or are in genes known to cause two specific monogenic diseases in which we have some expertise: Maturity-Onset Diabetes of the Young (MODY) and severe developmental disorders (DD). We assessed the penetrance and pathogenicity of these high-quality variants by testing their association with 401 clinically relevant traits available in UKB. We identified 27 putatively clinically relevant rare variants that were associated with a UKB trait but exhibited reduced penetrance or variable expressivity compared with their associated disease. For example, the P415A PER3 variant, which has been reported to cause familial advanced sleep phase syndrome, is present at 0.5% frequency in the population and associated with an odds ratio of 1.38 for being a morning person (P=2×10^-18). We also observed novel associations with relevant traits for heterozygous carriers of some rare recessive conditions, e.g. heterozygous carriers of the R799W ERCC4 variant that causes Xeroderma pigmentosum were more susceptible to sunburn (one extra sunburn episode reported, P=2×10^-8). Within our two disease subsets, we were able to refine the penetrance estimate for the R114W HNF4A variant in diabetes (only ~10% by age 40 years) and refute the previous disease association of RNF135 in developmental disorders. In conclusion, this study shows that very large population-based studies will help refine the penetrance estimates of rare variants. This information will be important for anyone receiving information about their health based on putatively pathogenic variants.
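The quantities named in the abstract (minor allele frequency, penetrance, odds ratio) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, the odds ratio is computed from a plain 2x2 table with a Wald confidence interval rather than from whatever association model the study used, and every count in the usage lines is invented for illustration. Only the rarity threshold of 0.01 comes from the abstract.

```python
from math import exp, log, sqrt

RARE_MAF_THRESHOLD = 0.01  # 'rare' is defined in the abstract as MAF < 0.01


def minor_allele_frequency(n_het, n_hom_alt, n_total):
    """Frequency of the less common allele among 2 * n_total chromosomes."""
    alt_freq = (n_het + 2 * n_hom_alt) / (2 * n_total)
    return min(alt_freq, 1 - alt_freq)


def is_rare(maf, threshold=RARE_MAF_THRESHOLD):
    """Apply the abstract's rarity cut-off."""
    return maf < threshold


def penetrance(affected_carriers, total_carriers):
    """Proportion of variant carriers who show the phenotype."""
    return affected_carriers / total_carriers


def odds_ratio(carrier_cases, carrier_controls,
               noncarrier_cases, noncarrier_controls):
    """Odds ratio from a 2x2 table with a Wald 95% confidence interval."""
    a, b, c, d = (carrier_cases, carrier_controls,
                  noncarrier_cases, noncarrier_controls)
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's method); all cells must be non-zero.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - 1.96 * se)
    upper = exp(log(estimate) + 1.96 * se)
    return estimate, lower, upper


# Usage with invented counts (NOT data from the study).
maf = minor_allele_frequency(n_het=38, n_hom_alt=1, n_total=10_000)
print("MAF:", maf, "rare:", is_rare(maf))
print("Penetrance:", penetrance(affected_carriers=10, total_carriers=100))
print("OR and 95% CI:", odds_ratio(30, 70, 20, 80))
```

In a cohort that is not clinically ascertained, the penetrance proportion is computed over all carriers found in the population, which is why such an estimate can be lower than one derived from affected families.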
- Downloaded 1,628 times
- Download rankings, all-time:
  - Site-wide: 4,778 out of 84,241
  - In genetics: 360 out of 4,422
- Year to date:
  - Site-wide: 12,481 out of 84,241
- Since beginning of last month:
  - Site-wide: 8,625 out of 84,241