
Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 59,937 bioRxiv papers from 266,417 authors.

Genome-wide rare variant analysis for thousands of phenotypes in 54,000 exomes

By Elizabeth T. Cirulli, Simon White, Robert W Read, Gai Elhanan, William J Metcalf, Karen A Schlauch, Joseph J Grzymski, James Lu, Nicole L Washington

Posted 04 Jul 2019
bioRxiv DOI: 10.1101/692368

Defining the effects that rare variants can have on human phenotypes is essential to advancing our understanding of human health and disease. Large-scale human genetic analyses have thus far focused on common variants, but the development of large cohorts of deeply phenotyped individuals with exome sequence data has now made comprehensive analyses of rare variants possible. We analyzed the effects of rare (MAF<0.1%) variants on 3,166 phenotypes in 40,468 exome-sequenced individuals from the UK Biobank and performed replication as well as meta-analyses with 1,067 phenotypes in 13,470 members of the Healthy Nevada Project (HNP) cohort who underwent Exome+ sequencing at Helix. Our analyses of non-benign coding and loss-of-function (LoF) variants identified 78 gene-based associations that passed our statistical significance threshold (p<5×10⁻⁹). These are associations in which carrying any rare coding or LoF variant in the gene is associated with an enrichment for a specific phenotype, as opposed to GWAS-based associations of strictly single variants. Importantly, because of our rare variant-tailored methodology, which includes a step that optimizes the carrier frequency threshold for each phenotype based on prevalence, our results do not suffer from the test statistic inflation that is often seen with rare variant analyses of biobank-scale data. Of the 47 discovery associations whose phenotypes were represented in the replication cohort, 98% showed effects in the expected direction, and 45% attained formal replication significance (p<0.001). Six additional significant associations were identified in our meta-analysis of both cohorts.
Among the results, we confirm known associations of PCSK9 and APOB variation with LDL levels; we extend knowledge of variation in the TYRP1 gene, previously associated with blonde hair color only in Solomon Islanders, to blonde hair color in individuals of European ancestry; we show that PAPPA, a gene in which common variants had previously been associated with height via GWAS, contains rare variants that decrease height; and we make the novel discovery that STAB1 variation is associated with blood flow in the brain. Our results are available for download and interactive browsing in an app (https://ukb.research.helix.com). This comprehensive analysis of the effects of rare variants on human phenotypes marks one of the first steps in the next big phase of human genetics, where large, deeply phenotyped cohorts with next-generation sequence data will elucidate the effects of rare variants.
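The gene-based approach described in the abstract (treating anyone who carries any qualifying rare variant in a gene as a carrier, then testing carriers for enrichment of a phenotype) can be illustrated with a minimal sketch. The abstract does not say which statistical test the authors used, so the one-sided Fisher's exact test below is an assumption chosen for illustration only. The function names and toy data are hypothetical, and the sketch omits the prevalence-based carrier frequency threshold the authors mention. Only the 5×10⁻⁹ significance threshold comes from the text.

```python
from math import comb

# Gene-level significance threshold quoted in the abstract.
SIGNIFICANCE_THRESHOLD = 5e-9


def collapse_carriers(genotypes, qualifying_variants):
    """Collapse variants into gene-level carrier status.

    genotypes: dict mapping sample ID -> set of variant IDs that sample carries.
    qualifying_variants: set of rare coding/LoF variant IDs in one gene.
    Returns the set of samples carrying at least one qualifying variant.
    """
    return {
        sample
        for sample, variants in genotypes.items()
        if variants & qualifying_variants
    }


def fisher_enrichment_p(carrier_cases, carrier_controls,
                        noncarrier_cases, noncarrier_controls):
    """One-sided Fisher's exact p-value for carriers being enriched among cases.

    Assumed test for illustration; not confirmed as the authors' method.
    Sums hypergeometric probabilities for tables at least as extreme as the
    observed one (carrier_cases or more carriers among the cases).
    """
    n_cases = carrier_cases + noncarrier_cases
    n_carriers = carrier_cases + carrier_controls
    n_total = n_cases + carrier_controls + noncarrier_controls
    n_controls = n_total - n_cases

    upper = min(n_carriers, n_cases)
    numerator = sum(
        comb(n_cases, k) * comb(n_controls, n_carriers - k)
        for k in range(carrier_cases, upper + 1)
    )
    return numerator / comb(n_total, n_carriers)


def is_significant(p_value):
    """Apply the gene-level threshold from the abstract."""
    return p_value < SIGNIFICANCE_THRESHOLD


# Toy example with made-up samples and variant IDs.
toy_genotypes = {
    "s1": {"v1"},
    "s2": {"v2", "v7"},
    "s3": set(),
    "s4": {"v9"},
}
toy_carriers = collapse_carriers(toy_genotypes, {"v1", "v2"})
toy_p = fisher_enrichment_p(3, 1, 1, 3)
```

In the toy data, `s1` and `s2` are carriers because each holds at least one qualifying variant, regardless of which one; that is the difference from a single-variant GWAS test. A real analysis would also need covariate adjustment and handling of relatedness, which this sketch leaves out.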

Download data

  • Downloaded 963 times
  • Download rankings, all-time:
    • Site-wide: 7,194 out of 59,937
    • In genetics: 567 out of 3,412
  • Year to date:
    • Site-wide: 1,363 out of 59,937
  • Since beginning of last month:
    • Site-wide: 2,525 out of 59,937
