Estimating inflation in GWAS summary statistics due to variance distortion from cryptic relatedness

By Dominic Holland, Chun-Chieh Fan, Oleksandr Frei, Alexey A. Shadrin, Olav B. Smeland, V. S. Sundar, Ole A Andreassen, Anders Dale

Posted 17 Jul 2017
bioRxiv DOI: 10.1101/164939

Cryptic relatedness is an inherent feature of large genome-wide association studies (GWAS), and can give rise to considerable inflation in summary statistics for single nucleotide polymorphism (SNP) associations with phenotypes. It has proven difficult to disentangle these inflationary effects from true polygenic effects. Here we present results of a model that enables estimation of polygenicity, mean strength of association, and residual inflation in GWAS summary statistics. We show that there is substantial residual inflation in recent large GWAS of height and schizophrenia; correcting for this reduces the number of independent genome-wide significant loci from the reported values of 697 for height and 108 for schizophrenia to 368 and 61, respectively. In contrast, a larger GWAS of educational attainment shows no residual inflation. Additionally, we find that height has a relatively low polygenicity, with approximately 8,000 SNPs having a causal association, more than an order of magnitude fewer than previously reported. The residual inflation in GWAS summary statistics can be corrected using the standard genomic control procedure with the estimated residual inflation factor.
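
The genomic control correction mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes per-SNP z-scores as input and divides each by the square root of the inflation factor (equivalently, divides each chi-square statistic by the factor), then recomputes two-sided p-values. The function name, the example z-scores, the inflation factor of 1.25, and the conventional 5e-8 genome-wide significance threshold are illustrative assumptions, not values or code from the paper, and the sketch does not cover how the paper's model estimates the residual inflation factor itself.

```python
import math

# Conventional genome-wide significance threshold (an assumption here;
# the abstract does not state the threshold used).
GENOME_WIDE_P = 5e-8


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


def genomic_control(z_scores, inflation_factor):
    """Apply genomic control to a list of z-scores.

    Each z-score is divided by sqrt(inflation_factor), which is the same
    as dividing the 1-df chi-square statistic z**2 by inflation_factor.
    Returns a list of (corrected_z, corrected_p) tuples.
    """
    if inflation_factor <= 0:
        raise ValueError("inflation_factor must be positive")
    scale = math.sqrt(inflation_factor)
    return [(z / scale, two_sided_p(z / scale)) for z in z_scores]


# Made-up z-scores and a hypothetical inflation factor, for illustration only.
example_z = [7.0, 6.0, 5.5, 2.0]
hypothetical_lambda = 1.25

n_before = sum(two_sided_p(z) < GENOME_WIDE_P for z in example_z)
corrected = genomic_control(example_z, hypothetical_lambda)
n_after = sum(p < GENOME_WIDE_P for _, p in corrected)

print(f"significant before correction: {n_before}")
print(f"significant after correction:  {n_after}")
```

Because the correction shrinks every test statistic by the same factor, SNPs that sit just above the significance threshold can drop below it, which is the mechanism behind the reduced counts of genome-wide significant loci reported in the abstract.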

Download data

  • Downloaded 2,516 times
  • Download rankings, all-time:
    • Site-wide: 9,175
    • In genomics: 865
  • Year to date:
    • Site-wide: 27,167
  • Since beginning of last month:
    • Site-wide: 36,243