Equivalence of LD-Score Regression and Individual-Level-Data Methods

By Ronald de Vlaming, Magnus Johannesson, Patrik KE Magnusson, Mohammad Arfan Ikram, Peter M Visscher

Posted 31 Oct 2017
bioRxiv DOI: 10.1101/211821

LD-score (LDSC) regression disentangles the contribution of polygenic signal, in terms of SNP-based heritability, and population stratification, in terms of a so-called intercept, to GWAS test statistics. Whereas LDSC regression uses summary statistics, methods like Haseman-Elston (HE) regression and genomic-relatedness-matrix (GRM) restricted maximum likelihood infer parameters such as SNP-based heritability from individual-level data directly. Therefore, these two types of methods are typically considered to be profoundly different. Nevertheless, recent work has revealed that LDSC and HE regression yield near-identical SNP-based heritability estimates when confounding stratification is absent. We now extend the equivalence; under the stratification assumed by LDSC regression, we show that the intercept can be estimated from individual-level data by transforming the coefficients of a regression of the phenotype on the leading principal components from the GRM. Using simulations, considering various degrees and forms of population stratification, we find that intercept estimates obtained from individual-level data are nearly equivalent to estimates from LDSC regression (R² > 99%). An empirical application corroborates these findings. Hence, LDSC regression is not profoundly different from methods using individual-level data; parameters that are identified by LDSC regression are also identified by methods using individual-level data. In addition, our results indicate that, under strong stratification, there is misattribution of stratification to the slope of LDSC regression, inflating estimates of SNP-based heritability from LDSC regression ceteris paribus. Hence, the intercept is not a panacea for population stratification. Consequently, LDSC-regression estimates should be interpreted with caution, especially when the intercept estimate is significantly greater than one.
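To make the slope and intercept discussed in the abstract concrete, the following is a minimal sketch of the summary-statistic side only. It is not the authors' code and does not reproduce their individual-level estimator. It assumes the commonly written LDSC model, in which the expected chi-square statistic of SNP j is intercept + (N · h² / M) · ℓ_j, with N the sample size, M the number of SNPs, and ℓ_j the LD score. The simulated LD scores, the parameter values, the independence of SNPs, the unweighted least-squares fit, and the helper name `ldsc_fit` are all illustrative assumptions; practical LDSC software uses weighted regression and block-jackknife standard errors.

```python
import numpy as np


def ldsc_fit(chi2, ld_scores, n_samples, n_snps):
    """Unweighted OLS of chi-square statistics on LD scores (hypothetical helper).

    Returns (intercept, h2), where h2 = slope * M / N under the assumed model
    E[chi2_j] = intercept + (N * h2 / M) * ld_score_j.
    """
    design = np.column_stack([np.ones_like(ld_scores), ld_scores])
    coef = np.linalg.lstsq(design, chi2, rcond=None)[0]
    intercept, slope = coef
    return intercept, slope * n_snps / n_samples


# Simulated summary statistics; every value below is an assumption.
rng = np.random.default_rng(0)
n_samples, n_snps = 50_000, 1_000_000
true_h2, true_intercept = 0.4, 1.1

# Toy LD scores with mean 100 (gamma-distributed, purely illustrative).
ld_scores = rng.gamma(shape=2.0, scale=50.0, size=n_snps)

# Expected chi-square per SNP under the assumed LDSC model.
expected = true_intercept + n_samples * true_h2 / n_snps * ld_scores

# Treat SNPs as independent: each statistic is its expectation times a
# chi-square draw with 1 degree of freedom (a simplification of real GWAS data).
chi2 = expected * rng.chisquare(df=1, size=n_snps)

intercept_hat, h2_hat = ldsc_fit(chi2, ld_scores, n_samples, n_snps)
print(f"intercept estimate: {intercept_hat:.3f}, h2 estimate: {h2_hat:.3f}")
```

In this toy setting the stratification signal enters only through the intercept, so the slope recovers heritability. The abstract's caution concerns the case where strong stratification also leaks into the slope, which this sketch does not simulate.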

Download data

  • Downloaded 1,655 times
  • Download rankings, all-time:
    • Site-wide: 12,580
    • In genetics: 551
  • Year to date:
    • Site-wide: 34,461
  • Since beginning of last month:
    • Site-wide: 29,785
