Fast estimation of genetic correlation for Biobank-scale data

By Yue Wu, Kathryn S. Burch, Andrea Ganna, Paivi Pajukanta, Bogdan Pasaniuc, Sriram Sankararaman

Posted 20 Jan 2019
bioRxiv DOI: 10.1101/525055

Genetic correlation is an important parameter in efforts to understand the relationships among complex traits. Current methods that estimate genetic correlation from individual-level genotype data are difficult to scale to large datasets. Methods that analyze summary data are computationally efficient but tend to yield less precise estimates of genetic correlation. We propose SCORE, a randomized method-of-moments estimator of genetic correlation that is both scalable and accurate. SCORE obtains more precise estimates of genetic correlation than summary-statistic methods that can be applied at scale, achieving a 50% reduction in standard error relative to LD-score regression (LDSC) and a 26% reduction relative to high-definition likelihood (HDL), averaged over all simulations. The efficiency of SCORE enables computation of genetic correlations on the UK Biobank dataset, consisting of ~300K individuals and ~500K SNPs, in a few hours, which is orders of magnitude faster than methods that analyze individual-level data, such as GCTA. Across 780 pairs of traits in 291,273 unrelated white British individuals in the UK Biobank, SCORE identifies significant genetic correlations for 200 pairs of traits that LDSC does not, in addition to the 245 pairs identified by both methods.
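
The abstract names the technique but does not give the estimator itself. As a rough illustration of what a randomized method-of-moments estimator of genetic correlation can look like, the sketch below fits the genetic variance and covariance components by matching second moments, and approximates the trace of the squared relatedness matrix with random Gaussian probe vectors so that the N x N matrix is never formed. Everything here is an assumption made for illustration: the function names `simulate_traits` and `mom_genetic_correlation`, the Gaussian probes, the column-standardized genotype matrix, and the toy simulated data. It is not the authors' implementation.

```python
import numpy as np


def simulate_traits(n=2000, m=200, h2=0.5, rg=0.6, seed=1):
    """Hypothetical toy data: a standardized genotype-like matrix and two traits."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, m))
    X = (X - X.mean(axis=0)) / X.std(axis=0)  # column-standardize
    # Per-SNP effects with variance h2/m and correlation rg across the traits.
    cov = (h2 / m) * np.array([[1.0, rg], [rg, 1.0]])
    B = rng.multivariate_normal(np.zeros(2), cov, size=m)  # shape (m, 2)
    E = rng.standard_normal((n, 2)) * np.sqrt(1.0 - h2)    # independent noise
    Y = X @ B + E
    Y -= Y.mean(axis=0)
    return X, Y[:, 0], Y[:, 1]


def mom_genetic_correlation(X, y1, y2, n_probes=50, seed=0):
    """Method-of-moments genetic correlation with a randomized trace estimate.

    Assumes cov(y_a, y_b) = c_g * K + c_e * I with K = X X^T / M.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape

    def apply_K(v):
        # K v computed as X (X^T v) / M, without building the N x N matrix K.
        return X @ (X.T @ v) / m

    # Randomized estimate of tr(K^2): average of ||K z||^2 over Gaussian probes z.
    Z = rng.standard_normal((n, n_probes))
    tr_K2 = np.sum(apply_K(Z) ** 2) / n_probes
    tr_K = np.sum(X ** 2) / m  # exact tr(K)

    # Normal equations shared by all three (co)variance fits.
    A = np.array([[tr_K2, tr_K], [tr_K, float(n)]])

    def genetic_component(a, b):
        rhs = np.array([a @ apply_K(b), a @ b])
        return np.linalg.solve(A, rhs)[0]

    vg1 = genetic_component(y1, y1)  # genetic variance of trait 1
    vg2 = genetic_component(y2, y2)  # genetic variance of trait 2
    cg = genetic_component(y1, y2)   # genetic covariance
    return cg / np.sqrt(vg1 * vg2)


X, y1, y2 = simulate_traits()
rg_hat = mom_genetic_correlation(X, y1, y2)
print(f"estimated genetic correlation (simulated with rg = 0.6): {rg_hat:.2f}")
```

In this sketch each probe costs two matrix-vector products with the genotype matrix, so the work grows with the number of genotype entries times the number of probes rather than with the square of the sample size. Estimated variance components can come out negative on small or noisy data, in which case the ratio is undefined; a practical implementation would need to handle that case.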
