Rxivist logo

Learning polygenic scores for human blood cell traits

By Yu Xu, Dragana Vuckovic, Scott C Ritchie, Parsa Akbari, Tao Jiang, Jason Grealey, Adam S Butterworth, Willem H. Ouwehand, David J. Roberts, Emanuele Di Angelantonio, John Danesh, Nicole Soranzo, Michael Inouye

Posted 18 Feb 2020
bioRxiv DOI: 10.1101/2020.02.17.952788

Polygenic scores (PGSs) for blood cell traits can be constructed using summary statistics from genome-wide association studies. As the selection of variants and the modelling of their interactions in PGSs may be limited by univariate analysis, therefore, such a conventional method may yield sub-optional performance. This study evaluated the relative effectiveness of four machine learning and deep learning methods, as well as a univariate method, in the construction of PGSs for 26 blood cell traits, using data from UK Biobank (n=~400,000) and INTERVAL (n=~40,000). Our results showed that learning methods can improve PGSs construction for nearly every blood cell trait considered, with this superiority explained by the ability of machine learning methods to capture interactions among variants. This study also demonstrated that populations can be well stratified by the PGSs of these blood cell traits, even for traits that exhibit large differences between ages and sexes, suggesting potential for disease prevention. As our study found genetic correlations between the PGSs for blood cell traits and PGSs for several common human diseases (recapitulating well-known associations between the blood cell traits themselves and certain diseases), it suggests that blood cell traits may be indicators or/and mediators for a variety of common disorders via shared genetic variants and functional pathways.

Download data

  • Downloaded 1,571 times
  • Download rankings, all-time:
    • Site-wide: 11,523
    • In genetics: 539
  • Year to date:
    • Site-wide: 18,836
  • Since beginning of last month:
    • Site-wide: 25,017

Altmetric data

Downloads over time

Distribution of downloads per paper, site-wide