Winner's curse correction and variable thresholding improve performance of polygenic risk modeling based on summary-level data from genome-wide association studies
Chao Agnes Hsiung,
Victoria K. Cortessis,
Margaret R Karagas,
Neil E. Caporaso,
Alison P. Klein,
Alan R Sanders,
Robert E. Schoen,
MGS (Molecular Genetics of Schizophrenia) GWAS Consortium,
GECCO (The Genetics and Epidemiology of Colorectal Cancer Consortium),
The GAME-ON/TRICL (Transdisciplinary Research in Cancer of the Lung) GWAS Consortium,
PRACTICAL (PRostate cancer AssoCiation group To Investigate Cancer Associated aLterations) Consortium,
PanScan and PanC4 Consortium,
The GAMEON/ELLIPSE Consortium,
Laufey T Amundadottir,
Maria Teresa Landi,
Douglas F Levinson,
Stephen J. Chanock,
Posted 10 Jan 2016
bioRxiv DOI: 10.1101/034082
Heritability analysis suggests that genome-wide association studies (GWAS) have the potential to improve genetic risk prediction for complex diseases. The polygenic risk score (PRS) is a widely used modelling technique that requires only summary-level data from the discovery samples. We propose two modifications to improve the performance of PRS. First, we propose threshold-dependent winner's curse adjustments for the marginal association coefficients that are used to weight the SNPs in PRS. Second, to exploit external functional/annotation knowledge that might identify subsets of SNPs highly enriched for association signals, we consider using variable thresholds for SNP selection. We applied our methods to the GWAS summary-level data of fourteen complex diseases. Our analysis shows that while a simple winner's curse correction uniformly enhances the performance of the models across traits, incorporation of functional SNPs was beneficial for only selected traits. Compared to the standard PRS algorithm, the proposed methods in combination lead to substantial efficiency gain (25-50% increase in the prediction R²) for five out of fifteen diseases. As an example, for GWAS of type 2 diabetes, the lasso-based winner's curse correction improves prediction R² from 2.29% based on standard PRS to 3.1% (P=0.0017), and incorporating functional annotation data further improves R² to 3.53% (P=2.0E-5). Our simulation studies further clarify why differential treatment of certain categories of functional SNPs, even when they are shown to be highly enriched for GWAS heritability, does not lead to proportionate improvement in genetic risk prediction, owing to non-uniform linkage disequilibrium structure.
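The two modifications described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the lasso-type winner's curse correction amounts to soft-thresholding each marginal coefficient at the effect-size cutoff implied by the p-value threshold, and that variable thresholding means applying a more lenient threshold to SNPs carrying a functional annotation. All function names, field names, and numeric values below are hypothetical.

```python
from statistics import NormalDist


def soft_threshold_weight(beta, se, alpha):
    """Shrink a marginal effect estimate by the selection cutoff (assumed form).

    The cutoff on the effect-size scale is lam = z_cut * se, where z_cut is
    the two-sided normal quantile for the p-value threshold alpha. A SNP whose
    p-value does not pass alpha gets weight zero.
    """
    z_cut = NormalDist().inv_cdf(1 - alpha / 2)
    lam = z_cut * se
    shrunk = max(abs(beta) - lam, 0.0)
    return shrunk if beta >= 0 else -shrunk


def prs_weights(snps, alpha_default, alpha_annotated):
    """Select and weight SNPs, using a separate threshold for annotated SNPs."""
    weights = {}
    for snp in snps:
        alpha = alpha_annotated if snp["annotated"] else alpha_default
        w = soft_threshold_weight(snp["beta"], snp["se"], alpha)
        if w != 0.0:
            weights[snp["id"]] = w
    return weights


def polygenic_score(genotypes, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2) over the selected SNPs."""
    return sum(w * genotypes.get(snp_id, 0) for snp_id, w in weights.items())


# Toy summary statistics (made-up values, for illustration only).
toy_snps = [
    {"id": "rs1", "beta": 0.10, "se": 0.02, "annotated": False},
    {"id": "rs2", "beta": 0.06, "se": 0.02, "annotated": True},
    {"id": "rs3", "beta": 0.06, "se": 0.02, "annotated": False},
]
toy_weights = prs_weights(toy_snps, alpha_default=1e-4, alpha_annotated=0.01)
toy_score = polygenic_score({"rs1": 2, "rs2": 1, "rs3": 1}, toy_weights)
```

In this toy setup, rs2 and rs3 have identical summary statistics, but only rs2 enters the score because its annotation gives it the more lenient threshold. In practice the two thresholds would be tuned on validation data, and SNPs would typically be pruned for linkage disequilibrium first; both steps are omitted here.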
- Downloaded 945 times
- Download rankings, all-time:
- Site-wide: 25,073
- In genetics: 1,179
- Year to date:
- Site-wide: 109,054
- Since beginning of last month:
- Site-wide: 91,329