Improved analyses of GWAS summary statistics by reducing data heterogeneity and errors

By Wenhan Chen, Yang Wu, Zhili Zheng, Ting Qi, Peter M Visscher, Zhihong Zhu, Jian Yang

Posted 12 Jul 2020
bioRxiv DOI: 10.1101/2020.07.09.196535

Summary statistics from genome-wide association studies (GWAS) have facilitated the development of various summary-data-based methods, which typically require a reference sample for linkage disequilibrium (LD) estimation. Analyses using these methods may be biased by errors in the GWAS summary data and by heterogeneity between the GWAS and the LD reference. Here we propose a quality control method, DENTIST, that leverages LD among genetic variants to detect and eliminate errors in the GWAS or LD reference and heterogeneity between the two. Through simulations, we demonstrate that DENTIST substantially reduces the false-positive rate (FPR) in detecting secondary signals in summary-data-based conditional and joint (COJO) association analysis, especially for imputed rare variants (FPR reduced from >28% to <2% in the presence of heterogeneity between the GWAS and the LD reference). We further show that DENTIST can improve other summary-data-based analyses, such as fine-mapping and the integrative analysis of GWAS and expression quantitative trait locus data.
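The general idea of using LD to flag inconsistent summary statistics can be pictured with a toy check. The sketch below is an illustration under stated assumptions, not the DENTIST implementation: it assumes that, in the absence of errors, the z-scores of two variants are jointly normal with unit variance and correlation equal to their LD correlation r, so the z-score of one variant can be predicted from its neighbor and the squared standardized residual follows a chi-square distribution with 1 degree of freedom. The function name `ld_consistency_test` and the example values are hypothetical.

```python
import math


def ld_consistency_test(z_target, z_neighbor, r):
    """Compare one variant's z-score with its prediction from a neighbor in LD.

    Hypothetical helper for illustration only. Assumes the two z-scores are
    jointly normal with unit variance and correlation r when no errors exist.
    Returns (chi-square statistic with 1 df, p-value).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    predicted = r * z_neighbor      # expected z of the target given the neighbor
    residual_var = 1.0 - r * r      # variance left unexplained by the neighbor
    stat = (z_target - predicted) ** 2 / residual_var
    # Survival function of a chi-square with 1 df: P(|Z| > sqrt(stat)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Made-up z-scores: a target that agrees with its neighbor, and one that does not.
consistent = ld_consistency_test(z_target=4.1, z_neighbor=5.0, r=0.8)
suspicious = ld_consistency_test(z_target=-1.0, z_neighbor=5.0, r=0.8)
```

A small p-value suggests that the reported statistic, the LD estimate, or both are inconsistent for that variant. A practical method would condition on many neighboring variants at once rather than a single one.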

Download data

  • Downloaded 1,006 times
  • Download rankings, all-time:
    • Site-wide: 24,226
    • In genetics: 1,111
  • Year to date:
    • Site-wide: 11,287
  • Since beginning of last month:
    • Site-wide: 90,315
