Detecting genome-wide directional effects of transcription factor binding on polygenic disease risk
Yakir A. Reshef,
David R. Kelley,
Jacob C. Ulirsch,
Bryce van de Geijn,
Pier Francesco Palamara,
Ryan P. Adams,
Alkes L. Price
Posted 17 Oct 2017
bioRxiv DOI: 10.1101/204685 (published DOI: 10.1038/s41588-018-0196-7)
Biological interpretation of GWAS data frequently involves analyzing unsigned genomic annotations comprising SNPs involved in a biological process and assessing enrichment for disease signal. However, it is often possible to generate signed annotations quantifying whether each SNP allele promotes or hinders a biological process, e.g., binding of a transcription factor (TF). Directional effects of such annotations on disease risk enable stronger statements about causal mechanisms of disease than enrichments of corresponding unsigned annotations. Here we introduce a new method, signed LD profile regression, for detecting such directional effects using GWAS summary statistics, and we apply the method using 382 signed annotations reflecting predicted TF binding. We show via theory and simulations that our method is well-powered and is well-calibrated even when TF binding sites co-localize with other enriched regulatory elements, which can confound unsigned enrichment methods. We further validate our method by showing that it recovers known transcriptional regulators when applied to molecular QTL in blood. We then apply our method to eQTL in 48 GTEx tissues, identifying 651 distinct TF-tissue expression associations at per-tissue FDR<5%, including 30 associations with robust evidence of tissue specificity. Finally, we apply our method to 46 diseases and complex traits (average N=289,617) and identify 77 annotation-trait associations at per-trait FDR<5% representing 12 independent TF-trait associations, and we conduct gene-set enrichment analyses to characterize the underlying transcriptional programs. Our results implicate new causal disease genes (including causal genes at known GWAS loci), and in some cases suggest a detailed mechanism for a causal gene's effect on disease. Our method provides a new way to leverage functional data to draw inferences about disease etiology.