Constraints on eQTL fine mapping in the presence of multi-site local regulation of gene expression
Luke R. Lloyd-Jones,
Urko M Marigorta,
Grant W Montgomery,
Kenneth L Brigham,
Arshed A Quyyumi,
Peter M Visscher,
Joseph E. Powell,
Posted 29 Oct 2016
bioRxiv DOI: 10.1101/084293 (published DOI: 10.1534/g3.117.043752)
Posted 29 Oct 2016
Expression QTL (eQTL) detection has emerged as an important tool for unravelling of the relationship between genetic risk factors and disease or clinical phenotypes. Most studies use single marker linear regression to discover primary signals, followed by sequential conditional modeling to detect secondary genetic variants affecting gene expression. However, this approach assumes that functional variants are sparsely distributed and that close linkage between them has little impact on estimation of their precise location and magnitude of effects. In this study, we address the prevalence of secondary signals and bias in estimation of their effects by performing multi-site linear regression on two large human cohort peripheral blood gene expression datasets (each greater than 2,500 samples) with accompanying whole genome genotypes, namely the CAGE compendium of Illumina microarray studies, and the Framingham Heart Study Affymetrix data. Stepwise conditional modeling demonstrates that multiple eQTL signals are present for ~40% of over 3500 eGenes in both datasets, and the number of loci with additional signals reduces by approximately two-thirds with each conditioning step. However, the concordance of specific signals between the two studies is only ~30%, indicating that expression profiling platform is a large source of variance in effect estimation. Furthermore, a series of simulation studies imply that in the presence of multi-site regulation, up to 10% of the secondary signals could be artefacts of incomplete tagging, and at least 5% but up to one quarter of credible intervals may not even include the causal site, which is thus mis-localized. Joint multi-site effect estimation recalibrates effect size estimates by just a small amount on average. Presumably similar conclusions apply to most types of quantitative trait. Given the strong empirical evidence that gene expression is commonly regulated by more than one variant, we conclude that the fine-mapping of causal variants needs to be adjusted for multi-site influences, as conditional estimates can be highly biased by interference among linked sites.
- Downloaded 651 times
- Download rankings, all-time:
- Site-wide: 37,997
- In genetics: 1,809
- Year to date:
- Site-wide: 122,506
- Since beginning of last month:
- Site-wide: 108,825
Downloads over time
Distribution of downloads per paper, site-wide
- 27 Nov 2020: The website and API now include results pulled from medRxiv as well as bioRxiv.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!