Bayesian reassessment of the epigenetic architecture of complex traits
Daniel Trejo Banos,
Daniel L. McCartney,
Rosie M. Walker,
Stewart W Morris,
David J. Porteous,
Allan F. McRae,
Naomi R Wray,
Peter M Visscher,
Chris S. Haley,
Kathryn L Evans,
Andrew M. McIntosh,
Riccardo E. Marioni,
Matthew R. Robinson
Posted 22 Oct 2018
bioRxiv DOI: 10.1101/450288
Epigenetic DNA modification is partly under genetic control and occurs in response to a wide range of environmental exposures. Linking epigenetic marks to clinical outcomes may provide greater insight into the underlying molecular processes of disease, assist in the identification of therapeutic targets, and improve risk prediction. Here, we present a statistical approach, based on Bayesian inference, that estimates associations between disease risk and all measured epigenetic probes jointly, automatically controlling for both data structure (including cell-count effects, relatedness, and experimental batch effects) and correlations among probes. We benchmark our approach in a simulation study, finding improved estimation of probe associations over existing approaches across a wide range of scenarios. Our method estimates the total proportion of disease risk captured by epigenetic probe variation. When we apply it to measures of body mass index (BMI) and cigarette consumption behaviour in 5,101 individuals, we find that 66.7% (95% CI 60.0-72.8) of the variation in BMI and 67.7% (95% CI 58.4-76.9) of the variation in cigarette consumption can be captured by methylation array data from whole blood, independent of the variation explained by single nucleotide polymorphism markers. We find novel associations: smoking behaviour is associated, with >95% posterior inclusion probability, with a methylation probe at MNDA, a myeloid cell nuclear differentiation antigen gene previously implicated as a biomarker for inflammation and non-Hodgkin lymphoma risk. We conduct unique genome-wide enrichment analyses, identifying blood cholesterol, lipid transport and sterol metabolism pathways for BMI, and response to xenobiotic stimulus and negative regulation of RNA polymerase II promoter transcription for smoking, all with >95% posterior inclusion probability of having methylation probes with associations >1.5 times larger than the average.
Finally, we improve phenotypic prediction in two independent cohorts by 28.7% for BMI and 10.2% for smoking, relative to a LASSO model. These results imply that probe measures may capture large amounts of variance because they are likely a consequence of the phenotype rather than a cause. As a result, omics data may enable accurate characterization of disease progression and identification of individuals who are on a path to disease. Our approach facilitates better understanding of the underlying epigenetic architecture of complex common disease and is applicable to any kind of genomics data.
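The abstract reports two kinds of posterior summary: posterior inclusion probabilities for individual probes, and a proportion of phenotypic variance captured by all probes with a 95% interval. The sketch below shows one common way such summaries can be computed from MCMC output. It is not the authors' implementation. It assumes a sparse prior under which an excluded probe has an effect of exactly zero in a given draw, and that each draw supplies a probe-attributable variance and a residual variance. All function names and the toy numbers are hypothetical.

```python
from statistics import mean


def posterior_inclusion_probabilities(effect_draws):
    """Fraction of posterior draws in which each probe has a non-zero effect.

    Assumption: excluded probes are stored as exactly 0.0 in a draw.
    """
    n_draws = len(effect_draws)
    n_probes = len(effect_draws[0])
    return [
        sum(1 for draw in effect_draws if draw[j] != 0.0) / n_draws
        for j in range(n_probes)
    ]


def proportion_of_variance(probe_var_draws, residual_var_draws):
    """Per-draw proportion of phenotypic variance captured by the probes."""
    return [g / (g + e) for g, e in zip(probe_var_draws, residual_var_draws)]


def summarize(draws, lower=0.025, upper=0.975):
    """Posterior mean and a crude nearest-rank equal-tailed interval."""
    ordered = sorted(draws)
    last = len(ordered) - 1
    return mean(ordered), ordered[round(lower * last)], ordered[round(upper * last)]


# Toy, made-up draws (4 MCMC samples, 3 probes); not data from the study.
effect_draws = [
    [0.21, 0.0, 0.0],
    [0.18, 0.0, 0.05],
    [0.25, 0.0, 0.0],
    [0.19, 0.0, 0.0],
]
pips = posterior_inclusion_probabilities(effect_draws)

probe_var_draws = [0.6, 0.7, 0.65, 0.7]
residual_var_draws = [0.4, 0.3, 0.35, 0.3]
pve_draws = proportion_of_variance(probe_var_draws, residual_var_draws)
pve_mean, pve_lo, pve_hi = summarize(pve_draws)
```

In this toy example the first probe is included in every draw and the second in none, so a threshold such as the >95% posterior inclusion probability quoted above would select only the first. A real analysis would use thousands of draws after burn-in rather than four.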
- Downloaded 1,255 times
- Download rankings, all-time:
  - Site-wide: 8,449 out of 94,912
  - In genomics: 1,267 out of 5,955
- Year to date:
  - Site-wide: 4,181 out of 94,912
- Since beginning of last month:
  - Site-wide: 38,730 out of 94,912