Developing a network view of type 2 diabetes risk pathways through integration of genetic, genomic and functional data
Kyle J Gaulton, Martijn van de Bunt, Anna L Gloyn, Mark I McCarthy
Posted 20 Jun 2018
bioRxiv DOI: 10.1101/350181 (published DOI: 10.1186/s13073-019-0628-8)
Genome-wide association studies (GWAS) have identified several hundred susceptibility loci for type 2 diabetes (T2D). One critical, but unresolved, issue is the extent to which the mechanisms through which these diverse signals influence T2D predisposition converge on a limited set of biological processes. Answering this is difficult because the causal variants identified by GWAS mostly fall into non-coding sequence, which complicates the task of defining the effector transcripts through which they operate. Here, we describe an analytical pipeline that addresses this question. First, we integrate multiple sources of genetic, genomic, and biological data to assign positional candidacy scores to the genes that map to T2D GWAS signals. Second, we introduce genes with high scores as seeds within a network optimization algorithm (the asymmetric prize-collecting Steiner tree approach), which uses external, experimentally confirmed protein-protein interaction (PPI) data to generate high-confidence subnetworks. Third, we use GWAS data to test the T2D-association enrichment of the "non-seed" proteins introduced into the network, as a measure of the overall functional connectivity of the network. We find: (a) non-seed proteins in the resulting T2D protein-interaction network (comprising 705 nodes) are enriched for association with T2D (p=0.0014) but not with control traits; (b) T2D enrichment is stronger for islets than for other tissues when we use RNA expression data to generate tissue-specific PPI networks; and (c) enrichment is enhanced (p=3.9×10⁻⁵) when we combine analysis of the islet-specific PPI network with a focus on the subset of T2D GWAS loci that act through defective insulin secretion. These analyses reveal a pattern of non-random functional connectivity between causal candidate genes at T2D GWAS loci, and highlight the products of genes including YWHAG, SMAD4 and CDK2 as contributors to T2D-relevant islet dysfunction. The approach we describe can be applied to other complex genetic and genomic data sets, facilitating integration of diverse data types into disease-associated networks.
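The three steps in the abstract (score genes, connect high-scoring seeds through PPI data, test the non-seed nodes for enrichment) can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' pipeline: all gene names, scores, edges, and association values are invented; the seed threshold of 0.5 is arbitrary; the union of shortest paths between seeds stands in for the asymmetric prize-collecting Steiner tree optimization; and the permutation test on mean association score is an assumed stand-in for whatever enrichment statistic the paper uses.

```python
import random
from collections import deque

# Hypothetical toy inputs: none of these values come from the paper.
candidacy = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.7, "E": 0.2, "F": 0.05}
ppi_edges = [("A", "C"), ("C", "B"), ("B", "E"), ("E", "D"), ("A", "F")]
# Assumed gene-level association scores, e.g. -log10 of a gene-level p-value.
gene_assoc = {"A": 5.0, "B": 4.2, "C": 3.1, "D": 4.8, "E": 2.9, "F": 0.3}


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> neighbours mapping."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path(adj, src, dst):
    """Breadth-first search; returns the path as a node list, or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in sorted(adj.get(node, ())):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def connect_seeds(adj, seeds):
    """Union of pairwise shortest paths between seeds.

    A simple heuristic used here in place of prize-collecting Steiner
    tree optimization, which trades node prizes against edge costs.
    """
    nodes = set(seeds)
    ordered = sorted(seeds)
    for i, s in enumerate(ordered):
        for t in ordered[i + 1:]:
            path = shortest_path(adj, s, t)
            if path:
                nodes.update(path)
    return nodes


def permutation_p(assoc, non_seeds, background, n_perm=10000, seed=0):
    """One-sided permutation p-value for the mean score of non-seed nodes."""
    rng = random.Random(seed)
    k = len(non_seeds)
    observed = sum(assoc[g] for g in non_seeds) / k
    pool = sorted(background)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(pool, k)
        if sum(assoc[g] for g in sample) / k >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Step 1: pick seeds by candidacy score (threshold is an assumption).
seeds = {g for g, score in candidacy.items() if score >= 0.5}
# Step 2: connect the seeds through the PPI graph.
adj = build_adjacency(ppi_edges)
network = connect_seeds(adj, seeds)
# Step 3: test whether the linking (non-seed) nodes carry association signal.
non_seeds = network - seeds
observed, p_value = permutation_p(gene_assoc, non_seeds, set(gene_assoc) - seeds)
print(sorted(non_seeds), round(observed, 2), p_value)
```

The point of testing only the non-seed nodes is that they were pulled into the network by the interaction data alone, so any association signal they carry is independent evidence that the seeds are functionally connected. With a background of only three genes the toy p-value is not meaningful; a real analysis would draw from a genome-wide set of PPI nodes.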
- Downloaded 1,337 times
- Download rankings, all-time:
  - Site-wide: 7,072 out of 89,194
  - In bioinformatics: 1,253 out of 8,418
- Year to date:
  - Site-wide: 52,794 out of 89,194
- Since beginning of last month:
  - Site-wide: 47,568 out of 89,194