Leveraging epigenomes and three-dimensional genome organization for interpreting regulatory variation

By Brittany Anne Baur, Junha Shin, Jacob Schreiber, Shilu Zhang, Yi Zhang, Mohith Manjunath, Jun S. Song, William Stafford Noble, Sushmita Roy

Posted 30 Aug 2021
bioRxiv DOI: 10.1101/2021.08.29.458098

Understanding the impact of regulatory variants on complex phenotypes is a significant challenge because the genes and pathways targeted by such variants are typically unknown. Furthermore, a regulatory variant might influence a particular gene's expression in a cell type- or tissue-specific manner. Cell type-specific long-range regulatory interactions that occur between a distal regulatory sequence and a gene offer a powerful framework for understanding the impact of regulatory variants on complex phenotypes. However, high-resolution maps of such long-range interactions are available only for a handful of model cell lines. To address this challenge, we have developed L-HiC-Reg, a Random Forests-based regression method to predict high-resolution contact counts in new cell lines, and a network-based framework to identify candidate cell line-specific gene networks targeted by a set of variants from a genome-wide association study (GWAS). We applied our approach to predict interactions in 55 Roadmap Epigenome Consortium cell lines, which we used to interpret regulatory SNPs in the NHGRI GWAS catalogue. Using our approach, we performed an in-depth characterization of fifteen different phenotypes, including Schizophrenia, Coronary Artery Disease (CAD) and Crohn's disease. In CAD, we found differentially wired subnetworks consisting of known as well as novel gene targets of regulatory SNPs. Taken together, our compendium of interactions and the associated network-based analysis pipeline offer a powerful resource for leveraging long-range regulatory interactions to examine the context-specific impact of regulatory variation in complex phenotypes.
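The general idea of using long-range contacts to link a regulatory SNP to a candidate target gene can be pictured with a small sketch. This is not the authors' pipeline: the function names, the 5 kb bin size, the count threshold, and the toy inputs below are all assumptions made for illustration, and the contact counts stand in for whatever a predictor such as L-HiC-Reg would output.

```python
from collections import defaultdict

BIN_SIZE = 5_000  # assumed bin resolution in base pairs (illustrative only)


def to_bin(position, bin_size=BIN_SIZE):
    """Map a genomic coordinate to the index of the fixed-size bin containing it."""
    return position // bin_size


def snp_target_genes(snps, promoters, contacts, min_count=1.0):
    """Link each SNP to genes whose promoter bin contacts the SNP's bin.

    snps:      {snp_id: (chrom, position)}
    promoters: {gene: (chrom, position)}
    contacts:  {(chrom, bin_a, bin_b): predicted_count}  (hypothetical format)
    """
    # Index promoter bins so a contacted bin can be resolved to genes.
    genes_at = defaultdict(set)
    for gene, (chrom, pos) in promoters.items():
        genes_at[(chrom, to_bin(pos))].add(gene)

    # Symmetric bin adjacency, keeping only contacts at or above the threshold.
    neighbours = defaultdict(set)
    for (chrom, a, b), count in contacts.items():
        if count >= min_count:
            neighbours[(chrom, a)].add((chrom, b))
            neighbours[(chrom, b)].add((chrom, a))

    targets = {}
    for snp_id, (chrom, pos) in snps.items():
        genes = set()
        for other in neighbours[(chrom, to_bin(pos))]:
            genes |= genes_at[other]
        targets[snp_id] = sorted(genes)
    return targets


# Made-up toy inputs; none of these values come from the paper.
snps = {"snp_1": ("chr1", 12_300), "snp_2": ("chr1", 47_000)}
promoters = {"GENE_A": ("chr1", 101_000), "GENE_B": ("chr1", 250_500)}
contacts = {("chr1", 2, 20): 8.5, ("chr1", 9, 50): 0.4}

result = snp_target_genes(snps, promoters, contacts, min_count=1.0)
```

In this toy example `snp_1` is linked to `GENE_A` through a strong predicted contact, while `snp_2` gets no target because its only contact falls below the threshold. Running the same lookup with contact maps from different cell lines is one simple way to see how a SNP's candidate targets could vary by context.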

Download data

  • Downloaded 186 times
  • Download rankings, all-time:
    • Site-wide: 136,472
    • In bioinformatics: 10,685
  • Year to date:
    • Site-wide: 48,423
  • Since beginning of last month:
    • Site-wide: 8,737
