Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 62,719 bioRxiv papers from 278,291 authors.

Discovering epistatic feature interactions from neural network models of regulatory DNA sequences

By Peyton Greenside, Tyler Shimko, Polly Fordyce, Anshul Kundaje

Posted 17 Apr 2018
bioRxiv DOI: 10.1101/302711 (published DOI: 10.1093/bioinformatics/bty575)

Motivation: Transcription factors bind regulatory DNA sequences in a combinatorial manner to modulate gene expression. Deep neural networks (DNNs) can learn the cis-regulatory grammars encoded in regulatory DNA sequences associated with transcription factor binding and chromatin accessibility. Several feature attribution methods have been developed for estimating the predictive importance of individual features (nucleotides or motifs) in any input DNA sequence to its associated output prediction from a DNN model. However, these methods do not reveal higher-order feature interactions encoded by the models.

Results: We present a new method called Deep Feature Interaction Maps (DFIM) to efficiently estimate interactions between all pairs of features in any input DNA sequence. DFIM accurately identifies ground truth motif interactions embedded in simulated regulatory DNA sequences. DFIM identifies synergistic interactions between GATA1 and TAL1 motifs from in vivo TF binding models. DFIM reveals epistatic interactions involving nucleotides flanking the core motif of the Cbf1 TF in yeast from in vitro TF binding models. We also apply DFIM to regulatory sequence models of in vivo chromatin accessibility to reveal interactions between regulatory genetic variants and proximal motifs of target TFs as validated by TF binding quantitative trait loci. Our approach makes significant strides in improving the interpretability of deep learning models for genomics.

Availability: Code is available at: https://github.com/kundajelab/dfim

Download data

  • Downloaded 1,542 times
  • Download rankings, all-time:
    • Site-wide: 3,472 out of 62,719
    • In bioinformatics: 702 out of 6,243
  • Year to date:
    • Site-wide: 13,784 out of 62,719
  • Since beginning of last month:
    • Site-wide: 28,959 out of 62,719
