Denoising genome-wide histone ChIP-seq with convolutional neural networks

By Pang Wei Koh, Emma Pierson, Anshul Kundaje

Posted 07 May 2016
bioRxiv DOI: 10.1101/052118 (published DOI: 10.1093/bioinformatics/btx243)

Motivation: Chromatin immunoprecipitation sequencing (ChIP-seq) experiments are commonly used to obtain genome-wide profiles of histone modifications associated with different types of functional genomic elements. However, the quality of histone ChIP-seq data is affected by a myriad of experimental parameters such as the amount of input DNA, antibody specificity, ChIP enrichment, and sequencing depth. Making accurate inferences from chromatin profiling experiments that involve diverse experimental parameters is challenging.

Results: We introduce a convolutional denoising algorithm, Coda, that uses convolutional neural networks to learn a mapping from suboptimal to high-quality histone ChIP-seq data. This overcomes various sources of noise and variability, substantially enhancing and recovering signal when applied to low-quality chromatin profiling datasets across individuals, cell types, and species. Our method has the potential to improve data quality at reduced costs. More broadly, this approach -- using a high-dimensional discriminative model to encode a generative noise process -- is generally applicable to other biological domains where it is easy to generate noisy data but difficult to analytically characterize the noise or underlying data distribution.

Availability: https://github.com/kundajelab/coda
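To make the idea of learning a mapping from suboptimal to high-quality signal concrete, here is a toy sketch in Python. It is not Coda and does not use the code in the repository above: it fits a single linear 1D convolutional filter by least squares rather than training a convolutional neural network, and it runs on synthetic data. The simulated tracks, the Poisson noise model, the filter width of 25 bins, and every function name (`simulate`, `make_windows`, `fit_conv_filter`, `denoise`) are assumptions made for illustration only.

```python
import numpy as np

def simulate(n_bins, rng):
    """Hypothetical data: a smooth 'high-quality' track and a noisy version of it."""
    x = np.arange(n_bins)
    centers = rng.integers(0, n_bins, size=n_bins // 200)
    clean = np.full(n_bins, 0.5)  # flat background level
    for c in centers:
        # Gaussian-shaped peak standing in for an enriched region
        clean += 5.0 * np.exp(-0.5 * ((x - c) / 10.0) ** 2)
    # Assumed noise model: Poisson counts whose mean is the clean signal
    noisy = rng.poisson(clean).astype(float)
    return clean, noisy

def make_windows(signal, width):
    """Sliding windows of odd `width` centered on each bin, zero-padded at the edges."""
    half = width // 2
    padded = np.pad(signal, half)
    return np.stack([padded[i:i + width] for i in range(len(signal))])

def fit_conv_filter(noisy, clean, width=25):
    """Least-squares fit of one filter mapping each noisy window to the clean center value."""
    X = make_windows(noisy, width)
    w, *_ = np.linalg.lstsq(X, clean, rcond=None)
    return w

def denoise(noisy, w):
    """Apply the learned filter to a noisy track."""
    return make_windows(noisy, len(w)) @ w

rng = np.random.default_rng(0)
train_clean, train_noisy = simulate(4000, rng)  # paired data used to fit the filter
test_clean, test_noisy = simulate(4000, rng)    # held-out pair used for evaluation

w = fit_conv_filter(train_noisy, train_clean)
denoised = denoise(test_noisy, w)

mse_before = np.mean((test_noisy - test_clean) ** 2)
mse_after = np.mean((denoised - test_clean) ** 2)
print(f"MSE before: {mse_before:.3f}, after: {mse_after:.3f}")
```

The sketch mirrors only the supervised setup: paired noisy and clean tracks are used to fit the model, which is then applied to a held-out noisy track. A convolutional neural network replaces the single linear filter with stacked nonlinear filters, which is what would allow it to encode a more complex noise process.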

Download data

  • Downloaded 3,304 times
  • Download rankings, all-time:
    • Site-wide: 1,433 out of 84,043
    • In bioinformatics: 263 out of 8,055
  • Year to date:
    • Site-wide: 23,467 out of 84,043
  • Since beginning of last month:
    • Site-wide: 31,303 out of 84,043


