
CHIPS: A Snakemake pipeline for quality control and reproducible processing of chromatin profiling data

By Len Taing, Clara Cousins, Gali Bai, Cejas Paloma, Xintao Qiu, Myles Brown, Clifford A Meyer, X. Shirley Liu, Henry Long, Ming Tang

Posted 10 Mar 2021
bioRxiv DOI: 10.1101/2021.03.09.434676

Motivation: Chromatin profiles measured by ATAC-seq, ChIP-seq, or DNase-seq experiments can identify genomic regions critical in regulating gene expression and provide insights into biological processes such as disease and development. However, quality control and processing of chromatin profiling data involve many steps, and different bioinformatics tools are used at each step, which makes the analysis challenging to manage.

Results: We developed a Snakemake pipeline called CHIPS (CHromatin enrichment Processor) to streamline the processing of ChIP-seq, ATAC-seq, and DNase-seq data. The pipeline supports single- and paired-end data and can start from either FASTQ or BAM files. It includes basic steps such as read trimming, mapping, and peak calling. In addition, it calculates quality control metrics such as contamination profiles, the PCR bottleneck coefficient, the fraction of reads in peaks, the percentage of peaks overlapping the union of public DNaseI hypersensitivity sites, and the conservation profile of the peaks. For downstream analysis, it carries out peak annotation, motif finding, and regulatory potential calculation for all genes. The pipeline ensures that the processing is robust and reproducible.

Availability: CHIPS is available at https://github.com/liulab-dfci/CHIPS

Contact: mtang@ds.dfci.harvard.edu; henry_long@dfci.harvard.edu
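
Two of the quality control metrics named in the abstract, the PCR bottleneck coefficient (PBC) and the fraction of reads in peaks (FRiP), have simple definitions that a short sketch can make concrete. The Python below is illustrative only and is not taken from the CHIPS code base: the function names, the (chrom, position, strand) read representation, and the half-open (chrom, start, end) peak intervals are all assumptions made for this example. PBC is taken here in its commonly used form, the number of genomic locations covered by exactly one read divided by the number of distinct locations covered by at least one read.

```python
from collections import Counter, defaultdict


def pcr_bottleneck_coefficient(reads):
    """PBC = (locations with exactly one read) / (distinct locations with >= 1 read).

    `reads` is an iterable of hypothetical (chrom, position, strand) tuples,
    one per uniquely mapped read.
    """
    counts = Counter(reads)
    n_distinct = len(counts)
    if n_distinct == 0:
        raise ValueError("no reads supplied")
    n_single = sum(1 for c in counts.values() if c == 1)
    return n_single / n_distinct


def fraction_of_reads_in_peaks(reads, peaks):
    """FRiP = (reads whose position falls inside any peak) / (total reads).

    `peaks` is an iterable of hypothetical (chrom, start, end) tuples,
    treated as half-open intervals [start, end).
    """
    reads = list(reads)
    if not reads:
        raise ValueError("no reads supplied")

    # Group peak intervals by chromosome so each read only scans its own chromosome.
    peaks_by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        peaks_by_chrom[chrom].append((start, end))

    in_peaks = 0
    for chrom, pos, _strand in reads:
        if any(start <= pos < end for start, end in peaks_by_chrom[chrom]):
            in_peaks += 1
    return in_peaks / len(reads)


# Toy data: five reads, two of them PCR duplicates at chr1:100(+).
example_reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 150, "+"),
    ("chr1", 500, "-"),
    ("chr2", 40, "+"),
]
example_peaks = [("chr1", 90, 200)]

pbc = pcr_bottleneck_coefficient(example_reads)
frip = fraction_of_reads_in_peaks(example_reads, example_peaks)
print(f"PBC = {pbc:.2f}, FRiP = {frip:.2f}")
```

On the toy data, three of the four distinct locations carry a single read and three of the five reads fall inside the one peak. A real pipeline would read positions from BAM files and peaks from a peak caller's output, and would use an interval index rather than the linear scan shown here.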

Download data

  • Downloaded 541 times
  • Download rankings, all-time:
    • Site-wide: 68,895
    • In bioinformatics: 6,521
  • Year to date:
    • Site-wide: 124,031
  • Since beginning of last month:
    • Site-wide: 44,279
