
BoostMe accurately predicts DNA methylation values in whole-genome bisulfite sequencing of multiple human tissues

By Luli S. Zou, Michael R. Erdos, D. Leland Taylor, Peter S. Chines, Arushi Varshney, The McDonnell Genome Institute, Stephen C.J. Parker, Francis S. Collins, John P. Didion

Posted 23 Oct 2017
bioRxiv DOI: 10.1101/207506 (published DOI: 10.1186/s12864-018-4766-y)

Bisulfite sequencing is widely employed to study the role of DNA methylation in disease; however, the data suffer from biases caused by variable coverage depth. Here we describe BoostMe, a method for imputing low-quality DNA methylation estimates within whole-genome bisulfite sequencing (WGBS) data. BoostMe uses a gradient boosting algorithm, XGBoost, and leverages information from multiple samples for prediction. We find that BoostMe outperforms existing algorithms in both speed and accuracy when applied to WGBS of human tissues. We also show that imputation improves concordance between WGBS and the MethylationEPIC array at low WGBS depth, suggesting that imputation improves WGBS accuracy.
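To make the idea of imputing methylation values by gradient boosting concrete, here is a minimal sketch in plain Python. It is not BoostMe and does not use XGBoost: it fits a small ensemble of regression stumps to residuals under squared loss, which is the basic boosting scheme that XGBoost refines. The two features (a neighboring CpG's methylation and the mean methylation at the same CpG in other samples), the synthetic data, and all function names are assumptions made for illustration; the abstract states only that information from multiple samples is used.

```python
import random

def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lmean, rmean)
    if best is None:  # every feature is constant: fall back to a constant stump
        mean = sum(residuals) / len(residuals)
        return 0, float("inf"), mean, mean
    return best[1:]

def boost(X, y, n_rounds=30, learning_rate=0.3):
    """Gradient boosting for squared loss: each stump is fit to current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        j, t, lmean, rmean = fit_stump(X, residuals)
        stumps.append((j, t, lmean, rmean))
        preds = [p + learning_rate * (lmean if row[j] <= t else rmean)
                 for row, p in zip(X, preds)]
    return base, stumps, learning_rate

def predict(model, row):
    """Sum the shrunken stump outputs and clip to the valid range [0, 1]."""
    base, stumps, lr = model
    out = base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)
    return min(1.0, max(0.0, out))

def clip(v):
    return min(1.0, max(0.0, v))

def rmse(pred, truth):
    return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)) ** 0.5

# Synthetic CpG sites (hypothetical data): the true methylation level plus two
# noisy features, standing in for a neighboring CpG and an other-sample mean.
random.seed(0)
truth = [random.random() for _ in range(200)]
features = [[clip(m + random.gauss(0, 0.1)), clip(m + random.gauss(0, 0.1))]
            for m in truth]

# Train on sites treated as well covered; impute the held-out sites.
train_X, train_y = features[:150], truth[:150]
test_X, test_y = features[150:], truth[150:]

model = boost(train_X, train_y)
predictions = [predict(model, row) for row in test_X]

train_mean = sum(train_y) / len(train_y)
baseline_rmse = rmse([train_mean] * len(test_y), test_y)
model_rmse = rmse(predictions, test_y)
print("baseline RMSE:", round(baseline_rmse, 3))
print("boosted RMSE:", round(model_rmse, 3))
```

In practice a library such as XGBoost replaces the stump search with regularized trees and a much faster split finder; the sketch only shows the residual-fitting loop that the approach relies on.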

Download data

  • Downloaded 1,219 times
  • Download rankings, all-time:
    • Site-wide: 17,959
    • In genomics: 1,763
  • Year to date:
    • Site-wide: 66,116
  • Since beginning of last month:
    • Site-wide: 50,421
