
Generating realistic null hypothesis of cancer mutational landscapes using SigProfilerSimulator

By Erik N. Bergstrom, Mark Barnes, Iñigo Martincorena, Ludmil B. Alexandrov

Posted 14 Feb 2020
bioRxiv DOI: 10.1101/2020.02.13.948422

Performing a statistical test requires a null hypothesis. In cancer genomics, a key challenge is the fast generation of accurate somatic mutational landscapes that can be used as a realistic null hypothesis for making biological discoveries. Here we present SigProfilerSimulator, a powerful tool that is capable of simulating the mutational landscapes of thousands of cancer genomes at different resolutions within seconds. Applying SigProfilerSimulator to 2,144 whole-genome sequenced cancers reveals: (i) that most doublet base substitutions are not due to two adjacent single base substitutions but likely occur as single genomic events; (ii) that an extended sequence context of ±2 bp is required to more completely capture the patterns of substitution mutational signatures in human cancer; and (iii) information on the false-positive discovery rate of commonly used bioinformatics tools for detecting driver genes. SigProfilerSimulator's breadth of features allows one to construct a tailored null hypothesis and use it for evaluating the accuracy of other bioinformatics tools or for downstream statistical analysis for biological discoveries.
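To make the idea of a context-preserving null model concrete, the following is a minimal Python sketch. It is not SigProfilerSimulator's implementation or API: the function names `index_contexts` and `simulate_null`, the toy sequence, and the mutation tuples are all hypothetical and chosen for illustration only. The sketch assumes that a simulated null landscape can be approximated by moving each observed substitution to a randomly chosen position that shares the same flanking sequence context, with `flank=1` giving a ±1 bp context and `flank=2` the extended ±2 bp context mentioned in the abstract.

```python
import random
from collections import defaultdict


def index_contexts(sequence, flank=1):
    """Map each (2*flank + 1)-mer to the positions where it is centred."""
    positions = defaultdict(list)
    for i in range(flank, len(sequence) - flank):
        positions[sequence[i - flank:i + flank + 1]].append(i)
    return positions


def simulate_null(sequence, mutations, flank=1, rng=None):
    """Relocate each (position, alt_base) mutation to a random site
    that has the same flanking context as the observed site.

    Hypothetical helper for illustration; positions must lie at least
    `flank` bases away from either end of `sequence`.
    """
    rng = rng or random.Random()
    positions = index_contexts(sequence, flank)
    simulated = []
    for pos, alt in mutations:
        context = sequence[pos - flank:pos + flank + 1]
        # The observed site is always a candidate, so the list is never empty.
        simulated.append((rng.choice(positions[context]), alt))
    return simulated


# Toy data (assumed, not from the paper): a short sequence and two substitutions.
genome = "ACGTACGTTACGATACGCATACGTACGA"
observed = [(5, "T"), (13, "G")]

# One simulated replicate at the +/-1 bp resolution.
simulated = simulate_null(genome, observed, flank=1, rng=random.Random(0))
```

Repeating `simulate_null` many times would give a set of replicates against which an observed statistic (for example, the number of mutations falling in a gene) could be compared. Raising `flank` narrows the set of candidate sites, which is the trade-off behind simulating at different resolutions.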

Download data

  • Downloaded 277 times
  • Download rankings, all-time:
    • Site-wide: 56,186 out of 88,741
    • In genomics: 4,487 out of 5,665
  • Year to date:
    • Site-wide: 9,570 out of 88,741
  • Since beginning of last month:
    • Site-wide: 25,180 out of 88,741
