Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor
By S M Ashiqul Islam, Yang Wu, Marcos Diaz-Gay, Erik N. Bergstrom, Yudou He, Mark Barnes, Mike Vella, Jingwei Wang, Jon W Teague, Peter Clapham, Sarah Moody, Sergey Senkin, Yun Rose Li, Laura Riva, Tongwu Zhang, Andreas J. Gruber, Raviteja Vangara, Christopher D Steele, Burcak Otlu, Azhar Khandekar, Ammal Abbasi, Laura Humphreys, Natalia Syulyukina, Samuel W Brady, Boian Stoianov Alexandrov, Nischalan Pillay, Jinghui Zhang, David J. Adams, Inigo Marticorena, David C Wedge, Maria Teresa Landi, Paul Brennan, Michael R. Stratton, Steven G Rozen, Ludmil B. Alexandrov
Posted 13 Dec 2020
bioRxiv DOI: 10.1101/2020.12.13.422570
Mutational signature analysis is commonly performed in genomic studies surveying cancer and normal somatic tissues. Here we present SigProfilerExtractor, an automated tool for accurate de novo extraction of mutational signatures for all types of somatic mutations. Benchmarking on 33 distinct scenarios, encompassing 1,106 simulated signatures operative in more than 200,000 synthetic genomes, demonstrates that SigProfilerExtractor outperforms ten other tools across all datasets, with and without noise. For simulations with 5% noise, reflecting high-quality genomic datasets, SigProfilerExtractor outperforms other approaches by identifying between 20% and 50% more true-positive signatures while yielding more than five-fold fewer false-positive signatures. Applying SigProfilerExtractor to 2,778 whole-genome sequenced cancers reveals three previously missed mutational signatures. Two of these signatures are confirmed in independent cohorts, and one of them is associated with tobacco smoking. In summary, this report provides a reference tool for the analysis of mutational signatures, a comprehensive benchmark of bioinformatics tools for extracting mutational signatures, and several novel mutational signatures, including one putatively attributed to direct tobacco smoking mutagenesis in bladder cancer and in normal bladder epithelium.
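To make the benchmark's true-positive and false-positive counts concrete, here is a minimal sketch of how extracted signatures might be scored against simulated ground truth. The abstract does not state the matching criterion, so the cosine-similarity threshold of 0.9, the greedy one-to-one matching, and the names `cosine_similarity` and `score_extraction` are all assumptions made for illustration, not the authors' procedure. The toy signatures have four channels rather than a real mutation classification.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-negative vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_extraction(extracted, truth, threshold=0.9):
    """Greedily match extracted signatures to true ones, one-to-one.

    The threshold is an assumed cutoff, not a value from the source.
    Returns (true_positives, false_positives, false_negatives).
    """
    # Rank every extracted/true pair from most to least similar.
    pairs = sorted(
        (
            (cosine_similarity(e, t), i, j)
            for i, e in enumerate(extracted)
            for j, t in enumerate(truth)
        ),
        reverse=True,
    )
    used_extracted, used_truth = set(), set()
    for sim, i, j in pairs:
        if sim < threshold:
            break  # all remaining pairs are even less similar
        if i in used_extracted or j in used_truth:
            continue  # each signature may be matched at most once
        used_extracted.add(i)
        used_truth.add(j)
    tp = len(used_truth)
    fp = len(extracted) - tp  # extracted signatures with no true match
    fn = len(truth) - tp      # true signatures that were not recovered
    return tp, fp, fn


# Toy data: two true signatures, three extracted (one is a flat artifact).
truth = [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.7]]
extracted = [
    [0.68, 0.22, 0.10, 0.0],
    [0.25, 0.25, 0.25, 0.25],
    [0.0, 0.1, 0.25, 0.65],
]
tp, fp, fn = score_extraction(extracted, truth)
print(tp, fp, fn)  # prints: 2 1 0
```

Under these assumptions, a tool that recovers more true signatures raises the first count, while spurious signatures such as the flat profile above raise the second.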
Download data
- Downloaded 511 times
- Download rankings, all-time:
  - Site-wide: 51,889
  - In bioinformatics: 5,303
- Year to date:
  - Site-wide: 6,292
- Since beginning of last month:
  - Site-wide: 6,830