SNVPhyl: A Single Nucleotide Variant Phylogenomics pipeline for microbial genomic epidemiology

By Aaron Petkau, Philip Mabon, Cameron Sieffert, Natalie Knox, Jennifer Cabral, Mariam Iskander, Mark Iskander, Kelly Weedmark, Rahat Zaheer, Lee S. Katz, Celine Nadon, Aleisha Reimer, Eduardo Taboada, Robert G. Beiko, William Hsiao, Fiona Brinkman, Morag Graham, the IRIDA Consortium, Gary Van Domselaar

Posted 09 Dec 2016
bioRxiv DOI: 10.1101/092940 (published DOI: 10.1099/mgen.0.000116)

Motivation: The recent widespread application of whole-genome sequencing (WGS) for microbial disease investigations has spurred the development of new bioinformatics tools, including a notable proliferation of phylogenomics pipelines designed for infectious disease surveillance and outbreak investigation. Transitioning the use of WGS data out of the research lab and into the front lines of surveillance and outbreak response requires user-friendly, reproducible, and scalable pipelines that have been well validated.

Results: SNVPhyl (Single Nucleotide Variant Phylogenomics) is a bioinformatics pipeline for identifying high-quality SNVs and constructing a whole-genome phylogeny from a collection of WGS reads and a reference genome. Individual pipeline components are integrated into the Galaxy bioinformatics framework, enabling data analysis in a user-friendly, reproducible, and scalable environment. We show that SNVPhyl can detect SNVs with high sensitivity and specificity, and can identify and remove regions of high SNV density (indicative of recombination). SNVPhyl correctly distinguishes outbreak from non-outbreak isolates across a range of variant-calling settings and sequencing-coverage thresholds, as well as in the presence of contamination.

Availability: SNVPhyl is available as a Galaxy workflow, as Docker and virtual machine images, and as a Unix-based command-line application. SNVPhyl is released under the Apache 2.0 license and is available at http://snvphyl.readthedocs.io/ or at https://github.com/phac-nml/snvphyl-galaxy.
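
The abstract mentions removing regions of high SNV density as a proxy for recombination. A minimal Python sketch of one way such a filter could work is shown here. It is an illustration only, not SNVPhyl's actual implementation: the function names, the sliding-window approach, and the `window` and `threshold` values are all assumptions made for this example.

```python
from typing import List, Set


def high_density_positions(snv_positions: List[int],
                           window: int = 100,
                           threshold: int = 3) -> Set[int]:
    """Flag SNVs lying in any stretch shorter than `window` bp that
    contains at least `threshold` SNVs.

    `window` and `threshold` are hypothetical illustration values,
    not SNVPhyl defaults.
    """
    positions = sorted(snv_positions)
    flagged: Set[int] = set()
    start = 0
    for end, pos in enumerate(positions):
        # Advance the left edge until the span is shorter than `window` bp.
        while pos - positions[start] >= window:
            start += 1
        # Enough SNVs packed into this span: flag all of them.
        if end - start + 1 >= threshold:
            flagged.update(positions[start:end + 1])
    return flagged


def filter_snvs(snv_positions: List[int],
                window: int = 100,
                threshold: int = 3) -> List[int]:
    """Return the SNV positions that are not in a high-density region."""
    flagged = high_density_positions(snv_positions, window, threshold)
    return [p for p in snv_positions if p not in flagged]


if __name__ == "__main__":
    # Three SNVs within 80 bp are dropped; isolated SNVs are kept.
    print(filter_snvs([100, 150, 180, 5000, 12000, 12010]))
```

In this sketch the positions are assumed to be coordinates on a single reference sequence. A real pipeline would also need to track which contig each SNV belongs to.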

Download data

  • Downloaded 992 times
  • Download rankings, all-time:
    • Site-wide: 12,067 out of 92,330
    • In bioinformatics: 1,930 out of 8,659
  • Year to date:
    • Site-wide: 67,521 out of 92,330
  • Since beginning of last month:
    • Site-wide: 51,131 out of 92,330

