Rapid and accurate SNP genotyping of clonal bacterial pathogens with BioHansel
Marisa A. Rankin, Chad R. Laing, Roger P. Johnson, Gary Van Domselaar, John H.E. Nash
Posted 11 Jan 2020
bioRxiv DOI: 10.1101/2020.01.10.902056
BioHansel performs high-resolution genotyping of bacterial isolates by identifying phylogenetically informative single nucleotide polymorphisms (SNPs), also known as canonical SNPs, in whole genome sequencing (WGS) data. The application uses a fast k-mer matching algorithm to map pathogen WGS data to canonical SNPs contained in hierarchically structured schemas and assigns genotypes based on the detected SNP profile. Using modest computing resources, BioHansel types isolates from raw sequence reads or assembled contigs in a matter of seconds, making it attractive to public health, food safety, environmental, and agricultural authorities that wish to apply WGS methodologies in their surveillance, diagnostics, and research programs. BioHansel currently provides canonical SNP genotyping schemas for four prevalent Salmonella serovars (Typhi, Typhimurium, Enteritidis, and Heidelberg) as well as a schema for Mycobacterium tuberculosis. Users can also supply their own schemas for genotyping other organisms. Its quality assurance system assesses the validity of the genotyping results and can identify low-quality data, contaminated datasets, and misidentified organisms. BioHansel is designed to support surveillance, source attribution, risk assessment, diagnostics, and rapid screening for public health purposes, such as product recalls. BioHansel is an open-source application with packages available through PyPI and Conda and for the Galaxy workflow manager. In summary, BioHansel performs efficient, rapid, accurate, and high-resolution classification of bacterial genomes from sequence reads or assembled contigs on standard computing hardware. BioHansel is suitable for use as a general research tool as well as in fully operationalized WGS workflows at the front lines of infectious disease surveillance, diagnostics, and outbreak investigation and response.
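The general idea of k-mer matching against a hierarchical canonical SNP schema can be illustrated with a minimal Python sketch. This is not BioHansel's implementation: the schema `TOY_SCHEMA`, its 7-base k-mers, the genotype labels, and the function names are all invented for illustration, and the plain dictionary lookup stands in for whatever matching algorithm the tool actually uses. The sketch assumes each k-mer carries a canonical SNP allele and is labelled with a dot-separated genotype, and it only reports a genotype when every parent level was also detected.

```python
from typing import Dict, Optional, Set

# Hypothetical toy schema (not a real BioHansel schema): each k-mer contains a
# canonical SNP allele and maps to a dot-separated hierarchical genotype.
TOY_SCHEMA: Dict[str, str] = {
    "ACGTACG": "1",
    "TTGACCA": "1.2",
    "GGCATTC": "1.2.3",
    "CCATGGA": "2",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_matching_genotypes(sequence: str, schema: Dict[str, str]) -> Set[str]:
    """Slide a window over both strands and collect genotypes of matching k-mers."""
    k = len(next(iter(schema)))  # assumes all schema k-mers share one length
    hits: Set[str] = set()
    for strand in (sequence, reverse_complement(sequence)):
        for i in range(len(strand) - k + 1):
            genotype = schema.get(strand[i:i + k])
            if genotype is not None:
                hits.add(genotype)
    return hits


def call_genotype(hits: Set[str]) -> Optional[str]:
    """Pick the most specific genotype whose parent levels were all detected.

    Returns None when nothing consistent is found or when two different
    genotypes tie at the deepest level (an ambiguous profile).
    """
    consistent = [
        g for g in hits
        if all(".".join(g.split(".")[:n]) in hits for n in range(1, g.count(".") + 1))
    ]
    if not consistent:
        return None
    deepest = max(g.count(".") for g in consistent)
    best = [g for g in consistent if g.count(".") == deepest]
    return best[0] if len(best) == 1 else None


# Toy contig containing the k-mers for genotypes 1, 1.2, and 1.2.3.
contig = "AAACGTACGAATTGACCAAAGGCATTCAA"
print(call_genotype(find_matching_genotypes(contig, TOY_SCHEMA)))  # prints 1.2.3
```

Requiring every parent level before accepting a genotype is one simple way to flag inconsistent SNP profiles in this toy model; the quality assurance checks in the actual tool are described by the authors and may work differently.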
- Downloaded 470 times
- Download rankings, all-time:
  - Site-wide: 35,404 out of 93,322
  - In bioinformatics: 4,444 out of 8,742
- Download rankings, year to date:
  - Site-wide: 5,526 out of 93,322
- Download rankings, since beginning of last month:
  - Site-wide: 15,994 out of 93,322