The relationship between transmission time and clustering methods in Mycobacterium tuberculosis epidemiology

By Conor J Meehan, Pieter Moris, Thomas A. Kohl, Jūlija Pečerska, Suriya Akter, Matthias Merker, Christian Utpatel, Patrick Beckert, Florian Gehre, Pauline Lempens, Tanja Stadler, Michel K Kaswa, Denise Kuhnert, Stefan Niemann, Bouke C. de Jong

Posted 16 Apr 2018
bioRxiv DOI: 10.1101/302232 (published DOI: 10.1016/j.ebiom.2018.10.013)

Background: Tracking recent transmission is a vital part of controlling widespread pathogens such as Mycobacterium tuberculosis. Multiple methods, each with specific performance characteristics, exist for detecting recent transmission chains, usually by clustering strains based on genotype similarities. With such a large variety of methods available, informed selection of an appropriate approach for determining transmissions within a given setting or time period is difficult.

Methods: This study combines whole genome sequence (WGS) data derived from 324 isolates collected 2005-2010 in Kinshasa, Democratic Republic of Congo (DRC), a high-endemic setting, with phylodynamics to unveil the timing of transmission events posited by a variety of standard genotyping methods. Clustering data based on Spoligotyping, 24-loci MIRU-VNTR typing, WGS-based single nucleotide polymorphism (SNP) typing and core genome multilocus sequence typing (cgMLST) were evaluated.

Findings: Our results suggest that clusters based on Spoligotyping could encompass transmission events that occurred over 70 years prior to sampling, while 24-loci MIRU-VNTR clusters often represented two or more decades of transmission. In contrast, WGS-based genotyping with low SNP or cgMLST allele thresholds allows for determination of recent transmission events in timespans of up to 10 years, e.g. for a 5 SNP/allele cut-off.

Interpretation: With the rapid uptake of WGS methods in surveillance and outbreak tracking, the findings obtained in this study can guide the selection of appropriate clustering methods for uncovering relevant transmission chains within a given time period. For high-resolution cluster analyses, WGS-SNP and cgMLST-based analyses have similar clustering/timing characteristics, even for data obtained from a high-incidence setting.
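To make the idea of a SNP cut-off concrete, the following is a minimal Python sketch of threshold-based clustering. It assumes single-linkage grouping, in which two isolates fall into the same cluster if they are connected by a chain of pairwise distances at or below the cut-off. The abstract does not describe the study's actual pipeline, so the function names (`snp_distance`, `cluster_by_threshold`), the toy sequences and the linkage rule are all illustrative assumptions.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count differing positions between two aligned, equal-length sequences.

    Simplification: every mismatch counts, including ambiguous bases.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def cluster_by_threshold(isolates, cutoff=5):
    """Single-linkage clustering of isolates at a pairwise SNP cut-off.

    `isolates` maps a (hypothetical) isolate name to its aligned sequence.
    Returns a list of clusters, each a sorted list of isolate names.
    """
    # Union-find structure: every isolate starts as its own cluster.
    parent = {name: name for name in isolates}

    def find(name):
        # Follow parent links to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Merge the clusters of every pair within the cut-off.
    for a, b in combinations(isolates, 2):
        if snp_distance(isolates[a], isolates[b]) <= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for name in isolates:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy alignment (invented data, not from the study).
toy_isolates = {
    "A": "AAAAAAAAAA",
    "B": "AAAAAAAAAT",
    "C": "AAAAAAAATT",
    "D": "TTTTTTTTAA",
}
print(cluster_by_threshold(toy_isolates, cutoff=1))
```

Under single linkage, isolates A and C end up in the same cluster at a cut-off of 1 even though they differ by 2 SNPs, because both are within 1 SNP of B. This chaining effect is one reason why a looser threshold can pull older transmission events into a single cluster.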

Download data

  • Downloaded 1,072 times
  • Download rankings, all-time:
    • Site-wide: 23,786
    • In epidemiology: 1,427
  • Year to date:
    • Site-wide: 107,598
  • Since beginning of last month:
    • Site-wide: 144,502