Accommodating individual travel history, global mobility, and unsampled diversity in phylogeography: a SARS-CoV-2 case study.

By Philippe Lemey, Samuel Hong, Verity Hill, Guy Baele, Chiara Poletto, Vittoria Colizza, Áine O’Toole, John T. McCrone, Kristian G. Andersen, Michael Worobey, Martha I. Nelson, Andrew Rambaut, Marc A Suchard

Posted 23 Jun 2020
bioRxiv DOI: 10.1101/2020.06.22.165464

Spatiotemporal bias in genome sequence sampling can severely confound phylogeographic inference based on discrete trait ancestral reconstruction. This has impeded our ability to accurately track the emergence and spread of SARS-CoV-2, which is the virus responsible for the COVID-19 pandemic. Despite the availability of staggering numbers of genomes on a global scale, evolutionary reconstructions of SARS-CoV-2 are hindered by the slow accumulation of sequence divergence over its relatively short transmission history. When confronted with these issues, incorporating additional contextual data may critically inform phylodynamic reconstructions. Here, we present a new approach to integrate individual travel history data in Bayesian phylogeographic inference and apply it to the early spread of SARS-CoV-2, while also including global air transportation data. We demonstrate that including travel history data for each SARS-CoV-2 genome yields more realistic reconstructions of virus spread, particularly when travelers from undersampled locations are included to mitigate sampling bias. We further explore the impact of sampling bias by incorporating unsampled sequences from undersampled locations in the analyses. Our reconstructions reinforce specific transmission hypotheses suggested by the inclusion of travel history data, but also suggest alternative routes of virus migration that are plausible within the epidemiological context but are not apparent with current sampling efforts. Although further research is needed to fully examine the performance of our new data integration approaches and to further improve them, they represent multiple new avenues for directly addressing the colossal issue of sample bias in phylogeographic inference.

Competing Interest Statement: The authors have declared no competing interest.

Download data

  • Downloaded 1,649 times
  • Download rankings, all-time:
    • Site-wide: 12,597
    • In evolutionary biology: 380
  • Year to date:
    • Site-wide: 45,935
  • Since beginning of last month:
    • Site-wide: 58,295
