Genome-wide association, prediction and heritability in bacteria

By Sudaraka Mallawaarachchi, Gerry Tonkin-Hill, Nicholas J Croucher, Paul Turner, Doug Speed, Jukka Corander, David Balding

Posted 04 Oct 2021
bioRxiv DOI: 10.1101/2021.10.04.462983

Advances in whole-genome genotyping and sequencing have allowed genome-wide analyses of association, prediction and heritability in many organisms. However, the application of such analyses to bacteria is still in its infancy, being limited by difficulties including the plasticity of bacterial genomes and their strong population structure. Here we propose, and validate using simulations, a suite of genome-wide analyses for bacteria. We combine methods from human genetics and previous bacterial studies, including linear mixed models, elastic net and LD-score regression, and introduce innovations such as frequency-based allele coding, testing for both insertion/deletion and nucleotide effects, and partitioning heritability by genome region. We then analyse three phenotypes of a major human pathogen, Streptococcus pneumoniae, including the first analyses of minimum inhibitory concentrations (MIC) for each of two antibiotics, penicillin and ceftriaxone. We show that these are highly heritable, leading to high prediction accuracy, which is explained by many genetic associations identified under good control of population structure effects. In the case of ceftriaxone MIC, these results are surprising because none of the isolates was resistant according to the inhibition zone diameter threshold. We estimate that just over half of the heritability of penicillin MIC is explained by a known drug-resistance region, which also contributes around a quarter of the heritability of ceftriaxone MIC. For carriage duration, a within-host survival phenotype, no reliable associations were found, but we observed moderate heritability and prediction accuracy, indicating a polygenic trait. While generating important new results for S. pneumoniae, we have critically assessed existing methods and introduced innovations that will be useful for future large-scale population genomics studies to help decipher the genetic architecture of bacterial traits.
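
As a rough illustration of what frequency-based allele coding could look like, the sketch below assumes that the most common allele at a site is coded 0 and every other allele, including a gap, is coded 1. The abstract does not spell out the exact scheme, so this rule is an assumption, and the function name `frequency_code` and the example data are hypothetical.

```python
from collections import Counter


def frequency_code(alleles):
    """Code the most common allele at a site as 0 and all others as 1.

    Assumption: this is one plausible reading of "frequency-based allele
    coding"; it is not taken from the paper's methods. Ties are broken in
    favour of the allele seen first.
    """
    counts = Counter(alleles)
    major, _ = counts.most_common(1)[0]
    return [0 if allele == major else 1 for allele in alleles]


# Hypothetical site observed in six isolates; "-" marks a gap (deletion).
site = ["A", "A", "C", "A", "-", "G"]
print(frequency_code(site))
```

Under this assumed scheme, a multi-allelic site collapses to a single binary predictor per isolate, a form that can be passed to regression-based methods such as linear mixed models or elastic net.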

