Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 73,413 bioRxiv papers from 319,549 authors.

Species abundance information improves sequence taxonomy classification accuracy

By Benjamin D. Kaehler, Nicholas A. Bokulich, Daniel McDonald, Rob Knight, J. Gregory Caporaso, Gavin A. Huttley

Posted 03 Sep 2018
bioRxiv DOI: 10.1101/406611 (published DOI: 10.1038/s41467-019-12669-6)

Popular naive Bayes taxonomic classifiers for amplicon sequences assume that all species in the reference database are equally likely to be observed. We demonstrate that classification accuracy degrades linearly with the degree to which that assumption is violated, and in practice it is always violated. By incorporating environment-specific taxonomic abundance information, we demonstrate that species-level resolution is attainable.
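The effect of the uniform-prior assumption can be illustrated with a toy example. The sketch below is not the authors' implementation; it is a minimal multinomial naive Bayes classifier over k-mers, written from scratch, in which the taxon names, reference sequences, query read, and abundance values are all hypothetical. The query read covers only a region shared by both reference sequences, so the likelihoods are identical and only the prior can separate the two species.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Return the list of overlapping k-mers in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(reference, k=3):
    """Count k-mers per taxon from a dict of {taxon: [sequences]}."""
    counts = {taxon: Counter() for taxon in reference}
    for taxon, seqs in reference.items():
        for seq in seqs:
            counts[taxon].update(kmers(seq, k))
    vocab = set().union(*counts.values())
    return counts, vocab


def classify(query, counts, vocab, priors, k=3, alpha=1.0):
    """Return (best taxon, log-posterior scores) with Laplace smoothing."""
    scores = {}
    for taxon, c in counts.items():
        total = sum(c.values()) + alpha * len(vocab)
        log_post = math.log(priors[taxon])  # the class prior enters here
        for km in kmers(query, k):
            log_post += math.log((c[km] + alpha) / total)
        scores[taxon] = log_post
    return max(scores, key=scores.get), scores


# Hypothetical reference database: the two species differ only at the 3' end.
reference = {
    "species_A": ["ACGTACGTTT"],
    "species_B": ["ACGTACGTGG"],
}
counts, vocab = train(reference)

# Hypothetical read that covers only the region shared by both species.
query = "ACGTAC"

# Assumption 1: every species is equally likely (uniform prior).
uniform_priors = {"species_A": 0.5, "species_B": 0.5}
uniform_label, uniform_scores = classify(query, counts, vocab, uniform_priors)

# Assumption 2: made-up environment-specific abundances used as the prior.
weighted_priors = {"species_A": 0.05, "species_B": 0.95}
weighted_label, weighted_scores = classify(query, counts, vocab, weighted_priors)
```

Under the uniform prior the two log-posterior scores are equal, so the returned label is arbitrary. With the abundance-weighted prior the same read is assigned to `species_B`. This only shows the mechanism; it says nothing about the size of the accuracy gain reported in the paper.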

Download data

  • Downloaded 819 times
  • Download rankings, all-time:
    • Site-wide: 11,903 out of 73,307
    • In bioinformatics: 1,972 out of 7,149
  • Year to date:
    • Site-wide: 14,676 out of 73,307
  • Since beginning of last month:
    • Site-wide: 14,676 out of 73,307

