Bacterial symbionts that manipulate the reproduction of their hosts are important factors in invertebrate ecology and evolution. Studying the genomic and phenotypic diversity of reproductive manipulators can improve efforts to control infectious diseases and contribute to our understanding of host-symbiont evolution. Despite the vast genomic and phenotypic diversity of reproductive manipulators, only a handful of strains are used as biological control agents because little is known about the broad-scale infection frequencies and densities of these bacteria in nature. Here we develop a data-mining approach to quantify the number of arthropod and nematode host species infected with Wolbachia and other reproductive manipulators such as Rickettsia and Spiroplasma. Across the entire Sequence Read Archive (SRA) database, we found that reproductive manipulators infected 2,083 arthropod and 119 nematode samples, representing 240 and 8 species, respectively. After accounting for sampling and infection frequency differences among species, we estimated that Wolbachia infects approximately 44% of all arthropod species and 34% of all nematode species. In contrast, we estimated that other reproductive manipulators infect 1-8% of arthropod and nematode species. Next, we explored another important biological parameter: the relative bacterial density, or titer, within hosts. We found that titer varies widely within and between arthropod species, and that host species explains approximately 36% of the variation in titer across our dataset. This suggests that bacterial strain and/or host species plays a role in shaping bacterial densities within and between host species. By leveraging the model system Drosophila melanogaster, we also found a number of host SNPs associated with titer in genes potentially relevant to host interactions with Wolbachia, suggesting bacterially induced host genome evolution.
Our study demonstrates that data mining is a powerful tool to understand host-symbiont co-evolution and opens an array of previously inaccessible questions for further analysis.

### Competing Interest Statement

The authors have declared no competing interest.
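The abstract reports that host species explains approximately 36% of the variation in titer, but it does not say which statistical model produced that figure. As a minimal sketch only, the following assumes a simple one-way partition of variance (between-group sum of squares divided by total sum of squares). The function name `variance_explained` and the toy titer values are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def variance_explained(groups, values):
    """Fraction of total variance attributable to a grouping factor.

    Computes SS_between / SS_total for a one-way layout. This is an
    assumed, illustrative statistic; the abstract does not state the
    model the authors actually used.
    """
    if len(groups) != len(values) or not values:
        raise ValueError("groups and values must be non-empty and equal in length")

    grand_mean = sum(values) / len(values)

    # Collect the measurements belonging to each group (e.g. host species).
    by_group = defaultdict(list)
    for group, value in zip(groups, values):
        by_group[group].append(value)

    # Total spread of all measurements around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("all values are identical; variance is zero")

    # Spread of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    return ss_between / ss_total


if __name__ == "__main__":
    # Hypothetical relative titers for two made-up host species.
    species = ["sp_A", "sp_A", "sp_A", "sp_B", "sp_B", "sp_B"]
    titers = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0]
    print(f"Share of titer variance explained by species: "
          f"{variance_explained(species, titers):.2f}")
```

A value near 1 would mean that titer differs mostly between species, while a value near 0 would mean that most variation occurs within species. A mixed model with species as a random effect would be a common alternative for unbalanced sampling.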