Detection of cross-contamination and strong mitonuclear discordance in two species groups of sawfly genus Empria (Hymenoptera, Tenthredinidae)
Strong mitonuclear discordance has been observed in several sawfly taxa: nuclear genes support species assignments based on morphology, whereas the barcode region of the mitochondrial COI gene suggests different relationships. Because previous studies were based on only a few nuclear genes, the causes and the degree of this discordance remain ambiguous. Here, we combine genomic-scale ddRAD data with Sanger sequencing of mitochondrial COI and two to three nuclear protein-coding genes to investigate species limits and mitonuclear discordance in two closely related species groups within the sawfly genus Empria. As found previously with nuclear ITS and mitochondrial COI sequences, species are in most cases supported as monophyletic by the previous and the new nuclear data reported here, but not by mitochondrial COI. This mitonuclear discordance can be explained by occasional mitochondrial introgression with little or no nuclear gene flow, a pattern that might be common in haplodiploid taxa with slowly evolving mitochondrial genomes. Some species in the E. immersa group are not recovered as monophyletic by nuclear data either, but this could partly reflect unresolved taxonomy. Preliminary analyses of the ddRAD data did not recover monophyly of E. japonica within the E. longicornis group, although three Sanger-sequenced nuclear genes strongly supported it. Closer examination of the data and additional Sanger sequencing suggested that both E. japonica specimens were substantially cross-contaminated (possibly 10-20% of recovered loci). One possible cause is specimen identification tag jumps during sequencing library preparation of pooled specimens, which previous studies have shown to affect up to 2.5% of the sequenced reads.
We provide an R script to examine patterns of identical loci among the specimens. Based on counts of identical sequences between the immersa and longicornis groups, which are well separated from each other and probably do not hybridise, we estimate that the cross-contamination rate is not unusually high for our ddRAD dataset as a whole. The high rate of cross-contamination in both E. japonica specimens might be explained by the small number of loci recovered from them (~1000) compared to most other specimens (>10 000 in some cases) because of poor sequencing results. We caution against drawing unexpected biological conclusions when closely related specimens are pooled before sequencing and tagged only at one end of the molecule, or at both ends using unique combinations of a limited number of tags (fewer than the number of specimens).
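The idea of counting identical loci between two well-separated groups can be illustrated with a minimal sketch. This is not the authors' R script. It is a Python illustration under stated assumptions: each specimen is represented as a mapping from a locus identifier to its consensus sequence, and every specimen name, locus name, sequence, and function name below is hypothetical toy data. For each between-group pair of specimens, it reports the fraction of shared loci whose sequences are identical, which serves as a rough proxy for cross-contamination when the groups are not expected to share identical alleles.

```python
def identical_fraction(loci_a, loci_b):
    """Fraction of loci recovered in both specimens that have identical sequences."""
    common = set(loci_a) & set(loci_b)
    if not common:
        return 0.0
    identical = sum(1 for locus in common if loci_a[locus] == loci_b[locus])
    return identical / len(common)


def between_group_identity(specimens, group_of, group_x, group_y):
    """Compute the identical-locus fraction for every pair spanning two groups."""
    in_x = [name for name in specimens if group_of[name] == group_x]
    in_y = [name for name in specimens if group_of[name] == group_y]
    return {
        (a, b): identical_fraction(specimens[a], specimens[b])
        for a in in_x
        for b in in_y
    }


# Hypothetical toy data: locus id -> consensus sequence, per specimen.
specimens = {
    "immersa_1": {"loc1": "ACGT", "loc2": "GGCA", "loc3": "TTAG"},
    "longicornis_1": {"loc1": "ACGA", "loc2": "GGCA", "loc3": "TTAC"},
    "longicornis_2": {"loc1": "ACGA", "loc3": "TTAC"},
}
group_of = {
    "immersa_1": "immersa",
    "longicornis_1": "longicornis",
    "longicornis_2": "longicornis",
}

pair_fractions = between_group_identity(
    specimens, group_of, "immersa", "longicornis"
)
for pair, fraction in sorted(pair_fractions.items()):
    print(pair, round(fraction, 3))
```

In real data some loci are conserved enough to be identical between groups without any contamination, so a fraction like this is only informative relative to the values seen for the other specimen pairs in the same dataset.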