Semi-quantitative characterisation of mixed pollen samples using MinION sequencing and Reverse Metagenomics (RevMet)
Lynn V Dicks,
Matthew D Clark,
Richard G. Davies,
Richard M Leggett,
Douglas W. Yu
Posted 15 Feb 2019
bioRxiv DOI: 10.1101/551960 (published DOI: 10.1111/2041-210X.13265)
The ability to identify and quantify the constituent plant species that make up a mixed-species sample of pollen has important applications in ecology, conservation, and agriculture. Recently, metabarcoding protocols have been developed for pollen that can identify constituent plant species, but there are strong reasons to doubt that metabarcoding can accurately quantify their relative abundances. A PCR-free, shotgun metagenomics approach has greater potential for accurately quantifying species' relative abundances, but applying metagenomics to eukaryotes is challenging because few reference genomes are available. We have developed a pipeline, RevMet (Reverse Metagenomics), that allows reliable and semi-quantitative characterisation of the species composition of mixed-species eukaryote samples, such as bee-collected pollen, without requiring reference genomes. Instead, reference species are represented only by 'genome skims': low-cost, low-coverage, short-read sequence datasets. The skims are mapped to individual long reads sequenced from mixed-species samples using the MinION, a portable nanopore sequencing device, and each long read is uniquely assigned to a plant species. We genome-skimmed 49 wild UK plant species, validated our pipeline with mock DNA mixtures of known composition, and then applied RevMet to pollen loads collected from wild bees. We demonstrate that RevMet can identify plant species present in mixed-species samples at proportions of DNA >1%, with few false positives and false negatives, and reliably differentiate species represented by high versus low amounts of DNA in a sample. The RevMet pipeline could readily be adapted to generate semi-quantitative datasets for a wide range of mixed eukaryote samples, which could include characterising diets, quantifying allergenic pollen from air samples, quantifying soil fauna, and identifying the compositions of algal and diatom communities.
Our per-sample costs were GBP 90 per genome skim and GBP 60 per pollen sample, and new versions of sequencers available now will further reduce these costs.
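The read-assignment idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the mapping step has already produced, for every long read, the fraction of that read covered by each species' skim reads. The function names, the `min_cov` cut-off of 0.15, and the use of read counts as a proxy for DNA proportion are all hypothetical choices made for illustration; only the 1% reporting threshold echoes a figure stated in the abstract.

```python
from collections import Counter


def assign_reads(coverage, min_cov=0.15):
    """Assign each long read to at most one species.

    coverage: {read_id: {species: fraction of the read covered by that
              species' skim reads}} (assumed input format).
    min_cov:  hypothetical minimum coverage needed to make an assignment.
    """
    assignments = {}
    for read_id, per_species in coverage.items():
        if not per_species:
            continue
        # Rank candidate species by how much of the read their skim covers.
        ranked = sorted(per_species.items(), key=lambda kv: kv[1], reverse=True)
        best_species, best_cov = ranked[0]
        if best_cov < min_cov:
            continue  # too little support for any species
        if len(ranked) > 1 and ranked[1][1] == best_cov:
            continue  # tie between species: leave the read unassigned
        assignments[read_id] = best_species
    return assignments


def composition(assignments, min_share=0.01):
    """Share of assigned reads per species, keeping shares above min_share."""
    counts = Counter(assignments.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sp: n / total for sp, n in counts.items() if n / total > min_share}
```

A small usage example with made-up coverage values: passing `{"r1": {"A": 0.6, "B": 0.1}, "r2": {"A": 0.05}, "r3": {"A": 0.3, "B": 0.3}, "r4": {"B": 0.5}}` to `assign_reads` assigns `r1` to species A and `r4` to species B, while `r2` (low coverage) and `r3` (tie) stay unassigned. Counting reads rather than bases is a simplification; a real analysis might weight each read by its length.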