
An evaluation of pool-sequencing transcriptome-based exon capture for population genomics in non-model species

By Emeline Deleury, Thomas Guillemaud, Aurélie Blin, Eric Lombaert

Posted 20 Mar 2019
bioRxiv DOI: 10.1101/583534

Exon capture coupled to high-throughput sequencing constitutes a cost-effective technical solution for addressing specific questions in evolutionary biology by focusing on expressed regions of the genome preferentially targeted by selection. Transcriptome-based capture, a process that can be used to capture the exons of non-model species, is used in phylogenomics. However, its use in population genomics remains rare due to the high costs of sequencing large numbers of indexed individuals across multiple populations. We evaluated the feasibility of combining transcriptome-based capture and the pooling of tissues from numerous individuals for DNA extraction as a cost-effective, generic and robust approach to estimating the variant allele frequencies of any species at the population level. We designed capture probes for ∼5 Mb of chosen de novo transcripts from the Asian ladybird Harmonia axyridis (5,717 transcripts). We called ∼300,000 bi-allelic SNPs for a pool of 36 non-indexed individuals. Capture efficiency was high, and pool-seq was as effective and accurate as individual-seq for detecting variants and estimating allele frequencies. Finally, we also evaluated an approach for simplifying bioinformatic analyses by mapping genomic reads directly to targeted transcript sequences to obtain coding variants. This approach is effective and does not affect the estimation of SNP allele frequencies, except for a small bias close to some exon ends. We demonstrate that this approach can also be used to predict the intron-exon boundaries of targeted de novo transcripts, making it possible to eliminate genotyping biases near exon ends.

Competing Interest Statement: The authors have declared no competing interest.
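As an illustration of what estimating allele frequencies from a pool involves, the following minimal Python sketch computes a frequency and an approximate standard error from read counts at one bi-allelic SNP. It is not the authors' pipeline. The function name, its parameters and the example read counts are hypothetical, and the variance formula assumes that every individual contributes equally to the pool, that individuals are diploid, and that reads are drawn independently from the pooled DNA.

```python
import math


def pool_allele_frequency(alt_reads, total_reads, n_individuals, ploidy=2):
    """Estimate an allele frequency and its standard error from pooled reads.

    Assumptions (illustrative, not taken from the paper): equal DNA
    contribution per individual, independent reads, no sequencing error.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")

    # Point estimate: fraction of reads carrying the alternative allele.
    p_hat = alt_reads / total_reads

    # Two sampling stages: chromosomes drawn into the pool, then reads
    # drawn from the pool. Their variances combine as below.
    n_chrom = ploidy * n_individuals
    variance = p_hat * (1 - p_hat) * (
        1 / n_chrom + (1 - 1 / n_chrom) / total_reads
    )
    return p_hat, math.sqrt(variance)


# Hypothetical read counts for a pool of 36 individuals, as in the abstract.
freq, se = pool_allele_frequency(alt_reads=30, total_reads=120, n_individuals=36)
print(f"frequency = {freq:.3f}, standard error = {se:.3f}")
```

Under these assumptions the precision is limited by both the number of pooled chromosomes and the read depth, so raising coverage well beyond the number of chromosomes in the pool brings little further gain.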

Download data

  • Downloaded 1,401 times
  • Download rankings, all-time:
    • Site-wide: 22,705
    • In genomics: 1,936
  • Year to date:
    • Site-wide: 78,203
  • Since beginning of last month:
    • Site-wide: 60,669
