AFLAP: Assembly-Free Linkage Analysis Pipeline using k-mers from whole genome sequencing data

By Kyle Fletcher, Lin Zhang, Juliana Gil, Rongkui Han, Keri Cavanaugh, Richard Michelmore

Posted 15 Sep 2020
bioRxiv DOI: 10.1101/2020.09.14.296525

Background: Genetic maps are an important resource for validation of genome assemblies, trait discovery, and breeding. Next generation sequencing has enabled production of high-density genetic maps constructed with 10,000s of markers. Most current approaches require a genome assembly to identify markers. Our Assembly Free Linkage Analysis Pipeline (AFLAP) removes this requirement by using uniquely segregating k-mers as markers to rapidly construct a genotype table and perform subsequent linkage analysis. This avoids potential biases including preferential read alignment and variant calling.

Results: The performance of AFLAP was determined in simulations and contrasted to a conventional workflow. We tested AFLAP using 100 F2 individuals of Arabidopsis thaliana, sequenced to low coverage. Genetic maps generated using k-mers contained over 130,000 markers that were concordant with the genomic assembly. The utility of AFLAP was then demonstrated by generating an accurate genetic map using genotyping-by-sequencing data of 235 recombinant inbred lines of Lactuca spp. AFLAP was then applied to 83 F1 individuals of the oomycete Bremia lactucae, sequenced to >5x coverage. The genetic map contained over 90,000 markers ordered in 19 large linkage groups. This genetic map was used to fragment, order, orient, and scaffold the genome, resulting in a much-improved reference assembly.

Conclusions: AFLAP can be used to generate high density linkage maps and improve genome assemblies of any organism when a mapping population is available using whole genome sequencing or genotyping-by-sequencing data. Genetic maps produced for B. lactucae were accurately aligned to the genome and guided significant improvements of the reference assembly.

Competing Interest Statement: The authors have declared no competing interest.
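
The idea in the abstract of using k-mers that segregate uniquely from one parent as markers can be illustrated with a toy sketch. This is not AFLAP's implementation: the function names (`canonical`, `kmer_set`, `genotype_table`) are hypothetical, reads are plain strings held in memory, and the coverage thresholds and error filtering that real sequencing data would need are omitted. It only shows how presence or absence of parent-specific k-mers in each progeny can populate a genotype table without any genome assembly.

```python
# Toy illustration of assembly-free k-mer markers (assumed, simplified logic).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_set(reads, k):
    """Collect the canonical k-mers observed in a list of read strings."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                kmers.add(canonical(kmer))
    return kmers


def genotype_table(parent_a_reads, parent_b_reads, progeny_reads, k):
    """Score each progeny for presence (1) or absence (0) of k-mers unique to parent A.

    progeny_reads maps a progeny name to its list of reads.
    Returns the sorted marker list and a dict of name -> list of 0/1 calls.
    """
    parent_a = kmer_set(parent_a_reads, k)
    parent_b = kmer_set(parent_b_reads, k)
    markers = sorted(parent_a - parent_b)  # k-mers seen only in parent A
    table = {}
    for name, reads in progeny_reads.items():
        present = kmer_set(reads, k)
        table[name] = [1 if marker in present else 0 for marker in markers]
    return markers, table
```

Calling `genotype_table` with two parental read sets and a dictionary of progeny reads returns one row of 0/1 calls per progeny, which is the kind of table a linkage analysis step would consume. Canonical k-mers are used so that a read from either strand scores the same marker.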

Download data

  • Downloaded 347 times
  • Download rankings, all-time:
    • Site-wide: 84,955
    • In genetics: 3,741
  • Year to date:
    • Site-wide: 46,365
  • Since beginning of last month:
    • Site-wide: 118,472
