Significantly improving the quality of genome assemblies through curation

By Kerstin Howe, William Chow, Joanna Collins, Sarah Pelan, Damon-Lee Pointon, Ying Sims, James Torrance, Alan Tracey, Jonathan Wood

Posted 13 Aug 2020
bioRxiv DOI: 10.1101/2020.08.12.247734

Background: Genome sequence assemblies provide the basis for our understanding of biology. Generating error-free assemblies is therefore the ultimate, but sadly still unachieved, goal of a multitude of research projects. Despite the ever-advancing improvements in data generation, assembly algorithms and pipelines, no automated approach has so far reliably generated near error-free genome assemblies for eukaryotes.

Results: Whilst working towards improved data sets and fully automated pipelines, assembly evaluation and curation are actively employed to bridge this shortcoming and significantly reduce the number of assembly errors. In addition to this increase in product value, the insights gained from assembly curation are fed back into the automated assembly strategy and contribute to notable improvements in genome assembly quality.

Conclusions: We describe our tried and tested approach for assembly curation using gEVAL, the genome evaluation browser. We outline the procedures applied to genome curation using gEVAL and also our recommendations for assembly curation in a gEVAL-independent context to facilitate the uptake of genome curation in the wider community.

Competing Interest Statement: The authors have declared no competing interest.

Download data

  • Downloaded 626 times
  • Download rankings, all-time:
    • Site-wide: 57,859
    • In bioinformatics: 5,678
  • Year to date:
    • Site-wide: 103,432
  • Since beginning of last month:
    • Site-wide: 140,583
