Aquila: diploid personal genome assembly and comprehensive variant detection based on linked reads

By Xin Zhou, Lu Zhang, Ziming Weng, David L. Dill, Arend Sidow

Posted 05 Jun 2019
bioRxiv DOI: 10.1101/660605

Variant discovery in personal whole-genome sequence data is critical for uncovering the genetic contributions to health and disease. We introduce a new approach, Aquila, that uses linked-read data to generate a high-quality diploid genome assembly, from which it then comprehensively detects and phases personal genetic variation. Assemblies cover >95% of the human reference genome, with over 98% in a diploid state. Thus, the assemblies support detection and accurate genotyping of the most prevalent types of human genetic variation, including single nucleotide polymorphisms (SNPs), small insertions and deletions (small indels), and structural variants (SVs), in all but the most difficult regions. All heterozygous variants are phased in blocks that can approach arm-level length. The final output of Aquila is a diploid and phased personal genome sequence, and a phased VCF file that also contains homozygous and a few unphased heterozygous variants. Aquila represents a cost-effective evolution of whole-genome reconstruction that can be applied to cohorts for variation discovery or association studies, or to single individuals with rare phenotypes that could be caused by SVs or compound heterozygosity.
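
The abstract describes a VCF output that mixes phased heterozygous, unphased heterozygous, and homozygous calls. The following is a minimal sketch of how those categories could be tallied from a single-sample VCF, relying only on the standard VCF convention that `|` separates phased alleles and `/` separates unphased ones in the GT field. The function names `classify_genotype` and `summarize_vcf_lines` are hypothetical, the sketch is not part of Aquila, and it makes no assumption about Aquila's actual file names or any extra fields it may write.

```python
from collections import Counter


def classify_genotype(gt: str) -> str:
    """Classify a diploid VCF GT string such as '0|1', '1/1', or '0/1'."""
    phased = "|" in gt
    alleles = gt.replace("|", "/").split("/")
    # Anything that is not a fully called diploid genotype is set aside.
    if len(alleles) != 2 or "." in alleles:
        return "other"
    if alleles[0] == alleles[1]:
        return "hom_ref" if alleles[0] == "0" else "hom_alt"
    return "het_phased" if phased else "het_unphased"


def summarize_vcf_lines(lines):
    """Count genotype classes for the first sample column of VCF text lines."""
    counts = Counter()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns present
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue
        sample_values = fields[9].split(":")
        gt = sample_values[format_keys.index("GT")]
        counts[classify_genotype(gt)] += 1
    return counts


if __name__ == "__main__":
    # Toy records invented for illustration; they are not Aquila output.
    toy_records = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
        "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1",
        "chr1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t1/1",
        "chr1\t300\t.\tG\tA\t.\tPASS\t.\tGT\t0/1",
    ]
    print(summarize_vcf_lines(toy_records))
```

To run this on a real file, the lines of an uncompressed VCF can be passed in directly, for example `summarize_vcf_lines(open(path))`, where `path` is whatever file the user has at hand.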

Download data

  • Downloaded 690 times
  • Download rankings, all-time:
    • Site-wide: 40,009
    • In genomics: 3,283
  • Year to date:
    • Site-wide: 103,154
  • Since beginning of last month:
    • Site-wide: 134,891
