
Complete assembly of parental haplotypes with trio binning

By Sergey Koren, Arang Rhie, Brian P. Walenz, Alexander T. Dilthey, Derek M. Bickhart, Sarah B. Kingan, Stefan Hiendleder, John L. Williams, Timothy P. L. Smith, Adam M. Phillippy

Posted 26 Feb 2018
bioRxiv DOI: 10.1101/271486 (published DOI: 10.1038/nbt.4277)

Reference genome projects have historically selected inbred individuals to minimize heterozygosity and simplify assembly. We challenge this dogma and present a new approach designed specifically for heterozygous genomes. "Trio binning" uses short reads from two parental genomes to partition long reads from an offspring into haplotype-specific sets prior to assembly. Each haplotype is then assembled independently, resulting in a complete diploid reconstruction. On a benchmark human trio, this method achieved high accuracy and recovered complex structural variants missed by alternative approaches. To demonstrate its effectiveness on a heterozygous genome, we sequenced an F1 cross between cattle subspecies Bos taurus taurus and Bos taurus indicus, and completely assembled both parental haplotypes with NG50 haplotig sizes >20 Mbp and 99.998% accuracy, surpassing the quality of current cattle reference genomes. We propose trio binning as a new best practice for diploid genome assembly that will enable new studies of haplotype variation and inheritance.
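The partitioning step described in the abstract can be illustrated with a toy sketch. It assumes that reads are classified by k-mers found in only one parent's short reads; the abstract itself does not specify the mechanism, so this is an illustration and not the authors' implementation. The function names (`kmers`, `parent_specific_kmers`, `bin_reads`) and the choice of k are hypothetical, and the sketch ignores reverse complements, sequencing errors, and any normalization a real pipeline would need.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Collect k-mers seen in only one parent's short reads (assumed approach)."""
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    # Shared k-mers carry no haplotype information, so drop them.
    return mat - pat, pat - mat


def bin_reads(offspring_reads, mat_only, pat_only, k):
    """Assign each offspring long read to the parent whose specific k-mers it shares most."""
    bins = {"maternal": [], "paternal": [], "unassigned": []}
    for read in offspring_reads:
        read_kmers = kmers(read, k)
        m = len(read_kmers & mat_only)
        p = len(read_kmers & pat_only)
        if m > p:
            bins["maternal"].append(read)
        elif p > m:
            bins["paternal"].append(read)
        else:
            # No parent-specific k-mers, or a tie: leave the read unassigned.
            bins["unassigned"].append(read)
    return bins


if __name__ == "__main__":
    # Toy data: the parents differ at a single base.
    mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], k=4)
    print(bin_reads(["CGTACGT", "CGTTCGT", "ACGT"], mat_only, pat_only, k=4))
```

Each bin would then be passed to a long-read assembler separately, which corresponds to the "assembled independently" step in the abstract. Reads left unassigned fall in regions where the parents do not differ, and how to handle them is a design choice this sketch leaves open.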

Download data

  • Downloaded 5,832 times
  • Download rankings, all-time:
    • Site-wide: 571 out of 89,828
    • In genomics: 118 out of 5,712
  • Year to date:
    • Site-wide: 6,368 out of 89,828
  • Since beginning of last month:
    • Site-wide: 8,801 out of 89,828


