Dense And Accurate Whole-Chromosome Haplotyping Of Individual Genomes

By David Porubsky, Shilpa Garg, Ashley D. Sanders, Jan Korbel, Victor Guryev, Peter M. Lansdorp, Tobias Marschall

Posted 10 Apr 2017
bioRxiv DOI: 10.1101/126136 (published DOI: 10.1038/s41467-017-01389-4)

The diploid nature of the genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. Many important biological phenomena, such as compound heterozygosity and epistatic effects between enhancers and target genes, however, can only be studied when haplotype-resolved genomes are available. This lack of haplotype-level analyses can be explained by a dearth of methods to produce dense and accurate chromosome-length haplotypes at reasonable cost. Here we introduce an integrative phasing strategy that combines global but sparse haplotypes obtained from strand-specific single-cell sequencing (Strand-seq) with dense yet local haplotype information available through long-read or linked-read sequencing. Our experiments provide comprehensive guidance on favorable combinations of Strand-seq libraries and sequencing coverages to obtain complete and genome-wide haplotypes of a single individual genome (NA12878) at manageable cost. We were able to reliably assign >95% of alleles to their parental haplotypes using as few as 10 Strand-seq libraries in combination with 10-fold coverage PacBio data or, alternatively, 10X Genomics linked-read sequencing data. We conclude that the combination of Strand-seq with different sequencing technologies represents an attractive solution to chart the unique genetic variation of diploid genomes.

Download data

  • Downloaded 1,040 times
  • Download rankings, all-time:
    • Site-wide: 26,994
    • In bioinformatics: 2,995
  • Year to date:
    • Site-wide: 132,767
  • Since beginning of last month:
    • Site-wide: 130,868
