Rapid Automated Large Structural Variation Detection in a Diploid Genome by NanoChannel Based Next-Generation Mapping

By Alex R. Hastie, Ernest T Lam, Andy Wing Chun Pang, Xinyue Zhang, Warren Andrews, Joyce Lee, Tiffany Y Liang, Jian Wang, Xiang Zhou, Zhanyang Zhu, Thomas Anantharaman, Željko Džakula, Sven Bocklandt, Urvashi Surti, Michael Saghbini, Michael D. Austin, Mark Borodkin, R. Erik Holmlin, Han Cao

Posted 01 Feb 2017
bioRxiv DOI: 10.1101/102764

The human genome is diploid, with one haploid genome inherited from the maternal and one from the paternal lineage. Within each haploid genome, large structural variants such as deletions, duplications, inversions, and translocations are extensively present, and many are known to affect biological functions and cause disease. The ultimate goal is to resolve these large, complex structural variants (SVs) and place them in the correct haploid genome with the correct location, orientation, and copy number. Current methods such as karyotyping, chromosomal microarray (CMA), PCR-based tests, and next-generation sequencing fail to reach this goal because of limited resolution, low throughput, or short read length. Bionano Genomics' next-generation mapping (NGM) offers a high-throughput, genome-wide method able to detect SVs of one kilobase pair (kbp) and larger. By imaging extremely long genomic molecules of up to megabases in size, the structure and copy number of complex regions of the genome, including interspersed and long tandem repeats, can be elucidated in their native form without inference. Here we tested Bionano's high-sensitivity SV discovery algorithm, Bionano Solve 3.0, on in silico generated diploid genomes with artificially incorporated SVs based on the reference genome, hg19, achieving over 90% overall detection sensitivity for heterozygous SVs larger than 1 kbp. Next, to benchmark large SV detection sensitivity and accuracy on real biological data, we used Bionano NGM to map two naturally occurring hydatidiform mole cell lines, CHM1 and CHM13, each containing a different duplicated haploid genome. By de novo assembling each of the two moles' genomes separately and then assembling a mixture of CHM1 and CHM13 data, we were able to measure heterozygous SV sensitivity by comparing SVs called in the mixture assembly against those called in the individual assemblies.
We called 1999 unique SVs (> 1.5 kbp) in the pseudo-diploid assembly and established 87.4% sensitivity for detection of heterozygous SVs and 99.2% sensitivity for homozygous SVs. In comparison, a recent SV study on the same CHM1/CHM13 samples using long-read NGS alone showed 54% sensitivity for detection of heterozygous SVs and 77.9% for homozygous SVs larger than 1.5 kbp. We also compared an SV call set of the diploid cell line NA12878 with the results of an earlier mapping study (Mak AC, 2016) and found concordance with 89% of the SVs detected in the previous study; in addition, we detected 2599 novel SVs. Finally, two pathogenic SVs were found in cell lines from individuals with developmental disorders. De novo comprehensive SV discovery by Bionano NGM is shown to be a fast, inexpensive, and robust method, now with an automated informatics workflow.
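
The pseudo-diploid benchmark described above can be illustrated with a minimal Python sketch. It assumes that an SV called in only one of the two mole assemblies should appear as heterozygous in the CHM1/CHM13 mixture, and that an SV called in both should appear as homozygous. The `SV` record, the matching rule (same chromosome and type, breakpoints within a tolerance), the 10 kbp tolerance, and the toy coordinates are all hypothetical choices made for illustration; they are not the comparison criteria used by Bionano Solve or by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a Bionano Solve data structure)."""
    chrom: str
    start: int
    end: int
    kind: str  # e.g. "deletion" or "insertion"


def same_sv(a, b, tol=10_000):
    """Assumed matching rule: same chromosome and type, breakpoints within tol bp."""
    return (a.chrom == b.chrom and a.kind == b.kind
            and abs(a.start - b.start) <= tol
            and abs(a.end - b.end) <= tol)


def found_in(sv, calls, tol=10_000):
    """True if any call in `calls` matches `sv` under the assumed rule."""
    return any(same_sv(sv, c, tol) for c in calls)


def split_by_zygosity(chm1, chm13, tol=10_000):
    """SVs unique to one haploid set are expected to be heterozygous in the
    mixture; SVs shared by both are expected to be homozygous."""
    het = [sv for sv in chm1 if not found_in(sv, chm13, tol)]
    het += [sv for sv in chm13 if not found_in(sv, chm1, tol)]
    hom = [sv for sv in chm1 if found_in(sv, chm13, tol)]
    return het, hom


def sensitivity(expected, observed, tol=10_000):
    """Fraction of expected SVs recovered in the observed call set."""
    if not expected:
        return float("nan")
    hits = sum(found_in(sv, observed, tol) for sv in expected)
    return hits / len(expected)


# Toy call sets with made-up coordinates, for illustration only.
chm1 = [SV("chr1", 100_000, 105_000, "deletion"),
        SV("chr2", 500_000, 502_000, "insertion")]
chm13 = [SV("chr1", 100_500, 105_200, "deletion"),
         SV("chr3", 900_000, 910_000, "deletion")]
mixture = [SV("chr1", 100_200, 105_100, "deletion"),
           SV("chr3", 900_100, 910_050, "deletion")]

het, hom = split_by_zygosity(chm1, chm13)
het_sens = sensitivity(het, mixture)
hom_sens = sensitivity(hom, mixture)
print(het_sens, hom_sens)  # prints: 0.5 1.0
```

In this toy case the mixture recovers one of the two SVs unique to a single mole and the one SV shared by both. A real comparison would also need to handle size differences, overlapping calls, and SV types such as inversions and translocations, which this sketch ignores.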
