Integrating Hi-C links with assembly graphs for chromosome-scale assembly

By Jay Ghurye, Arang Rhie, Brian P. Walenz, Anthony Schmitt, Siddarth Selvaraj, Mihai Pop, Adam M Phillippy, Sergey Koren

Posted 07 Feb 2018
bioRxiv DOI: 10.1101/261149 (published DOI: 10.1371/journal.pcbi.1007273)

Motivation: Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, few open-source tools are available. Error rates, particularly for inversions and fusions across chromosomes, remain higher than with alternative scaffolding technologies.

Results: We present a novel open-source Hi-C scaffolder that does not require an a priori estimate of chromosome number and minimizes errors by scaffolding with the assistance of an assembly graph. We demonstrate higher accuracy than state-of-the-art methods across a variety of Hi-C library preparations and input assembly sizes.

Availability and Implementation: The Python and C++ code for our method is openly available at https://github.com/machinegun/SALSA.

Download data

  • Downloaded 6,417 times
  • Download rankings, all-time:
    • Site-wide: 496 out of 89,715
    • In bioinformatics: 76 out of 8,461
  • Year to date:
    • Site-wide: 11,283 out of 89,715
  • Since beginning of last month:
    • Site-wide: 17,851 out of 89,715

