Rxivist logo

Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 73,536 bioRxiv papers from 320,015 authors.

The prevailing genome assembly paradigm is to produce consensus sequences that "collapse" parental haplotypes into a consensus sequence. Here, we leverage the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing (Strand-seq) and combine them with high-fidelity (HiFi) long sequencing reads, in a novel reference-free workflow for diploid de novo genome assembly. Employing this strategy, we produce completely phased de novo genome assemblies separately for each haplotype of a single individual of Puerto Rican origin (HG00733) in the absence of parental data. The assemblies are accurate (QV > 40), highly contiguous (contig N50 > 25 Mbp) with low switch error rates (0.4%) providing fully phased single-nucleotide variants (SNVs), indels, and structural variants (SVs). A comparison of Oxford Nanopore and PacBio phased assemblies identifies 150 regions that are preferential sites of contig breaks irrespective of sequencing technology or phasing algorithms.

Download data

  • Downloaded 1,265 times
  • Download rankings, all-time:
    • Site-wide: 5,893 out of 73,536
    • In bioinformatics: 1,117 out of 7,162
  • Year to date:
    • Site-wide: 859 out of 73,536
  • Since beginning of last month:
    • Site-wide: 859 out of 73,536

Altmetric data

Downloads over time

Distribution of downloads per paper, site-wide


Sign up for the Rxivist weekly newsletter! (Click here for more details.)