Highly-accurate long-read sequencing improves variant detection and assembly of a human genome
Aaron M. Wenger,
William J. Rowell,
Richard J. Hall,
Gregory T. Concepcion,
Nathanael D. Olson,
Adam M. Phillippy,
Michael C. Schatz,
Mark A. DePristo,
Fritz J. Sedlazeck,
Justin M. Zook,
David R. Rank,
Michael W. Hunkapiller
Posted 13 Jan 2019
bioRxiv DOI: 10.1101/519025 (published DOI: 10.1038/s41587-019-0217-9)
The major DNA sequencing technologies in use today produce either highly accurate short reads or noisy long reads. We developed a protocol based on single-molecule, circular consensus sequencing (CCS) to generate highly accurate (99.8%) long reads averaging 13.5 kb and applied it to sequence the well-characterized human HG002/NA24385 genome. We optimized existing tools to comprehensively detect variants, achieving precision and recall above 99.91% for SNVs, 95.98% for indels, and 95.99% for structural variants. We estimate that 2,434 discordances are correctable mistakes in the high-quality Genome in a Bottle benchmark. Nearly all (99.64%) variants are phased into haplotypes, which further improves variant detection. De novo assembly produces a highly contiguous and accurate genome with contig N50 above 15 Mb and concordance of 99.998%. CCS reads match short reads for small variant detection, while enabling structural variant detection and de novo assembly at similar contiguity and markedly higher concordance than noisy long reads.
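For readers unfamiliar with the contig N50 figure quoted in the abstract, the following is a minimal sketch of how that statistic is conventionally computed: the length of the contig at which the cumulative sum of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy contig lengths are assumptions chosen for illustration; they are not taken from the paper or its pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Illustrative helper (hypothetical name, not from the paper).
    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first contig that pushes coverage to >= 50%.
        if 2 * running >= total:
            return length


# Toy example with made-up contig lengths (in bases).
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
```

With the toy lengths above the total is 20, and the two longest contigs (6 and 5) already cover 11 bases, so the N50 is 5. A real assembly would supply the lengths of all contigs, for example as read from a FASTA index.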
- Downloaded 11,849 times
- Download rankings, all-time:
  - Site-wide: 181 out of 89,763
  - In genomics: 31 out of 5,711
- Year to date:
  - Site-wide: 907 out of 89,763
- Since beginning of last month:
  - Site-wide: 1,618 out of 89,763