Telomere-to-telomere assembly of a complete human X chromosome
Karen H. Miga,
Mitchell R. Vollger,
Glennis A. Logsdon,
Valerie A Schneider,
Gerard G Bouffard,
Alexander M Chang,
Nancy F Hansen,
Anthony D Schmitt,
Megan Y. Dennis,
Daniela C Soto,
Nicholas J. Loman,
Rosa Ana Risques,
Tina A. Graves Lindsay,
James C. Mullikin,
Pavel A. Pevzner,
Jennifer L. Gerton,
Beth A Sullivan,
E. E. Eichler,
Adam M. Phillippy
Posted 16 Aug 2019
bioRxiv DOI: 10.1101/735928 (published DOI: 10.1038/s41586-020-2547-7)
After nearly two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no one chromosome has been finished end to end, and hundreds of unresolved gaps persist. The remaining gaps include ribosomal rDNA arrays, large near-identical segmental duplications, and satellite DNA arrays. These regions harbor largely unexplored variation of unknown consequence, and their absence from the current reference genome can lead to experimental artifacts and hide true variants when re-sequencing additional human genomes. Here we present a de novo human genome assembly that surpasses the continuity of GRCh38, along with the first gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome, we reconstructed the ∼2.8 megabase centromeric satellite DNA array and closed all 29 remaining gaps in the current reference, including new sequence from the human pseudoautosomal regions and cancer-testis ampliconic gene families (CT-X and GAGE). This complete chromosome X, combined with the ultra-long nanopore data, also allowed us to map methylation patterns across complex tandem repeats and satellite arrays for the first time. These results demonstrate that finishing the human genome is now within reach and will enable ongoing efforts to complete the remaining human chromosomes.
- Downloaded 5,754 times
- Download rankings, all-time:
  - Site-wide: 631 out of 94,912
  - In bioinformatics: 99 out of 8,837
- Year to date:
  - Site-wide: 586 out of 94,912
- Since beginning of last month:
  - Site-wide: 155 out of 94,912