Multi-platform discovery of haplotype-resolved structural variation in human genomes
By
Mark Chaisson,
Ashley D. Sanders,
Xuefang Zhao,
Ankit Malhotra,
David Porubsky,
Tobias Rausch,
Eugene J Gardner,
Oscar Rodriguez,
Li Guo,
Ryan L. Collins,
Xian Fan,
Jia Wen,
Robert E Handsaker,
Susan Fairley,
Zev N Kronenberg,
Xiangmeng Kong,
Fereydoun Hormozdiari,
Dillon Lee,
Aaron M. Wenger,
Alex Hastie,
Danny Antaki,
Peter Audano,
Harrison Brand,
Stuart Cantsilieris,
Han Cao,
Eliza Cerveira,
Chong Chen,
Xintong Chen,
Chen-Shan Chin,
Zechen Chong,
Nelson T. Chuang,
Christine C. Lambert,
Deanna M. Church,
Laura Clarke,
Andrew Farrell,
Joey Flores,
Timur Galeev,
David Gorkin,
Madhusudan Gujral,
Victor Guryev,
William Haynes Heaton,
Jonas Korlach,
Sushant Kumar,
Jee Young Kwon,
Jong Eun Lee,
Joyce Lee,
Wan-Ping Lee,
Sau Peng Lee,
Shantao Li,
Patrick Marks,
Karine Viaud-Martinez,
Sascha Meiers,
Katherine M. Munson,
Fabio Navarro,
Bradley J Nelson,
Conor Nodzak,
Amina Noor,
Sofia Kyriazopoulou-Panagiotopoulou,
Andy Pang,
Yunjiang Qiu,
Gabriel Rosanio,
Mallory Ryan,
Adrian Stütz,
Diana C. J. Spierings,
Alistair Ward,
AnneMarie E. Welch,
Ming Xiao,
Wei Xu,
Chengsheng Zhang,
Qihui Zhu,
Xiangqun Zheng-Bradley,
Ernesto Lowy,
Sergei Yakneen,
Steven McCarroll,
Goo Jun,
Li Ding,
Chong Lek Koh,
Bing Ren,
Paul Flicek,
Ken Chen,
Mark Gerstein,
Pui-Yan Kwok,
Peter M. Lansdorp,
Gabor Marth,
Jonathan Sebat,
Xinghua Shi,
Ali Bashir,
Kai Ye,
Scott E. Devine,
Michael Talkowski,
Ryan E. Mills,
Tobias Marschall,
Jan Korbel,
Evan E Eichler,
Charles Lee
Posted 23 Sep 2017
bioRxiv DOI: 10.1101/193144
(published DOI: 10.1038/s41467-018-08148-z)
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, and strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent-child trios and define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per human genome. We also discover 156 inversions per genome, most of which previously escaped detection. Fifty-eight of the inversions we discovered intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The method and the dataset serve as a gold standard for the scientific community, and we make specific recommendations for maximizing structural variation sensitivity for future large-scale genome sequencing studies.
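The abstract separates indels from SVs at a 50 bp length threshold. The sketch below shows that size rule only; it is not the authors' pipeline. The function names and the example lengths are hypothetical, and it assumes each variant has a single insertion or deletion length, which does not cover balanced events such as the inversions reported separately above.

```python
from collections import Counter

# Threshold taken from the abstract: indels are < 50 bp, SVs are >= 50 bp.
SV_MIN_LENGTH_BP = 50


def classify_by_length(length_bp: int) -> str:
    """Return 'indel' or 'SV' for a variant of the given length in base pairs.

    Hypothetical helper for illustration; it applies the size rule only.
    """
    if length_bp < 1:
        raise ValueError("variant length must be a positive number of base pairs")
    return "SV" if length_bp >= SV_MIN_LENGTH_BP else "indel"


def tally(lengths_bp):
    """Count how many variant lengths fall into each size class."""
    return Counter(classify_by_length(n) for n in lengths_bp)


# Made-up lengths for illustration; these are not data from the study.
example_lengths = [1, 12, 49, 50, 300, 12000]
counts = tally(example_lengths)
print(counts["indel"], counts["SV"])
```

Keeping the threshold in one named constant makes the boundary case explicit: a 49 bp event counts as an indel and a 50 bp event counts as an SV.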
Download data
- Downloaded 8,569 times
- Download rankings, all-time:
  - Site-wide: 1,549
  - In genomics: 104
- Year to date:
  - Site-wide: 67,057
- Since beginning of last month:
  - Site-wide: 66,925