Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 62,747 bioRxiv papers from 278,434 authors.

LinkedSV: Detection of mosaic structural variants from linked-read exome and genome sequencing data

By Li Fang, Charlly Kao, Michael V Gonzalez, Fernanda A Mafra, Renata Pellegrino da Silva, Mingyao Li, Sören Wenzel, Katharina Wimmer, Hakon Hakonarson, Kai Wang

Posted 06 Sep 2018
bioRxiv DOI: 10.1101/409789

Reliable detection of structural variants (SVs) from short-read sequencing remains challenging, partly due to the presence of repetitive DNA elements that are longer than typical short reads (~100-150 bp). Linked-read sequencing provides long-range information from short-read sequencing data by linking reads originating from the same DNA molecule based on barcoding, and thus has the potential to improve the sensitivity of SV detection and the accuracy of breakpoint identification for certain classes of SVs. We present LinkedSV (https://github.com/WGLab/LinkedSV), a novel SV detection algorithm which combines two types of statistical evidence. Simulation studies demonstrated that LinkedSV outperformed multiple existing computational tools, and that it worked particularly well on exome sequencing data and on SVs with low variant allele frequencies due to somatic mosaicism. We further demonstrated two clinical cases in which LinkedSV successfully identified disease-causal SVs from linked-read exome sequencing data while other computational methods failed, suggesting that a fraction of negative cases from clinical exome sequencing may be due to hidden SVs undetectable by traditional methods. Finally, comparative analysis of a human genome deeply sequenced by PacBio long-read sequencing (103X) and linked-read sequencing (37X) demonstrated unique advantages of linked-read data for identifying large SVs that are missed in high-coverage long-read data. In summary, our results support the use of linked-read sequencing to detect hidden SVs missed by conventional short-read sequencing approaches, which may help resolve negative cases from clinical genome or exome sequencing.
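
The barcode-based linking described in the abstract can be illustrated with a minimal Python sketch. This is not LinkedSV's algorithm, and it does not reproduce the two types of statistical evidence the tool combines. It only shows the general idea of grouping reads that share a barcode and lie close together into one inferred DNA fragment. The `Read` and `Fragment` types, the `infer_fragments` function, and the 50 kb `max_gap` threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    barcode: str  # barcode shared by reads from the same partition
    chrom: str
    pos: int      # leftmost mapped position


class Fragment(NamedTuple):
    barcode: str
    chrom: str
    start: int
    end: int
    n_reads: int


def infer_fragments(reads, max_gap=50_000):
    """Group reads with the same barcode into inferred fragments.

    Reads sharing a barcode and chromosome are merged into one fragment
    as long as consecutive reads are at most `max_gap` bases apart.
    The default threshold is an illustrative assumption, not a value
    taken from the paper.
    """
    positions_by_key = defaultdict(list)
    for read in reads:
        positions_by_key[(read.barcode, read.chrom)].append(read.pos)

    fragments = []
    for (barcode, chrom), positions in positions_by_key.items():
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current fragment, start a new one.
                fragments.append(Fragment(barcode, chrom, start, prev, count))
                start = pos
                count = 0
            prev = pos
            count += 1
        fragments.append(Fragment(barcode, chrom, start, prev, count))
    return fragments


# Toy input: three nearby BC1 reads, one distant BC1 read, one BC2 read.
reads = [
    Read("BC1", "chr1", 1_000),
    Read("BC1", "chr1", 12_000),
    Read("BC1", "chr1", 30_000),
    Read("BC1", "chr1", 900_000),
    Read("BC2", "chr1", 5_000),
]
fragments = sorted(infer_fragments(reads))
for fragment in fragments:
    print(fragment)
```

In this toy input, the three nearby BC1 reads form one fragment, while the distant BC1 read is treated as a separate fragment. Long-range signals of this kind, such as the same barcode appearing at two distant loci, are what give linked reads their potential for SV detection.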

Download data

  • Downloaded 886 times
  • Download rankings, all-time:
    • Site-wide: 8,599 out of 62,747
    • In bioinformatics: 1,510 out of 6,251
  • Year to date:
    • Site-wide: 3,586 out of 62,747
  • Since beginning of last month:
    • Site-wide: 10,324 out of 62,747
