Rxivist logo

Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 67,125 bioRxiv papers from 295,505 authors.

NextSV: a meta-caller for structural variants from low-coverage long-read sequencing data

By Li Fang, Jiang Hu, Depeng Wang, Kai Wang

Posted 09 Dec 2016
bioRxiv DOI: 10.1101/092544 (published DOI: 10.1186/s12859-018-2207-1)

Background: Structural variants (SVs) in human genomes are implicated in a variety of human diseases. Long-read sequencing delivers much longer read lengths than short-read sequencing and may greatly improve SV detection. However, due to the relatively high cost of long-read sequencing, it is unclear what coverage is needed and how to optimally use the aligners and SV callers. Results: In this study, we developed NextSV, a meta-caller to perform SV calling from low coverage long-read sequencing data. NextSV integrates three aligners and three SV callers and generates two integrated call sets (sensitive/stringent) for different analysis purposes. We evaluated SV calling performance of NextSV under different PacBio coverages on two personal genomes, NA12878 and HX1. Our results showed that, compared with running any single SV caller, NextSV stringent call set had higher precision and balanced accuracy (F1 score) while NextSV sensitive call set had a higher recall. At 10X coverage, the recall of NextSV sensitive call set was 93.5% to 94.1% for deletions and 87.9% to 93.2% for insertions, indicating that ~10X coverage might be an optimal coverage to use in practice, considering the balance between the sequencing costs and the recall rates. We further evaluated the Mendelian errors on an Ashkenazi Jewish trio dataset. Conclusions: Our results provide useful guidelines for SV detection from low coverage whole-genome PacBio data and we expect that NextSV will facilitate the analysis of SVs on long-read sequencing data.

Download data

  • Downloaded 2,185 times
  • Download rankings, all-time:
    • Site-wide: 2,083 out of 67,132
    • In genomics: 448 out of 4,561
  • Year to date:
    • Site-wide: 22,709 out of 67,132
  • Since beginning of last month:
    • Site-wide: 31,210 out of 67,132

Altmetric data

Downloads over time

Distribution of downloads per paper, site-wide

Sign up for the Rxivist weekly newsletter! (Click here for more details.)