New synthetic-diploid benchmark for accurate variant calling evaluation

By Heng Li, Jonathan M Bloom, Yossi Farjoun, Mark Fleharty, Laura Gauthier, Benjamin M Neale, Daniel G. MacArthur

Posted 22 Nov 2017
bioRxiv DOI: 10.1101/223297 (published DOI: 10.1038/s41592-018-0054-7)

Constructed from the consensus of multiple variant callers based on short-read data, existing benchmark datasets for evaluating variant calling accuracy are biased toward easy regions accessible by known algorithms. We derived a new benchmark dataset from the de novo PacBio assemblies of two human cell lines that are homozygous across the whole genome. This benchmark provides a more accurate and less biased estimate of the error rate of small variant calls in a realistic context.

Download data

  • Downloaded 3,387 times
  • Download rankings, all-time:
    • Site-wide: 6,136
    • In bioinformatics: 559
  • Year to date:
    • Site-wide: 117,929
  • Since beginning of last month:
    • Site-wide: 147,509
