
GLnexus: joint variant calling for large cohort sequencing

By Michael F. Lin, Ohad Rodeh, John Penn, Xiaodong Bai, Jeffrey G. Reid, Olga Krasheninina, William J. Salerno

Posted 11 Jun 2018
bioRxiv DOI: 10.1101/343970

As ever-larger cohorts of human genomes are collected in pursuit of genotype/phenotype associations, sequencing informatics must scale up to yield complete and accurate genotypes from vast raw datasets. Joint variant calling, a data processing step entailing simultaneous analysis of all participants sequenced, exhibits this scaling challenge acutely. We present GLnexus (GL, Genotype Likelihood), a system for joint variant calling designed to scale up to the largest foreseeable human cohorts. GLnexus combines scalable joint calling algorithms with a persistent database that grows efficiently as additional participants are sequenced. We validate GLnexus using 50,000 exomes to show that it produces results comparable to or better than those of existing methods, at a fraction of the computational cost and with better scaling. We provide a standalone open-source version of GLnexus and a DNAnexus cloud-native deployment supporting very large projects, which has been employed for cohorts of >240,000 exomes and >22,000 whole genomes.

Download data

  • Downloaded 6,715 times
  • Download rankings, all-time:
    • Site-wide: 2,289
    • In bioinformatics: 147
  • Year to date:
    • Site-wide: 2,338
  • Since beginning of last month:
    • Site-wide: 2,950
