Reference-based phasing using the Haplotype Reference Consortium panel

By Po-Ru Loh, Petr Danecek, Pier Francesco Palamara, Christian Fuchsberger, Yakir A. Reshef, Hilary K Finucane, Sebastian Schoenherr, Lukas Forer, Shane McCarthy, Goncalo R. Abecasis, Richard Durbin, Alkes Price

Posted 10 May 2016
bioRxiv DOI: 10.1101/052308 (published DOI: 10.1038/ng.3679)

Haplotype phasing is a fundamental problem in medical and population genetics. Phasing is generally performed via statistical phasing within a genotyped cohort, an approach that can attain high accuracy in very large cohorts but attains lower accuracy in smaller cohorts. Here, we instead explore the paradigm of reference-based phasing. We introduce a new phasing algorithm, Eagle2, that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium, HRC) using a new data structure based on the positional Burrows-Wheeler transform. We demonstrate that Eagle2 attains a ≈20x speedup and ≈10% increase in accuracy compared to reference-based phasing using SHAPEIT2. On European-ancestry samples, Eagle2 with the HRC panel achieves >2x the accuracy of 1000 Genomes-based phasing. Eagle2 is open source and freely available for HRC-based phasing via the Sanger Imputation Service and the Michigan Imputation Server.
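The abstract names the positional Burrows-Wheeler transform (PBWT) without describing it. As a rough illustration only, the sketch below shows the basic PBWT ordering step: at each site, haplotypes are kept sorted by their reversed prefixes, so haplotypes that share a long match ending at that site sit next to each other. This is a minimal sketch of the generic PBWT idea, not Eagle2's data structure, which the abstract does not specify; the function name `pbwt_prefix_orders` and the toy panel are hypothetical.

```python
from typing import List, Sequence


def pbwt_prefix_orders(haplotypes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return the PBWT ordering of haplotype indices before each site.

    orders[k] lists haplotype indices sorted by their reversed prefixes
    over sites 0..k-1; orders[0] is the input order. Alleles must be 0 or 1.
    """
    n_hap = len(haplotypes)
    n_sites = len(haplotypes[0]) if n_hap else 0

    order = list(range(n_hap))
    orders = [order]
    for k in range(n_sites):
        # Stable partition by the allele at site k: haplotypes carrying 0
        # come first, and relative order within each group is preserved.
        zeros = [h for h in order if haplotypes[h][k] == 0]
        ones = [h for h in order if haplotypes[h][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


# Hypothetical toy panel: 4 haplotypes, 3 biallelic sites.
panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
]
orders = pbwt_prefix_orders(panel)
```

For this toy panel the final ordering is `[3, 0, 1, 2]`, which matches sorting the haplotypes by their reversed allele sequences. Each site costs time linear in the number of haplotypes, which is why this kind of ordering scales to large reference panels.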

Download data

  • Downloaded 1,965 times
  • Download rankings, all-time:
    • Site-wide: 10,710
    • In genetics: 433
  • Year to date:
    • Site-wide: 149,768
  • Since beginning of last month:
    • Site-wide: 97,098