
Fast and accurate long-range phasing in a UK Biobank cohort

By Po-Ru Loh, Pier Francesco Palamara, Alkes Price

Posted 04 Oct 2015
bioRxiv DOI: 10.1101/028282 (published DOI: 10.1038/ng.3571)

Recent work has leveraged the extensive genotyping of the Icelandic population to perform long-range phasing (LRP), enabling accurate imputation and association analysis of rare variants in target samples typed on genotyping arrays. Here, we develop a fast and accurate LRP method, Eagle, that extends this paradigm to populations with much smaller proportions of genotyped samples by harnessing long (>4cM) identical-by-descent (IBD) tracts shared among distantly related individuals. We applied Eagle to N=150K samples (0.2% of the British population) from the UK Biobank, and we determined that it is 1-2 orders of magnitude faster than existing methods while achieving similar or better phasing accuracy (switch error rate ≈0.3%, corresponding to perfect phase in most 10Mb segments). We also observed that when used within an imputation pipeline, Eagle pre-phasing improved downstream imputation accuracy compared to pre-phasing in batches using existing methods (as necessary to achieve comparable computational cost).
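The switch error rate quoted above is commonly defined as the fraction of consecutive heterozygous-site pairs at which the inferred phase flips relative to the true phase. The sketch below illustrates that common definition only; the abstract does not describe the authors' evaluation procedure, and the function name and input encoding (one haplotype per individual, alleles coded 0/1, restricted to heterozygous sites) are assumptions made for illustration.

```python
def switch_error_rate(true_hap, inferred_hap):
    """Fraction of adjacent heterozygous-site pairs where the phase switches.

    Both arguments are sequences of 0/1 alleles for ONE haplotype of the
    same individual, restricted to heterozygous sites (hypothetical encoding).
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same heterozygous sites")

    # At each site, does the inferred haplotype carry the same allele as the truth?
    match = [t == i for t, i in zip(true_hap, inferred_hap)]

    # A switch error occurs wherever the match state flips between neighbours.
    switches = sum(1 for a, b in zip(match, match[1:]) if a != b)

    opportunities = len(match) - 1
    return switches / opportunities if opportunities > 0 else 0.0


truth = [0, 1, 0, 1, 1, 0, 0, 1]
inferred = [0, 1, 0, 0, 0, 1, 0, 1]  # phase flips after site 3 and back after site 6
print(switch_error_rate(truth, inferred))
```

Because only flips between neighbouring sites are counted, an inferred haplotype that is the exact complement of the truth scores zero: phase is defined only up to swapping the two haplotype labels.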

Download data

  • Downloaded 2,438 times
  • Download rankings, all-time:
    • Site-wide: 7,777
    • In genetics: 319
  • Year to date:
    • Site-wide: 92,192
  • Since beginning of last month:
    • Site-wide: 122,217
