A performance assessment of relatedness inference methods using genome-wide data from thousands of relatives

By Monica D. Ramstetter, Thomas D. Dyer, Donna M. Lehman, Joanne E. Curran, Ravindranath Duggirala, John Blangero, Jason Mezey, Amy L. Williams

Posted 04 Feb 2017
bioRxiv DOI: 10.1101/106013 (published DOI: 10.1534/genetics.117.1122)

Inferring relatedness from genomic data is an essential component of genetic association studies, population genetics, forensics, and genealogy. While numerous methods exist for inferring relatedness, thorough evaluation of these approaches in real data has been lacking. Here, we report an assessment of 12 state-of-the-art pairwise relatedness inference methods using a dataset with 2,485 individuals contained in several large pedigrees that span up to six generations. We find that all methods have high accuracy (92%-99%) when detecting first- and second-degree relationships, but their accuracy dwindles to less than 43% for seventh-degree relationships. However, most identity-by-descent (IBD) segment-based methods inferred seventh-degree relatives correctly to within one relatedness degree for more than 76% of relative pairs. Overall, the most accurate methods were ERSA and approaches that infer relatedness from total IBD sharing computed using the output of GERMLINE and Refined IBD. Combining information from the most accurate methods provides little accuracy improvement, indicating that novel approaches, such as new methods that leverage relatedness signals from multiple samples, are needed to achieve a sizeable jump in performance.
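
To make the idea of inferring a relatedness degree from total IBD sharing concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes the total IBD1 and IBD2 lengths for a pair have already been summed from a segment detector's output, converts them to a kinship coefficient, and bins that coefficient with the commonly used power-of-two cutoffs (degree d when kinship exceeds 2^-(d+1.5)). The function names, the 3,500 cM genome length, and the example pairs are all hypothetical. The last function shows how "exact" and "within one degree" accuracy could be scored.

```python
def kinship_from_ibd(ibd1_cm, ibd2_cm, genome_cm):
    """Kinship coefficient from total IBD1 and IBD2 lengths (all in cM).

    k1 and k2 are the genome fractions shared IBD1 and IBD2;
    kinship = k1/4 + k2/2.
    """
    k1 = ibd1_cm / genome_cm
    k2 = ibd2_cm / genome_cm
    return 0.25 * k1 + 0.5 * k2


def degree_from_kinship(phi, max_degree=7):
    """Map a kinship coefficient to a relatedness degree.

    Assumed binning (not taken from the paper): degree d has expected
    kinship 2^-(d+1), with the lower bin edge at 2^-(d+1.5).
    Returns 0 for duplicates/identical twins, None for "unrelated".
    """
    if phi > 2 ** -1.5:
        return 0
    for d in range(1, max_degree + 1):
        if phi > 2 ** -(d + 1.5):
            return d
    return None


def accuracy(pairs, tolerance=0):
    """Fraction of (inferred, true) degree pairs that agree within `tolerance`."""
    hits = sum(
        1
        for inferred, true in pairs
        if inferred is not None and abs(inferred - true) <= tolerance
    )
    return hits / len(pairs)


GENOME_CM = 3500.0  # hypothetical genome length in cM

# Half siblings share about half the genome IBD1 and none IBD2.
half_sib_degree = degree_from_kinship(kinship_from_ibd(1750.0, 0.0, GENOME_CM))

# Hypothetical (inferred, true) degrees for four seventh-degree pairs.
example_pairs = [(7, 7), (6, 7), (None, 7), (5, 7)]
exact = accuracy(example_pairs)
within_one = accuracy(example_pairs, tolerance=1)
```

Scoring with `tolerance=1` mirrors the "correct to within one relatedness degree" measure, which is more forgiving than exact classification for distant relatives, where the variance in IBD sharing is large relative to the gap between adjacent degrees.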

Download data

  • Downloaded 1,209 times
  • Download rankings, all-time:
    • Site-wide: 18,863
    • In genetics: 867
  • Year to date:
    • Site-wide: 90,250
  • Since beginning of last month:
    • Site-wide: 93,765
