
Estimation Of Genomic Prediction Accuracy From Reference Populations With Varying Degrees Of Relationship

By Sang Hong Lee, Sam Clark, Julius H.J. van der Werf

Posted 22 Mar 2017
bioRxiv DOI: 10.1101/119164 (published DOI: 10.1371/journal.pone.0189775)

Genomic prediction is emerging in a wide range of fields, including animal and plant breeding, risk prediction in human precision medicine, and forensics. It is desirable to establish a theoretical framework for genomic prediction accuracy when the reference data consist of information sources with varying degrees of relationship to the target individuals. A reference set used for genomic prediction can contain both close and distant relatives as well as 'unrelated' individuals from the wider population. We model the various sources of information as different populations with different effective population sizes (Ne). Both the effective number of chromosome segments (Me) and Ne are considered to be functions of the data used for prediction. We validate our theory with analyses of simulated as well as real data, and illustrate that the variation in genomic relationships with the target is a predictor of the information content of the reference set. With a similar amount of data available for each source, we show that close relatives can have a substantially larger effect on genomic prediction accuracy than less closely related individuals. We also illustrate that when prediction relies on closer relatives, prediction accuracy improves less with an increase in training data or marker panel density. We release software that can estimate the expected prediction accuracy and power when combining different reference sources with various degrees of relationship to the target, which is useful when planning genomic prediction (before or after collecting data) in animal, plant and human genetics.
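
As a rough illustration of how Me drives expected accuracy, the sketch below uses the widely used single-population approximation r = sqrt(N h2 / (N h2 + Me)), where N is the reference sample size and h2 the heritability. This is not the multi-source model or the software described in the abstract. The function name, the heritability, and the Me values are hypothetical, chosen only to contrast a reference set of close relatives (small Me) with one of 'unrelated' individuals (large Me).

```python
import math


def expected_accuracy(n_ref: int, h2: float, m_e: float) -> float:
    """Approximate expected prediction accuracy r = sqrt(N*h2 / (N*h2 + Me)).

    n_ref : number of individuals in the reference set (N)
    h2    : trait heritability, in (0, 1]
    m_e   : effective number of chromosome segments (Me)
    """
    if n_ref < 0 or not (0.0 < h2 <= 1.0) or m_e <= 0:
        raise ValueError("need n_ref >= 0, 0 < h2 <= 1 and m_e > 0")
    signal = n_ref * h2
    return math.sqrt(signal / (signal + m_e))


if __name__ == "__main__":
    h2 = 0.5  # assumed heritability, for illustration only
    # Hypothetical Me values: small for close relatives, large for 'unrelated'.
    for label, m_e in [("close relatives", 100.0), ("unrelated", 50_000.0)]:
        r_n = expected_accuracy(5_000, h2, m_e)
        r_2n = expected_accuracy(10_000, h2, m_e)
        print(f"{label}: r(N=5000)={r_n:.3f}, r(N=10000)={r_2n:.3f}, "
              f"gain={r_2n - r_n:.3f}")
```

Under these assumed values, the smaller Me gives the higher accuracy at a given N, while doubling N adds less accuracy than it does for the larger Me, which matches the qualitative pattern described in the abstract.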

Download data

  • Downloaded 819 times
  • Download rankings, all-time:
    • Site-wide: 35,350
    • In genetics: 1,581
  • Year to date:
    • Site-wide: 132,238
  • Since beginning of last month:
    • Site-wide: 124,720
