Bayesian Nonparametric Inference of Population Size Changes from Sequential Genealogies

By Julia A Palacios, John Wakeley, Sohini Ramachandran

Posted 11 May 2015
bioRxiv DOI: 10.1101/019216 (published DOI: 10.1534/genetics.115.177980)

Sophisticated inferential tools coupled with the coalescent model have recently emerged for estimating past population sizes from genomic data. Accurate methods are available for data from a single locus or from independent loci. Recent methods that model recombination require small sample sizes, make constraining assumptions about population size changes, and do not report measures of uncertainty for estimates. Here, we develop a Gaussian process-based Bayesian nonparametric method coupled with a sequentially Markov coalescent model which allows accurate inference of population sizes over time from a set of genealogies. In contrast to current methods, our approach considers a broad class of recombination events, including those that do not change local genealogies. We show that our method outperforms recent likelihood-based methods that rely on discretization of the parameter space. We illustrate the application of our method to multiple demographic histories, including population bottlenecks and exponential growth. In simulation, our Bayesian approach produces point estimates four times more accurate than maximum likelihood estimation (based on the sum of absolute differences between the truth and the estimated values). Further, our method's credible intervals for population size as a function of time cover 90 percent of true values across multiple demographic scenarios, enabling formal hypothesis testing about population size differences over time. Using genealogies estimated with ARGweaver, we apply our method to European and Yoruban samples from the 1000 Genomes Project and confirm key known aspects of population size history over the past 150,000 years.
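The two evaluation criteria named in the abstract, the sum of absolute differences between true and estimated population sizes and the coverage of credible intervals, can be illustrated with a minimal sketch. The function names, the time grid, and the toy values below are assumptions made for illustration only; the abstract does not specify how the authors discretize time or compute these quantities, and none of the numbers come from the paper.

```python
def sum_absolute_error(truth, estimate):
    """Sum of |truth - estimate| over a shared grid of time points."""
    return sum(abs(t - e) for t, e in zip(truth, estimate))


def interval_coverage(truth, lower, upper):
    """Fraction of grid points whose true value lies inside [lower, upper]."""
    inside = sum(1 for t, lo, hi in zip(truth, lower, upper) if lo <= t <= hi)
    return inside / len(truth)


# Hypothetical toy values on a five-point time grid (not data from the paper).
truth = [1.0, 1.0, 0.5, 0.5, 1.0]      # true relative population sizes
estimate = [1.1, 0.9, 0.6, 0.5, 1.2]   # point estimates, e.g. posterior medians
lower = [0.8, 0.7, 0.4, 0.3, 1.1]      # lower credible bounds
upper = [1.3, 1.2, 0.9, 0.8, 1.5]      # upper credible bounds

error = sum_absolute_error(truth, estimate)
coverage = interval_coverage(truth, lower, upper)
print(f"sum of absolute differences: {error:.2f}")
print(f"credible interval coverage:  {coverage:.0%}")
```

Under this reading, a smaller sum of absolute differences means more accurate point estimates, and a coverage value close to the nominal level means the credible intervals are well calibrated.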

Download data

  • Downloaded 860 times
  • Download rankings, all-time:
    • Site-wide: 33,002
    • In genetics: 1,473
  • Year to date:
    • Site-wide: 142,913
  • Since beginning of last month:
    • Site-wide: 155,663
