Continuous-trait probabilistic model for comparing multi-species functional genomic data

By Yang Yang, Quanquan Gu, Yang Zhang, Takayo Sasaki, Julianna Crivello, Rachel J. O’Neill, David M. Gilbert, Jian Ma

Posted 16 Mar 2018
bioRxiv DOI: 10.1101/283093 (published DOI: 10.1016/j.cels.2018.05.022)

Large amounts of multi-species functional genomic data from high-throughput assays are becoming available to help understand the molecular mechanisms underlying phenotypic diversity across species. However, continuous-trait probabilistic models, which are key to such comparative analysis, remain under-explored. Here we develop a new model, called phylogenetic hidden Markov Gaussian processes (Phylo-HMGP), to simultaneously infer heterogeneous evolutionary states of functional genomic features in a genome-wide manner. Both simulation studies and a real-data application demonstrate the effectiveness of Phylo-HMGP. Importantly, we applied Phylo-HMGP to analyze a new cross-species DNA replication timing (RT) dataset from the same cell type in five primate species (human, chimpanzee, orangutan, gibbon, and green monkey). We demonstrate that our Phylo-HMGP model enables discovery of genomic regions with distinct evolutionary patterns of RT. Our method provides a generic framework for comparative analysis of multi-species continuous functional genomic signals to help reveal regions with conserved or lineage-specific regulatory roles.

Download data

  • Downloaded 666 times
  • Download rankings, all-time:
    • Site-wide: 70,107
    • In bioinformatics: 6,371
  • Year to date:
    • Site-wide: 166,844
  • Since beginning of last month:
    • Site-wide: 136,189
