Phylofactorization - a graph partitioning algorithm to identify phylogenetic scales of ecological data

By Alex D. Washburne, Justin D Silverman, James T. Morton, Daniel J. Becker, Daniel Crowley, Sayan Mukherjee, Lawrence A David, Raina K. Plowright

Posted 16 Dec 2017
bioRxiv DOI: 10.1101/235341 (published DOI: 10.1002/ecm.1353)

The problem of pattern and scale is a central challenge in ecology (sensu Levin, 1992). The problem of scale is central to community ecology, where functional ecological groups are aggregated and treated as a unit underlying an ecological pattern, such as the aggregation of nitrogen-fixing trees into a total abundance of a trait underlying ecosystem physiology. With the emergence of massive community ecological datasets, from microbiomes to breeding bird surveys, there is a need to objectively identify the scales of organization pertaining to well-defined patterns in community ecological data. The phylogeny is a scaffold for identifying key phylogenetic scales associated with macroscopic patterns. Phylofactorization was developed to objectively identify phylogenetic scales underlying patterns in relative abundance data. However, many ecological data, such as presence-absences and counts, are not relative abundances, yet the logic of defining phylogenetic scales underlying a pattern of interest is still applicable. Here, we generalize phylofactorization beyond relative abundances to a graph-partitioning algorithm for traits and community-ecological data from any exponential-family distribution. Generalizing phylofactorization yields many tools for analyzing community ecological data. Using generalized phylofactorization, we identify three phylogenetic factors of mammalian body mass which arose during the K-Pg extinction event, consistent with other analyses of mammalian body mass evolution. We introduce a phylogenetic analysis of variance which refines our understanding of the major sources of variation in the human gut. We employ generalized additive modeling of microbes in Central Park soils to confirm that a large clade of Acidobacteria thrives in neutral soils. We demonstrate how to extend phylofactorization to generalized linear and additive modeling of any dataset of exponential-family random variables. We finish with a discussion of how phylofactorization produces a novel species concept, a hybrid of phylogenetic and ecological species concepts in which the phylogenetic scales and units of interest are defined objectively by defining the ecological pattern and partitioning the phylogeny into clades based on their different contributions to the pattern. All of these tools can be implemented with a new R package available online.
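
The graph-partitioning idea in the abstract can be illustrated with a minimal sketch of a single partitioning step. This is not the authors' implementation or the R package's API. The tree, the trait values, the function names (`tips_below`, `best_edge`), and the objective (a size-weighted difference in group means) are all assumptions chosen for illustration. Each edge of a rooted tree splits the tips into two groups; the sketch scores every edge and returns the one whose two groups differ most in the trait.

```python
from math import sqrt

def tips_below(children, node):
    """Return the set of tip names at or below `node`."""
    kids = children.get(node, [])
    if not kids:
        return {node}
    tips = set()
    for kid in kids:
        tips |= tips_below(children, kid)
    return tips

def best_edge(edges, trait):
    """Score every edge by how strongly it separates the trait values.

    edges: list of (parent, child) pairs describing a rooted tree.
    trait: dict mapping tip name -> numeric trait value.
    The objective is an assumption made for this sketch:
    |mean_inside - mean_outside| * sqrt(n1 * n2 / (n1 + n2)).
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    all_tips = set(trait)
    best = None
    for parent, child in edges:
        inside = tips_below(children, child)
        outside = all_tips - inside
        if not outside:
            continue  # edge does not split the tips into two groups
        n1, n2 = len(inside), len(outside)
        mean_in = sum(trait[t] for t in inside) / n1
        mean_out = sum(trait[t] for t in outside) / n2
        score = abs(mean_in - mean_out) * sqrt(n1 * n2 / (n1 + n2))
        if best is None or score > best[0]:
            best = (score, (parent, child), inside)
    return best

# Hypothetical four-tip tree and trait values (not data from the paper).
edges = [("root", "C1"), ("root", "t4"), ("C1", "C2"),
         ("C1", "t3"), ("C2", "t1"), ("C2", "t2")]
trait = {"t1": 10.0, "t2": 11.0, "t3": 1.0, "t4": 2.0}

score, edge, clade = best_edge(edges, trait)
print(edge, sorted(clade), score)
```

Repeating this step within each resulting group would give a sequence of partitions. How the generalized method chooses its objective for exponential-family data is described in the paper itself and is not reproduced here.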

Download data

  • Downloaded 913 times
  • Download rankings, all-time:
    • Site-wide: 12,463 out of 84,782
    • In ecology: 297 out of 3,718
  • Year to date:
    • Site-wide: 30,475 out of 84,782
  • Since beginning of last month:
    • Site-wide: 19,115 out of 84,782
