Phylofactorization - a graph partitioning algorithm to identify phylogenetic scales of ecological data
Alex D. Washburne,
Justin D Silverman,
James T. Morton,
Daniel J. Becker,
Lawrence A David,
Raina K. Plowright
Posted 16 Dec 2017
bioRxiv DOI: 10.1101/235341 (published DOI: 10.1002/ecm.1353)
The problem of pattern and scale is a central challenge in ecology (sensu Levin, 1992). The problem of scale is central to community ecology, where functional ecological groups are aggregated and treated as a unit underlying an ecological pattern, such as the aggregation of nitrogen-fixing trees into a total abundance of a trait underlying ecosystem physiology. With the emergence of massive community ecological datasets, from microbiomes to breeding bird surveys, there is a need to objectively identify the scales of organization pertaining to well-defined patterns in community ecological data. The phylogeny is a scaffold for identifying key phylogenetic scales associated with macroscopic patterns. Phylofactorization was developed to objectively identify phylogenetic scales underlying patterns in relative abundance data. However, many ecological data, such as presence-absences and counts, are not relative abundances, yet the logic of defining phylogenetic scales underlying a pattern of interest is still applicable. Here, we generalize phylofactorization beyond relative abundances to a graph-partitioning algorithm for traits and community-ecological data from any exponential-family distribution. Generalizing phylofactorization yields many tools for analyzing community ecological data. In the context of generalized phylofactorization, we identify three phylogenetic factors of mammalian body mass which arose during the K-Pg extinction event, consistent with other analyses of mammalian body mass evolution. We introduce a phylogenetic analysis of variance which refines our understanding of the major sources of variation in the human gut. We employ generalized additive modeling of microbes in Central Park soils to confirm that a large clade of Acidobacteria thrives in neutral soils. We demonstrate how to extend phylofactorization to generalized linear and additive modeling of any dataset of exponential-family random variables.
We finish with a discussion of how phylofactorization produces a novel species concept, a hybrid of phylogenetic and ecological species concepts, in which the phylogenetic scales and units of interest are defined objectively by defining the ecological pattern and partitioning the phylogeny into clades based on their different contributions to the pattern. All of these tools can be implemented with a new R package available online.
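The edge-based partitioning idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy tree, the tip names, the trait values, and the objective (absolute difference in group means). It is not the R package's implementation, and it shows only a single partitioning step rather than the full iterative procedure.

```python
from statistics import mean

# Hypothetical toy tree: each internal node maps to its children.
TREE = {
    "root": ["A", "B"],
    "A": ["t1", "t2"],
    "B": ["t3", "C"],
    "C": ["t4", "t5"],
}

# Hypothetical trait values for the tips (illustrative numbers only).
TRAIT = {"t1": 1.0, "t2": 1.2, "t3": 1.1, "t4": 5.0, "t5": 5.4}


def tips_below(node, tree):
    """Return the set of tips descending from `node` (a tip returns itself)."""
    if node not in tree:
        return {node}
    out = set()
    for child in tree[node]:
        out |= tips_below(child, tree)
    return out


def best_edge(tree, trait):
    """Find the edge whose two tip groups differ most in mean trait value.

    Cutting an edge (parent, child) separates the tips below `child`
    from all remaining tips. The score used here, the absolute difference
    in group means, is an assumed stand-in for whatever objective a real
    analysis would choose.
    """
    all_tips = set(trait)
    best = None
    for parent, children in tree.items():
        for child in children:
            group = tips_below(child, tree)
            rest = all_tips - group
            if not rest:
                continue  # an edge must leave tips on both sides
            score = abs(
                mean(trait[t] for t in group) - mean(trait[t] for t in rest)
            )
            if best is None or score > best[0]:
                best = (score, parent, child, group)
    return best


score, parent, child, group = best_edge(TREE, TRAIT)
print(f"cut edge {parent}-{child}, separating {sorted(group)}")
```

In this toy example the selected edge isolates the clade containing `t4` and `t5`, the two tips with markedly higher trait values. A fuller version would repeat the search within each resulting sub-tree and swap in an objective suited to the data's distribution.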