FastSpar: Rapid and scalable correlation estimation for compositional data

By Stephen C Watts, Scott C Ritchie, Michael Inouye, Kathryn E Holt

Posted 03 Mar 2018
bioRxiv DOI: 10.1101/272583 (published DOI: 10.1093/bioinformatics/bty734)

A common goal of microbiome studies is the elucidation of community composition and member interactions using counts of taxonomic units extracted from sequence data. Inference of interaction networks from sparse and compositional data requires specialised statistical approaches. A popular solution is SparCC; however, its computational performance limits the calculation of interaction networks for very high-dimensional datasets. Here we introduce FastSpar, an efficient and parallelisable implementation of the SparCC algorithm which rapidly infers correlation networks and calculates p-values using an unbiased estimator. We further demonstrate that FastSpar reduces network inference wall time by 2-3 orders of magnitude compared to SparCC. FastSpar source code, precompiled binaries, and platform packages are freely available on GitHub: github.com/scwatts/FastSpar
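
As a rough illustration of why compositional data call for specialised approaches, the following minimal sketch shows how converting counts to fractions can induce correlation between taxa that vary independently. It is not the SparCC or FastSpar algorithm; the two-taxon setup, the count ranges, and the helper names `pearson` and `close_composition` are assumptions made purely for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def close_composition(sample):
    """Convert raw counts to fractions that sum to 1 (the 'closure' step)."""
    total = sum(sample)
    return [count / total for count in sample]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: two taxa whose counts are drawn independently per sample.
counts = [[rng.randint(50, 500), rng.randint(50, 500)] for _ in range(200)]
fractions = [close_composition(sample) for sample in counts]

r_counts = pearson([s[0] for s in counts], [s[1] for s in counts])
r_fractions = pearson([s[0] for s in fractions], [s[1] for s in fractions])

print(f"correlation of raw counts:  {r_counts:+.3f}")
print(f"correlation of fractions:   {r_fractions:+.3f}")
```

With only two components the fractions always sum to one, so their correlation is forced to -1 regardless of how the underlying counts behave. Real datasets have many more taxa and the distortion is less extreme, but the same constraint is what methods such as SparCC are designed to account for.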

Download data

  • Downloaded 734 times
  • Download rankings, all-time:
    • Site-wide: 33,578
    • In bioinformatics: 3,743
  • Year to date:
    • Site-wide: 72,651
  • Since beginning of last month:
    • Site-wide: 98,108
