Mining all publicly available expression data to compute dynamic microbial transcriptional regulatory networks

By Anand V. Sastry, Saugat Poudel, Kevin Rychel, Reo Yoo, Cameron R. Lamoureux, Siddharth M. Chauhan, Zachary B. Haiman, Tahani Al Bulushi, Yara Seif, Bernhard Palsson

Posted 02 Jul 2021
bioRxiv DOI: 10.1101/2021.07.01.450581

We are firmly in the era of biological big data. Millions of omics datasets are publicly accessible and can be employed to support scientific research or build a holistic view of an organism. Here, we introduce a workflow that converts all public gene expression data for a microbe into a dynamic representation of the organism's transcriptional regulatory network. This five-step process walks researchers through the mining, processing, curation, analysis, and characterization of all available expression data, using Bacillus subtilis as an example. The resulting reconstruction of the B. subtilis regulatory network can be leveraged to predict new regulons and analyze datasets in the context of all published data. The results are hosted at https://imodulondb.org/, and additional analyses can be performed using the PyModulon Python package. As the number of publicly available datasets increases, this pipeline will be applicable to a wide range of microbial pathogens and cell factories.

Download data

  • Downloaded 325 times
  • Download rankings, all-time:
    • Site-wide: 114,609
    • In bioinformatics: 9,446
  • Year to date:
    • Site-wide: 40,710
  • Since beginning of last month:
    • Site-wide: 17,716
