LTMG: A novel statistical modeling of transcriptional expression states in single-cell RNA-Seq data

By Changlin Wan, Wennan Chang, Yu Zhang, Fenil Shah, Xiaoyu Lu, Yong Zang, Anru Zhang, Sha Cao, Melissa L. Fishel, Qin Ma, Chi Zhang

Posted 29 Sep 2018
bioRxiv DOI: 10.1101/430009 (published DOI: 10.1093/nar/gkz655)

A key challenge in modeling single-cell RNA-seq (scRNA-seq) data is to capture the diverse gene expression states regulated by different transcriptional regulatory inputs across single cells, which is further complicated by a large number of observed zero and low expressions. We developed a left-truncated mixture Gaussian (LTMG) model that stems from the kinetic relationships among transcriptional regulatory inputs, mRNA metabolism, and gene expression abundance in a cell. LTMG infers the multimodality of expression across single cells, representing a gene’s diverse expression states; meanwhile, the dropouts and low expressions are treated as left truncated, specifically representing an expression state that is under suppression. We demonstrated that LTMG has a significantly better goodness of fit than three other state-of-the-art models on a large number of single-cell data sets. In addition, our systems kinetic approach to handling the low and zero expressions and the correctness of the identified multimodality were validated on several independent experimental data sets. Application to data from complex tissues demonstrated the capability of LTMG to extract varied expression states specific to cell types or cell functions. Based on LTMG, a differential gene expression test and a co-regulation module identification method, namely LTMG-DGE and LTMG-GCR, were further developed. We experimentally validated that LTMG-DGE has higher sensitivity and specificity in detecting differentially expressed genes than five other popular methods, and that LTMG-GCR is capable of retrieving the gene co-regulation modules corresponding to perturbed transcriptional regulations. A user-friendly R package providing all of these analyses is available at <https://github.com/zy26/LTMGSCA>.
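To make the idea of a left-truncated Gaussian mixture more concrete, the following is a minimal Python sketch of one way such a likelihood could be written. It is an illustration under stated assumptions, not the authors' implementation and not the API of the LTMGSCA R package: the function name `ltmg_log_likelihood`, its parameters, and the choice to score every value at or below the cutoff by the total mixture mass below the cutoff are all hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the same Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ltmg_log_likelihood(values, cutoff, weights, means, sds):
    """Hypothetical log-likelihood of a Gaussian mixture truncated at `cutoff`.

    Assumption: values at or below `cutoff` (dropouts and low expressions)
    are not trusted individually, so each one contributes only the mixture
    probability of falling below the cutoff. Values above the cutoff
    contribute the ordinary mixture density.
    """
    tiny = 1e-300  # floor that avoids log(0)
    components = list(zip(weights, means, sds))
    mass_below = sum(w * normal_cdf(cutoff, m, s) for w, m, s in components)

    total = 0.0
    for x in values:
        if x <= cutoff:
            total += math.log(max(mass_below, tiny))
        else:
            density = sum(w * normal_pdf(x, m, s) for w, m, s in components)
            total += math.log(max(density, tiny))
    return total


if __name__ == "__main__":
    # Made-up log-scale expression values; -5.0 stands in for dropouts.
    observed = [-5.0, -5.0, 1.8, 2.1, 2.4, 5.9, 6.2]
    print(ltmg_log_likelihood(
        observed, cutoff=0.0,
        weights=[0.3, 0.4, 0.3], means=[-1.0, 2.0, 6.0], sds=[1.0, 0.5, 0.5],
    ))
```

In this sketch, a component whose mean lies below the cutoff plays the role of the suppressed expression state, while components above the cutoff stand for distinct active states. Fitting the weights, means, and standard deviations (for example with an EM-style procedure) is outside the scope of this illustration.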

Download data

  • Downloaded 614 times
  • Download rankings, all-time:
    • Site-wide: 44,712
    • In bioinformatics: 4,712
  • Year to date:
    • Site-wide: 61,031
  • Since beginning of last month:
    • Site-wide: 36,684
