Dissecting tumor cell programs through group biology estimation in clinical single-cell transcriptomics

By Shreya Johri, Kevin Bi, Breanna M. Titchen, Jingxin Fu, Jake Conway, Jett P Crowdis, Natalie I. Volkes, Zenghua Fan, Lawrence Fong, Jihye Park, David R Liu, Meng Xiao He, Eliezer Van Allen

Posted 24 Oct 2021
bioRxiv DOI: 10.1101/2021.10.22.465130

Given the growing number of clinically integrated cancer single-cell transcriptomic studies, robust differential enrichment methods for gene signatures are critical for dissecting tumor cellular states for discovery and translation. Current analysis strategies neither adequately represent the hierarchical structure of clinical single-cell transcriptomic datasets nor account for variability in the number of cells recovered per sample. As a result, they can yield results confounded by sample-driven biology, with high false positive rates, rather than capturing true differential enrichment of group-level biology (e.g., treatment responders vs. non-responders). This problem is especially prominent in single-cell analyses of the tumor compartment, where high intra-patient similarity (relative to inter-patient similarity) produces more strictly hierarchical data that confounds enrichment analysis. Furthermore, identifying signatures that truly represent the entire group requires quantifying how robust otherwise statistically significant signatures are to sample exclusion. Here, we present a new nonparametric statistical method, BEANIE, that accounts for these issues, and demonstrate its utility in two cancer cohorts stratified by clinical groups to narrow the set of biological hypotheses and guide translational investigations. Using BEANIE, we show that distinguishing sample-specific from group-level biology greatly decreases the false positive rate and guides identification of robust signatures that can also be corroborated across different cell type compartments.
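The two concerns raised in the abstract, unequal cell counts per sample and robustness to sample exclusion, can be illustrated with a small sketch. This is not the BEANIE algorithm, whose details are not given in the abstract. It is a generic sample-level permutation test followed by a leave-one-sample-out check, written only to make the ideas concrete. The function names (`sample_level_permutation_test`, `leave_one_sample_out`), the use of a per-sample median as the summary, and the simulated scores are all assumptions made for illustration.

```python
import random
from statistics import mean, median

def sample_level_permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Permutation p-value for a difference between two groups of samples.

    group_a, group_b: dict mapping sample id -> list of per-cell signature scores.
    Each sample is reduced to one value (its median), so a sample with many
    cells does not outweigh a sample with few cells.
    """
    rng = random.Random(seed)
    a = [median(cells) for cells in group_a.values()]
    b = [median(cells) for cells in group_b.values()]
    observed = abs(mean(a) - mean(b))
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        # Shuffle group labels across samples, not across cells.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return (hits + 1) / (n_perm + 1)

def leave_one_sample_out(group_a, group_b, alpha=0.05):
    """Fraction of single-sample exclusions for which p < alpha still holds."""
    outcomes = []
    for sample_id in group_a:
        reduced = {k: v for k, v in group_a.items() if k != sample_id}
        outcomes.append(sample_level_permutation_test(reduced, group_b) < alpha)
    for sample_id in group_b:
        reduced = {k: v for k, v in group_b.items() if k != sample_id}
        outcomes.append(sample_level_permutation_test(group_a, reduced) < alpha)
    return sum(outcomes) / len(outcomes)

# Simulated (hypothetical) signature scores with very unequal cell counts.
_rng = random.Random(1)

def simulate(center, n_cells):
    return [_rng.gauss(center, 0.3) for _ in range(n_cells)]

responders = {f"R{i}": simulate(1.0, n) for i, n in enumerate([50, 400, 80, 1200, 60])}
non_responders = {f"N{i}": simulate(0.0, n) for i, n in enumerate([300, 40, 900, 70, 150])}

p_value = sample_level_permutation_test(responders, non_responders)
robustness = leave_one_sample_out(responders, non_responders)
print(f"p-value: {p_value:.4f}, robustness to sample exclusion: {robustness:.2f}")
```

Permuting labels at the sample level rather than the cell level respects the hierarchy of cells nested within patients. The leave-one-out fraction gives a simple indication of whether a significant result depends on any single sample.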

Download data

  • Downloaded 381 times
  • Download rankings, all-time:
    • Site-wide: 99,922
    • In bioinformatics: 8,568
  • Year to date:
    • Site-wide: 8,510
  • Since beginning of last month:
    • Site-wide: 17,712
