
An analytical framework for interpretable and generalizable 'quasilinear' single-cell data analysis

By Jian Zhou, Olga G Troyanskaya

Posted 13 Apr 2020
bioRxiv DOI: 10.1101/2020.04.12.022806

Scaling single-cell data exploratory analysis with the rapidly growing diversity and quantity of single-cell omics datasets demands more interpretable and robust data representation that is generalizable across datasets. To address this challenge, here we developed a novel 'quasilinear' framework that combines the interpretability and transferability of linear methods with the representational power of nonlinear methods. Within this framework, we introduce a data representation and visualization method, GraphDR, and a structure discovery method, StructDR, that unifies cluster, trajectory, and surface estimation and allows their confidence set inference. We applied both methods to diverse single-cell RNA-seq datasets from whole embryos and tissues. Unlike PCA and t-SNE, GraphDR and StructDR generated representations that both distinguished highly specific cell types and were comparable across datasets. In addition, GraphDR is at least an order of magnitude faster than commonly used nonlinear methods. Our visualizations of scRNA-seq data from developing zebrafish and Xenopus embryos revealed extruding branches of lineages from a continuum of cell states, suggesting that the current branch view of cell specification may be oversimplified. Moreover, StructDR identified a novel neuronal population using scRNA-seq data from mouse hippocampus. An open-source Python library and a user-friendly graphical interface for 3D data visualization and analysis with these methods are available at https://github.com/jzthree/quasildr.

Competing Interest Statement: The authors have declared no competing interest.
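To make the idea of pairing a linear projection with a nonlinear, graph-based step more concrete, here is a minimal sketch. It is not the authors' implementation and does not use the quasildr API. It assumes one plausible formulation: a linear map W applied to centered data, followed by smoothing over a k-nearest-neighbor graph with Laplacian L, giving Z = (I + lam * L)^-1 X W. The function names, the parameters `lam` and `k`, and the brute-force neighbor search are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized k-nearest-neighbor graph.

    Brute-force distances; only suitable for small illustrative inputs.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)                            # symmetrize adjacency
    return np.diag(A.sum(axis=1)) - A


def smoothed_linear_embedding(X, n_components=2, lam=10.0, k=5):
    """Graph-smoothed linear projection (illustrative assumption, see text).

    Returns the embedding Z and the orthonormal linear map W.
    """
    Xc = X - X.mean(axis=0)                           # center features
    n = Xc.shape[0]
    L = knn_laplacian(Xc, k)
    # Apply the smoothing operator (I + lam * L)^-1 by solving a linear system.
    KX = np.linalg.solve(np.eye(n) + lam * L, Xc)
    # Linear part: top eigenvectors of the symmetric matrix Xc^T K Xc.
    M = Xc.T @ KX
    M = (M + M.T) / 2.0                               # remove round-off asymmetry
    _, evecs = np.linalg.eigh(M)                      # ascending eigenvalues
    W = evecs[:, ::-1][:, :n_components]              # keep the largest ones
    return KX @ W, W
```

With `lam=0` the smoothing operator is the identity and the sketch reduces to an ordinary PCA projection, which is one way to see why such a representation stays close to a linear, interpretable one: each output axis is still a fixed linear combination of genes, and the graph step only smooths cells toward their neighbors.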

Download data

  • Downloaded 785 times
  • Download rankings, all-time:
    • Site-wide: 42,087
    • In bioinformatics: 4,400
  • Year to date:
    • Site-wide: 90,367
  • Since beginning of last month:
    • Site-wide: 18,500
