Deep-learning-based cell composition analysis from tissue expression profiles

By Kevin Menden, Mohamed Marouf, Sergio Oller, Anupriya Dalmia, Karin Kloiber, Peter Heutink, Stefan Bonn

Posted 04 Jun 2019
bioRxiv DOI: 10.1101/659227 (published DOI: 10.1126/sciadv.aba2619)

We present Scaden, a deep neural network for cell deconvolution that uses gene expression information to infer the cellular composition of tissues. Scaden is trained on single cell RNA-seq data to engineer discriminative features that confer robustness to bias and noise, making complex data preprocessing and feature selection unnecessary. We demonstrate that Scaden outperforms existing deconvolution algorithms in both precision and robustness. A single trained network reliably deconvolves bulk RNA-seq and microarray, human and mouse tissue expression data and leverages the combined information of multiple data sets. Due to this stability and flexibility, we surmise that deep learning will become an algorithmic mainstay for cell deconvolution of various data types. Scaden’s comprehensive software package is easy to use on novel as well as diverse existing expression datasets available in public resources, deepening the molecular and cellular understanding of developmental and disease processes.

Abbreviations:

  • RNA-seq: Next Generation RNA Sequencing
  • GEP: gene expression profile matrix
  • SVR: Support Vector Regression
  • DNN: Deep Neural Network
  • scRNA-seq: single cell RNA-seq
  • simulated tissue: training data generated by mixing proportions of scRNA-seq data
  • PBMC: peripheral blood mononuclear cells
  • CCC: concordance correlation coefficient
  • r: Pearson’s correlation coefficient
  • CS: CIBERSORT
  • CSx: CIBERSORTx
  • CPM: Cell Population Mapping
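The two evaluation metrics named above, the concordance correlation coefficient (CCC) and Pearson's r, differ in one way that matters when comparing predicted and true cell fractions: r rewards only linear agreement, whereas CCC (in Lin's standard definition) also penalizes shifts in mean and scale. The following is a minimal sketch of both in plain Python. The function names and the toy fractions are hypothetical and chosen for illustration only; this is not code from the Scaden package, and the numbers are not results from the paper.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient (population moments).

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean_x - mean_y)^2)
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Toy example (made-up values): predictions that track the true fractions
# perfectly in rank and slope but are shifted upward by a constant 0.1.
true_frac = [0.1, 0.2, 0.3, 0.4]
pred_frac = [f + 0.1 for f in true_frac]

r_value = pearson_r(true_frac, pred_frac)
ccc_value = concordance_cc(true_frac, pred_frac)

# r ignores the constant offset; CCC is reduced by it.
print(f"r   = {r_value:.3f}")
print(f"CCC = {ccc_value:.3f}")
```

In this toy case the constant offset leaves r at its maximum while lowering CCC, which illustrates why a deconvolution benchmark might report both: a method can rank cell-type fractions correctly and still misestimate their absolute values.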

Download data

  • Downloaded 2,316 times
  • Download rankings, all-time:
    • Site-wide: 8,354
    • In bioinformatics: 908
  • Year to date:
    • Site-wide: 47,246
  • Since beginning of last month:
    • Site-wide: 96,375
