SAFE-clustering: Single-cell Aggregated (From Ensemble) Clustering for Single-cell RNA-seq Data
Motivation: Accurately clustering cell types from a mass of heterogeneous cells is a crucial first step for the analysis of single-cell RNA-seq (scRNA-Seq) data. Although several methods have been recently developed, they utilize different characteristics of data and yield varying results in terms of both the number of clusters and actual cluster assignments. Results: Here, we present SAFE-clustering, Single-cell Aggregated (From Ensemble) clustering, a flexible, accurate and robust method for clustering scRNA-Seq data. SAFE-clustering takes as input, results from multiple clustering methods, to build one consensus solution. SAFE-clustering currently embeds four state-of-the-art methods, SC3, CIDR, Seurat and t-SNE + k-means; and ensembles solutions from these four methods using three hypergraph-based partitioning algorithms. Extensive assessment across 12 datasets with the number of clusters ranging from 3 to 14, and the number of single cells ranging from 49 to 32,695 showcases the advantages of SAFE-clustering in terms of both cluster number (18.9 - 50.0% reduction in absolute deviation to the truth) and cluster assignment (on average 28.9% improvement, and up to 34.5% over the best of the four methods, measured by adjusted rand index). Moreover, SAFE-clustering is computationally efficient to accommodate large datasets, taking <10 minutes to process 28,733 cells. Availability and implementation: SAFE-clustering, including source codes and tutorial, is free available on the web at http://yunliweb.its.unc.edu/safe/.
- Downloaded 2,844 times
- Download rankings, all-time:
- Site-wide: 4,418
- In bioinformatics: 466
- Year to date:
- Site-wide: 30,774
- Since beginning of last month:
- Site-wide: 19,910
Downloads over time
Distribution of downloads per paper, site-wide
- 27 Nov 2020: The website and API now include results pulled from medRxiv as well as bioRxiv.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!