Automated Contamination Detection in Single-Cell Sequencing

By Markus Lux, Barbara Hammer, Alexander Sczyrba

Posted 15 Jun 2015
bioRxiv DOI: 10.1101/020859

Novel methods for sequencing single-cell DNA offer tremendous opportunities. However, many techniques are still in their infancy, and sample contamination with foreign DNA remains a major obstacle. In this contribution, we present a pipeline for fast, automated detection of contaminated samples using modern machine learning methods. First, a vectorial representation of the genomic data is obtained using oligonucleotide signatures. Non-linear subspace projections then transform the data into a form suitable for automatic clustering, which makes it possible to distinguish samples containing one genome (cluster) from those containing several. Because clustering is an ill-posed problem, the pipeline relies on a careful choice of all involved methods and parameters. We give an overview of the problem and evaluate techniques suitable for this task.
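
To illustrate the first step of the pipeline, the sketch below turns a DNA sequence into oligonucleotide (k-mer) frequency vectors, one per sliding window. The abstract does not give the k-mer length, window size, normalization, or treatment of ambiguous bases, so k=4, the window and step values, and the function names `kmer_signature` and `window_signatures` are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_signature(seq, k=4):
    """Relative frequency of every k-mer over ACGT, in lexicographic order.

    Assumption: k-mers containing N or other ambiguity codes are skipped.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        # No valid k-mer found: return an all-zero vector of the right length.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def window_signatures(seq, window=1000, step=500, k=4):
    """One signature per sliding window; each window becomes a point in 4**k dimensions.

    Assumption: window and step sizes are illustrative defaults.
    A sequence shorter than the window yields a single signature.
    """
    last_start = max(len(seq) - window, 0)
    return [kmer_signature(seq[s:s + window], k)
            for s in range(0, last_start + 1, step)]
```

Each resulting vector has 4**k entries that sum to 1 whenever the window contains at least one valid k-mer. In a pipeline like the one described, these vectors would be the input to the projection and clustering steps, which are not shown here.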

Download data

  • Downloaded 644 times
  • Download rankings, all-time:
    • Site-wide: 12,630 out of 57,910
    • In bioinformatics: 2,061 out of 5,899
  • Year to date:
    • Site-wide: 20,550 out of 57,910
  • Since beginning of last month:
    • Site-wide: 16,760 out of 57,910
