
Automated Contamination Detection in Single-Cell Sequencing

By Markus Lux, Barbara Hammer, Alexander Sczyrba

Posted 15 Jun 2015
bioRxiv DOI: 10.1101/020859

Novel methods for sequencing single-cell DNA offer tremendous opportunities. However, many techniques are still in their infancy, and contamination of samples with foreign DNA remains a major obstacle. In this contribution, we present a pipeline for fast, automated detection of contaminated samples using modern machine learning methods. First, a vectorial representation of the genomic data is obtained using oligonucleotide signatures. Non-linear subspace projections then transform the data into a form suitable for automatic clustering. This makes it possible to detect whether a sample contains one genome or several, each genome corresponding to one cluster. Because clustering is an ill-posed problem, the pipeline depends on a careful choice of all methods and parameters involved. We give an overview of the problem and evaluate techniques suitable for this task.
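The first step of the pipeline, turning sequence data into vectors via oligonucleotide signatures, can be illustrated with a minimal sketch. The abstract does not state the oligonucleotide length, the window size, or any normalization, so the choices below (k = 4, fixed-size windows, relative frequencies, no merging of reverse complements) and the function names `kmer_signature` and `window_signatures` are assumptions made for illustration, not the authors' implementation.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_signature(sequence, k=4):
    """Return relative k-mer frequencies as a list of length 4**k.

    k=4 (tetranucleotides) is an assumed default, not taken from the paper.
    """
    # Enumerate all k-mers in a fixed lexicographic order (AAAA, AAAC, ...).
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)

    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows containing N or other ambiguous bases
            counts[j] += 1

    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def window_signatures(sequence, window=1000, step=1000, k=4):
    """Compute one signature per window along a contig.

    Window and step sizes are hypothetical; a sequence shorter than
    `window` yields a single signature over the whole sequence.
    """
    last_start = max(len(sequence) - window, 0)
    return [
        kmer_signature(sequence[start:start + window], k)
        for start in range(0, last_start + 1, step)
    ]
```

Each window then becomes one point in a 4^k-dimensional space. Under the assumptions above, points drawn from a single genome would be expected to lie close together, while a contaminated sample would tend to produce more than one group after projection and clustering.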

