Automated Contamination Detection in Single-Cell Sequencing

By Markus Lux, Barbara Hammer, Alexander Sczyrba

Posted 15 Jun 2015
bioRxiv DOI: 10.1101/020859

Novel methods for sequencing single-cell DNA offer tremendous opportunities. However, many techniques are still in their infancy, and a major obstacle is sample contamination with foreign DNA. In this contribution, we present a pipeline that allows for fast, automated detection of contaminated samples using modern machine learning methods. First, a vectorial representation of the genomic data is obtained using oligonucleotide signatures. Using non-linear subspace projections, the data is then transformed into a form suitable for automatic clustering. This makes it possible to detect whether a sample contains one genome or several, with each genome corresponding to a cluster. As clustering is an ill-posed problem, the pipeline relies on a careful choice of all involved methods and parameters. We give an overview of the problem and evaluate techniques suitable for this task.
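
The first step of the pipeline, turning a sequence into a vector via an oligonucleotide signature, can be illustrated with a minimal sketch. The function name `kmer_signature`, the default oligonucleotide length `k=4`, the normalization to relative frequencies, and the skipping of windows that contain ambiguous bases are all assumptions made for illustration; the abstract does not specify these choices, and the authors' pipeline may differ.

```python
from itertools import product


def kmer_signature(seq, k=4):
    """Return the relative k-mer frequencies of seq as a fixed-length list.

    Hypothetical helper: k, the normalization, and the handling of
    ambiguous bases are illustrative assumptions, not the paper's settings.
    """
    # Enumerate all 4**k oligonucleotides in a fixed (lexicographic) order
    # so that every sequence maps to a vector of the same dimension.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}

    counts = [0] * len(kmers)
    seq = seq.upper()
    # Slide a window of length k over the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:  # skip windows containing N or other symbols
            counts[pos] += 1

    total = sum(counts)
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]
```

Calling `kmer_signature(contig)` on each contig of a sample would yield one 256-dimensional vector per contig, which could then be passed to a projection and clustering step of the kind the abstract describes.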
