Scaling read aligners to hundreds of threads on general-purpose processors

By Ben Langmead, Christopher Wilks, Valentin Antonescu, Rone Charles

Posted 24 Oct 2017
bioRxiv DOI: 10.1101/205328 (published DOI: 10.1093/bioinformatics/bty648)

General-purpose processors can now contain many dozens of processor cores and support hundreds of simultaneous threads of execution. To make best use of these threads, genomics software must contend with new and subtle computer architecture issues. We discuss some of these and propose methods for improving thread scaling in tools that analyze each read independently, such as read aligners. We implement these methods in new versions of Bowtie, Bowtie 2 and HISAT. We greatly improve thread scaling in many scenarios, including on the recent Intel Xeon Phi architecture. We also highlight how bottlenecks are exacerbated by variable-record-length file formats like FASTQ and suggest changes that enable superior scaling.

Download data

  • Downloaded 2,091 times
  • Download rankings, all-time:
    • Site-wide: 3,371 out of 88,857
    • In bioinformatics: 624 out of 8,400
  • Year to date:
    • Site-wide: 30,040 out of 88,857
  • Since beginning of last month:
    • Site-wide: 19,787 out of 88,857


