
PyRanges: efficient comparison of genomic intervals in Python

By Endre Bakken Stovner, Pål Sætrom

Posted 16 Apr 2019
bioRxiv DOI: 10.1101/609396 (published DOI: 10.1093/bioinformatics/btz615)

Summary: Complex genomic analyses often use sequences of simple set operations, such as intersection, overlap, and nearest, on genomic intervals. These operations, coupled with some custom programming, allow a wide range of analyses to be performed. To this end, we have written PyRanges, a data structure for representing and manipulating genomic intervals and their associated data in Python. Run single-threaded on binary set operations, PyRanges is, in median, 2.3 to 9.6 times faster than the popular R GenomicRanges library and is equally memory efficient; run multi-threaded on 8 cores, our library is up to 123 times faster. PyRanges is therefore ideally suited both for individual analyses and as a foundation for future genomic libraries in Python.

Availability: PyRanges is available open-source under the MIT license at https://github.com/biocore-NTNU/pyranges and documentation is available at https://biocore-NTNU.github.io/pyranges/

Contact: endrebak85@gmail.com

Supplementary information: Supplementary data are available.
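To make the three set operations named in the summary concrete, here is a minimal pure-Python sketch. It is an illustration only and does not use PyRanges or reflect its implementation: the function names, the half-open `(start, end)` tuple representation, the single-chromosome simplification, and the gap-based distance are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumption: an interval is a half-open (start, end) pair on one chromosome.
Interval = Tuple[int, int]


def overlaps(x: Interval, y: Interval) -> bool:
    """True if two half-open intervals share at least one position."""
    return x[0] < y[1] and y[0] < x[1]


def overlap(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Keep each interval of `a` that overlaps any interval of `b`."""
    return [x for x in a if any(overlaps(x, y) for y in b)]


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the shared segment of every overlapping pair from `a` and `b`."""
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            start, end = max(s1, s2), min(e1, e2)
            if start < end:
                out.append((start, end))
    return out


def gap(x: Interval, y: Interval) -> int:
    """Distance between two intervals; 0 if they overlap or touch."""
    return max(y[0] - x[1], x[0] - y[1], 0)


def nearest(a: List[Interval], b: List[Interval]) -> List[Tuple[Interval, Interval, int]]:
    """For each interval of `a`, report the closest interval of `b` and its gap."""
    result = []
    for x in a:
        closest = min(b, key=lambda y: gap(x, y))
        result.append((x, closest, gap(x, closest)))
    return result


a = [(1, 5), (10, 15), (30, 40)]
b = [(3, 8), (14, 20), (45, 60)]

print(overlap(a, b))    # [(1, 5), (10, 15)]
print(intersect(a, b))  # [(3, 5), (14, 15)]
print(nearest(a, b))    # [((1, 5), (3, 8), 0), ((10, 15), (14, 20), 0), ((30, 40), (45, 60), 5)]
```

This sketch compares every pair of intervals, so it scales quadratically; it is meant to show what the operations compute, not how a fast library would compute them.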

Download data

  • Downloaded 689 times
  • Download rankings, all-time:
    • Site-wide: 21,777 out of 94,912
    • In genomics: 2,476 out of 5,955
  • Year to date:
    • Site-wide: 18,145 out of 94,912
  • Since beginning of last month:
    • Site-wide: 20,957 out of 94,912
