Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 60,239 bioRxiv papers from 267,831 authors.

Haplotype-aware graph indexes

By Jouni Sirén, Erik Garrison, Adam M Novak, Benedict Paten, Richard Durbin

Posted 24 Feb 2019
bioRxiv DOI: 10.1101/559583 (published DOI: 10.1093/bioinformatics/btz575)

Motivation: The variation graph toolkit (VG) represents genetic variation as a graph. Although each path in the graph is a potential haplotype, most paths are nonbiological, unlikely recombinations of true haplotypes.

Results: We augment the VG model with haplotype information to identify which paths are more likely to exist in nature. For this purpose, we develop a scalable implementation of the graph extension of the positional Burrows-Wheeler transform (GBWT). We demonstrate the scalability of the new implementation by building a whole-genome index of the 5,008 haplotypes of the 1000 Genomes Project, and an index of all 108,070 TOPMed Freeze 5 chromosome 17 haplotypes. We also develop an algorithm for simplifying variation graphs for k-mer indexing without losing any k-mers in the haplotypes.
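For readers unfamiliar with the positional Burrows-Wheeler transform that the GBWT extends, the following is a minimal sketch of the original, non-graph PBWT over aligned biallelic haplotypes. It is not the paper's GBWT implementation, and the function names (`positional_prefix_arrays`, `pbwt_columns`) and the toy input are assumptions made for illustration. The idea is that, at each site, haplotypes are kept sorted by their reversed prefixes, so haplotypes that share a long match ending at that site sit next to each other.

```python
from typing import List, Sequence


def positional_prefix_arrays(haplotypes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return the haplotype order before each site, plus the final order.

    arrays[k] lists haplotype indices sorted by their reversed prefix
    h[k-1], h[k-2], ..., h[0]. Alleles are assumed to be 0 or 1
    (illustrative assumption; the GBWT in the paper works on graph paths).
    """
    if not haplotypes:
        return [[]]
    num_sites = len(haplotypes[0])
    if any(len(h) != num_sites for h in haplotypes):
        raise ValueError("all haplotypes must have the same number of sites")

    order = list(range(len(haplotypes)))
    arrays = [order]
    for site in range(num_sites):
        # A stable partition by the allele at this site keeps the previous
        # order within each group, which is exactly a sort by reversed prefix.
        zeros = [h for h in order if haplotypes[h][site] == 0]
        ones = [h for h in order if haplotypes[h][site] != 0]
        order = zeros + ones
        arrays.append(order)
    return arrays


def pbwt_columns(haplotypes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return the PBWT: for each site, the alleles listed in prefix-sorted order."""
    if not haplotypes:
        return []
    arrays = positional_prefix_arrays(haplotypes)
    num_sites = len(haplotypes[0])
    return [[haplotypes[h][site] for h in arrays[site]] for site in range(num_sites)]


if __name__ == "__main__":
    # Hypothetical toy panel: four haplotypes over three biallelic sites.
    panel = [
        [0, 1, 0],
        [1, 1, 0],
        [0, 0, 1],
        [0, 1, 0],
    ]
    print(positional_prefix_arrays(panel)[-1])  # [0, 3, 1, 2]
    print(pbwt_columns(panel))  # [[0, 1, 0, 0], [1, 0, 1, 1], [1, 0, 0, 0]]
```

In the toy panel, haplotypes 0 and 3 are identical, so they end up adjacent in the final order. That adjacency is what lets a PBWT-style index store and query large haplotype collections compactly.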

Download data

  • Downloaded 518 times
  • Download rankings, all-time:
    • Site-wide: 17,673 out of 60,239
    • In bioinformatics: 2,686 out of 6,078
  • Year to date:
    • Site-wide: 4,067 out of 60,239
  • Since beginning of last month:
    • Site-wide: 15,148 out of 60,239


