

BITACORA: A comprehensive tool for the identification and annotation of gene families in genome assemblies

By Joel Vizueta, Alejandro Sanchez-Gracia, Julio Rozas

Posted 19 Sep 2019
bioRxiv DOI: 10.1101/593889

Gene annotation is a critical bottleneck in genomic research, especially for the comprehensive study of very large gene families in the genomes of non-model organisms. Despite recent progress in automatic methods, the tools developed for this task often produce inaccurate annotations, such as fused, chimeric, partial or even completely absent gene models for many family copies, which require considerable extra effort to correct. Here we present BITACORA, a bioinformatics solution that integrates sequence similarity search tools and Perl scripts to facilitate both the curation of these inaccurate annotations and the identification of previously undetected gene family copies directly from DNA sequences. We tested the performance of the BITACORA pipeline in annotating the members of two chemosensory gene families of different sizes in seven available chelicerate genome drafts. Despite the relatively high fragmentation of some of these drafts, BITACORA improved the annotation of many members of these families and detected thousands of new chemoreceptors encoded in the genome sequences. The program generates general feature format (GFF) files, with both curated and novel gene models, and a FASTA file with the predicted proteins. These outputs can be easily integrated into genomic annotation editors, greatly facilitating subsequent manual annotation and downstream evolutionary analyses.
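
As a rough illustration of how GFF and FASTA outputs of the kind described above might be inspected before loading them into an annotation editor, the following Python sketch counts feature types in a GFF excerpt and reads protein sequences from FASTA text. It is not part of BITACORA: the sample records, sequence names, coordinates and the helper functions `count_feature_types` and `read_fasta` are all hypothetical, and the only assumption about the files is the standard nine-column, tab-separated GFF layout and the usual `>`-prefixed FASTA headers.

```python
from collections import Counter

# Hypothetical GFF3 excerpt; IDs, source name and coordinates are made up.
SAMPLE_GFF = """##gff-version 3
scaffold1\tillustrative\tgene\t100\t900\t.\t+\t.\tID=gene1
scaffold1\tillustrative\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1
scaffold1\tillustrative\tCDS\t100\t900\t.\t+\t0\tID=cds1;Parent=mrna1
scaffold2\tillustrative\tgene\t50\t400\t.\t-\t.\tID=gene2
"""

# Hypothetical protein FASTA; sequences are placeholders, not real proteins.
SAMPLE_FASTA = """>gene1
MKTLLVA
LLG
>gene2
MSSP
"""


def count_feature_types(gff_text):
    """Count how many records of each feature type (column 3) appear."""
    counts = Counter()
    for line in gff_text.splitlines():
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A well-formed GFF record has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts


def read_fasta(fasta_text):
    """Return a dict mapping each sequence name to its joined sequence."""
    records = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


if __name__ == "__main__":
    print(count_feature_types(SAMPLE_GFF))
    print(read_fasta(SAMPLE_FASTA))
```

A quick check like this can confirm that the number of gene records matches the number of predicted proteins before the files are passed on for manual curation.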

Download data

  • Downloaded 107 times
  • Download rankings, all-time:
    • Site-wide: 56,435 out of 63,112
    • In bioinformatics: 5,877 out of 6,276
  • Year to date:
    • Site-wide: 38,610 out of 63,112
  • Since beginning of last month:
    • Site-wide: 3,336 out of 63,112

