
Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 70,955 bioRxiv papers from 309,680 authors.

BITACORA: A comprehensive tool for the identification and annotation of gene families in genome assemblies

By Joel Vizueta, Alejandro Sánchez-Gracia, Julio Rozas

Posted 19 Sep 2019
bioRxiv DOI: 10.1101/593889

Gene annotation is a critical bottleneck in genomic research, especially for the comprehensive study of very large gene families in the genomes of non-model organisms. Despite recent progress in automatic methods, the tools developed for this task often produce inaccurate annotations, such as fused, chimeric, partial or even completely absent gene models for many family copies, which require considerable extra effort to correct. Here we present BITACORA, a bioinformatics solution that integrates sequence similarity search tools and Perl scripts to facilitate both the curation of these inaccurate annotations and the identification of previously undetected gene family copies directly from DNA sequences. We tested the performance of the BITACORA pipeline in annotating the members of two chemosensory gene families of different sizes in seven available chelicerate genome drafts. Despite the relatively high fragmentation of some of these drafts, BITACORA was able to improve the annotation of many members of these families and detected thousands of new chemoreceptors encoded in genome sequences. The program generates output in the general feature format (GFF), with both curated and novel gene models, and a FASTA file with the predicted proteins. These outputs can be easily integrated into genomic annotation editors, greatly facilitating subsequent manual annotation and downstream evolutionary analyses.
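
For readers who want to inspect outputs of this kind, the following is a minimal Python sketch of how GFF feature lines and FASTA protein records could be read and summarized. It is not part of BITACORA and does not reflect its actual file names or attribute conventions; the function names and the example records are hypothetical and invented purely for illustration. It assumes only the standard nine tab-separated GFF columns and ordinary FASTA headers.

```python
from collections import Counter

# The nine standard tab-separated GFF columns.
GFF_COLUMNS = ("seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes")


def parse_gff_line(line):
    """Return one GFF feature line as a dict, or None for comments,
    blank lines, and lines without exactly nine columns."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != len(GFF_COLUMNS):
        return None
    record = dict(zip(GFF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    return record


def count_feature_types(lines):
    """Count how many features of each type (gene, CDS, ...) appear."""
    counts = Counter()
    for line in lines:
        record = parse_gff_line(line)
        if record is not None:
            counts[record["type"]] += 1
    return counts


def read_fasta(lines):
    """Map each FASTA record name (first word of the header) to its sequence."""
    sequences = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            sequences[name] = ""
        elif name is not None:
            sequences[name] += line
    return sequences


# Hypothetical records, invented for this example (not real BITACORA output).
example_gff = [
    "##gff-version 3",
    "scaffold_1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "scaffold_1\texample\tCDS\t100\t900\t.\t+\t0\tID=cds1;Parent=gene1",
]
example_fasta = [">gene1 hypothetical protein", "MKT", "LLV"]

feature_counts = count_feature_types(example_gff)
proteins = read_fasta(example_fasta)
```

In practice the lists of strings would be replaced by open file handles, since both functions only iterate over lines.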

Download data

  • Downloaded 209 times
  • Download rankings, all-time:
    • Site-wide: 50,758 out of 70,955
    • In bioinformatics: 5,603 out of 6,942
  • Year to date:
    • Site-wide: 15,964 out of 70,955
  • Since beginning of last month:
    • Site-wide: 22,199 out of 70,955
