Unification of miRNA and isomiR research: the mirGFF3 format and the mirtop API
Aristeidis G. Telonis,
Marc R. Friedländer,
John H. Postlethwait,
Ioannis S Vlachos,
Marc K. Halushka,
Posted 25 Dec 2018
bioRxiv DOI: 10.1101/505222 (published DOI: 10.1093/bioinformatics/btz675)
Background: MicroRNAs (miRNAs) are small RNA molecules (~22 nucleotides long) involved in post-transcriptional gene regulation. Advances in high-throughput sequencing technologies led to the discovery of isomiRs, which are miRNA sequence variants. Although many miRNA-seq analysis tools exist, there is no consensus on how miRNA/isomiR analyses should be performed, and the resulting diversity of output formats hinders accurate comparisons between tools and precludes data sharing and the development of common downstream analysis methods. Findings: To overcome this situation, we present a community-based project, miRTOP (miRNA Transcriptomic Open Project), working towards the optimization of miRNA analyses. The aim of miRTOP is to promote the development of downstream analysis tools that are compatible with any existing detection and quantification tool. Based on the existing GFF3 format, we first created a new standard format, mirGFF3, for the output of miRNA/isomiR detection and quantification results from small RNA-seq data. Additionally, we developed a command-line Python tool, mirtop, to manage the mirGFF3 format. Currently, mirtop can convert into mirGFF3 the outputs of commonly used pipelines, such as seqbuster, miRge2.0, isomiR-SEA, sRNAbench, and Prost!, as well as BAM files. Its open architecture enables any tool or pipeline to output results in mirGFF3. Conclusions: Collectively, a comprehensive isomiR categorization system, along with the accompanying mirGFF3 format and mirtop API, provides a complete solution for the standardization of miRNA and isomiR analysis, enabling data sharing, reporting, comparative analyses, and benchmarking, while promoting the development of common miRNA methods focused on the steps downstream of miRNA detection, annotation, and quantification.
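Because mirGFF3 is described as building on GFF3, a downstream tool could in principle read it with an ordinary GFF3 line parser. The following is a minimal sketch of that idea, not the mirtop API: it relies only on the generic GFF3 layout (nine tab-separated columns, with semicolon-separated key=value attributes in the ninth). The function name `parse_gff3_line`, the sample record, and the attribute keys in it are assumptions made for illustration and are not taken from the abstract.

```python
# Generic GFF3 line parser (illustrative sketch; not part of mirtop).
# GFF3 defines nine tab-separated columns; the ninth holds key=value attributes.
GFF3_COLUMNS = ["seqid", "source", "type", "start", "end", "score", "strand", "phase"]


def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict with a nested attributes dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))

    record = dict(zip(GFF3_COLUMNS, fields[:8]))
    # Coordinates are 1-based integers in GFF3.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])

    attributes = {}
    for pair in fields[8].split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        attributes[key.strip()] = value.strip()
    record["attributes"] = attributes
    return record


# Hypothetical record: every value and attribute key below is made up
# for illustration and is not real tool output.
example = "\t".join([
    "hsa-let-7a-1", "example_db", "isomiR", "6", "27", ".", "+", ".",
    "Name=hsa-let-7a-5p; Variant=iso_3p:+1; Expression=120",
])

record = parse_gff3_line(example)
print(record["type"], record["start"], record["end"])
print(record["attributes"])
```

Keeping the attributes in a plain dictionary means a parser of this kind does not need to know in advance which keys a given producer writes, which is the sort of tool-independent downstream handling the abstract argues for.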