Motivation: The complexes formed when proteins bind RNAs play key roles in many biological processes, such as splicing, regulation of gene expression, translation, and viral replication. Understanding protein-RNA binding may thus provide important insights into the function and dynamics of many cellular processes. This has sparked substantial interest in exploring protein-RNA binding experimentally and in predicting it computationally. The key computational challenge is to infer, efficiently and accurately, RNA-binding models that enable prediction of novel protein-RNA interactions with additional transcripts of interest.
Results: We developed DLPRB, a new deep neural network (DNN) approach for learning protein-RNA binding preferences and predicting novel interactions. We present two different network architectures: a convolutional neural network (CNN) and a recurrent neural network (RNN). The novelty of our networks rests on two key aspects: (i) the joint analysis of RNA sequence and structure, where structure is represented as a probability vector over different RNA structural contexts; and (ii) novel features in the network architectures, such as the application of RNNs to RNA-binding prediction and the combination of hundreds of variable-length filters in the CNN. The RNA-binding models we infer from high-throughput in vitro data show substantial improvements over all previous approaches for protein-RNA binding prediction (both DNN-based and non-DNN-based). We achieve a highly significant improvement for in vitro binding prediction, and a more modest, yet statistically significant, improvement for in vivo binding prediction. When experimentally measured RNA structure is used in place of predicted structure, the improvement on in vivo data increases. By visualizing the binding specificities, we can gain novel biological insights into the mechanism of protein-RNA binding.
Availability: The source code is publicly available at https://github.com/ilanbb/dlprb.
Contact: firstname.lastname@example.org
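To make the joint sequence-and-structure input described under Results more concrete, the following is a minimal sketch in Python with NumPy. It is not taken from the DLPRB source code. The function names `encode_rna` and `scan_filters`, the choice of five structural contexts, the filter widths, and the random filter weights are all assumptions made for illustration; the abstract states only that structure is given as a probability vector over structural contexts and that the CNN combines variable-length filters.

```python
import numpy as np

NUCLEOTIDES = "ACGU"


def encode_rna(sequence, structure_probs):
    """Stack a one-hot sequence matrix on top of per-position structure probabilities.

    sequence: RNA string over A, C, G, U (length L).
    structure_probs: array of shape (K, L); column j is a probability
        vector over K structural contexts for position j (K is an assumption).
    Returns an array of shape (4 + K, L).
    """
    structure_probs = np.asarray(structure_probs, dtype=float)
    length = len(sequence)
    if structure_probs.ndim != 2 or structure_probs.shape[1] != length:
        raise ValueError("structure_probs must have one column per nucleotide")
    if not np.allclose(structure_probs.sum(axis=0), 1.0):
        raise ValueError("each column of structure_probs must sum to 1")
    one_hot = np.zeros((4, length))
    for j, base in enumerate(sequence.upper()):
        one_hot[NUCLEOTIDES.index(base), j] = 1.0
    return np.vstack([one_hot, structure_probs])


def scan_filters(encoded, filters):
    """Slide each filter along the sequence and keep its maximum response.

    filters: list of arrays of shape (channels, width); widths may differ.
    Returns one pooled score per filter.
    """
    channels, length = encoded.shape
    pooled = []
    for f in filters:
        if f.shape[0] != channels or f.shape[1] > length:
            raise ValueError("filter does not fit the encoded input")
        width = f.shape[1]
        scores = [
            np.sum(encoded[:, i:i + width] * f)
            for i in range(length - width + 1)
        ]
        pooled.append(max(scores))
    return np.array(pooled)


# Hypothetical usage with made-up data (not from the paper).
rng = np.random.default_rng(0)
seq = "GGACUUCGGUCC"
probs = rng.dirichlet(np.ones(5), size=len(seq)).T   # shape (5, 12)
x = encode_rna(seq, probs)                            # shape (9, 12)
filters = [rng.normal(size=(9, w)) for w in (3, 5, 7)]
features = scan_filters(x, filters)                   # one score per filter
```

In a trained network the filter weights would be learned and the pooled scores passed to further layers; here they are random and serve only to show how filters of different widths can share one encoded input.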