Motivation: Existing classification-based and inference-based machine learning methods show promising results in drug-target prediction, but they have inherent limitations: 1) classification-based methods often yield biased results because they lack negative samples, and 2) inference-based methods cannot explore novel drug-target associations that involve new (or isolated) drugs or targets. As biomedical data continues to grow, there is a need for a scalable, robust, and accurate solution that can process large heterogeneous datasets and yield valuable predictions.

Results: We introduce a drug-target prediction method that improves on our previously proposed method in three respects: 1) we constructed a heterogeneous network that incorporates 12 repositories and includes 7 types of biomedical entities (20,119 entities and 194,296 associations); 2) we enhanced the feature learning step with Node2Vec, a scalable, state-of-the-art feature learning method; and 3) we integrated the originally proposed inference-based model with a classification model, which is further fine-tuned by a negative sample selection algorithm. The proposed method outperforms existing methods in drug-target association prediction, reaching an AUC ROC score of 95.3% in 10-fold cross-validation tests. We studied biased learning and testing in network-based pairwise prediction and identified the best training strategy. Finally, we conducted a disease-specific prediction task on 20 diseases. New drug-target associations were successfully predicted with an average AUC ROC of 97.2% (validated against DrugBank 5.1.0). The experiments demonstrate the reliability of the proposed method in predicting novel drug-target associations for disease treatment.
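To make the evaluation setup concrete, the following is a minimal sketch of two ingredients the abstract mentions: drawing negative drug-target pairs and scoring predictions with AUC ROC. It is not the authors' implementation. The abstract does not describe the negative sample selection algorithm, so uniform random sampling from unobserved pairs is used here purely as a stand-in assumption, and the function names `sample_negatives` and `auc_roc` are hypothetical.

```python
import random
from itertools import product


def sample_negatives(drugs, targets, positives, k, seed=0):
    """Draw k drug-target pairs that are absent from the known positive set.

    Assumption: uniform random sampling stands in for the paper's
    (unspecified) negative sample selection algorithm.
    """
    positive_set = set(positives)
    candidates = [pair for pair in product(drugs, targets)
                  if pair not in positive_set]
    if k > len(candidates):
        raise ValueError("not enough unobserved pairs to sample from")
    return random.Random(seed).sample(candidates, k)


def auc_roc(labels, scores):
    """AUC ROC computed as the probability that a randomly chosen positive
    pair receives a higher score than a randomly chosen negative pair
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a pipeline like the one described, the scores passed to `auc_roc` would come from a classifier trained on pair features (for example, embeddings learned with Node2Vec), with the known associations as positives and the sampled pairs as negatives. The pairwise definition of AUC ROC used here is quadratic in the number of pairs, which is fine for illustration but not for networks of the size reported.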