
Automated literature mining and hypothesis generation through a network of Medical Subject Headings

By Stephen Joseph Wilson, Angela Dawn Wilkins, Matthew V. Holt, Byung Kwon Choi, Daniel Konecki, Chih-Hsu Lin, Amanda Koire, Yue Chen, Seon-Young Kim, Yi Wang, Brigitta Dewi Wastuwidyaningtyas, Jun Qin, Lawrence Allen Donehower, Olivier Lichtarge

Posted 29 Aug 2018
bioRxiv DOI: 10.1101/403667

The scientific literature is vast, growing, and increasingly specialized, making it difficult to connect disparate observations across subfields. To address this problem, we sought to develop automated hypothesis generation by networking, at scale, the MeSH terms curated by the National Library of Medicine. The result is a MeSH Term Objective Reasoning (MeTeOR) approach that tallies associations among genes, drugs and diseases from PubMed and predicts new ones. Comparisons to reference databases and algorithms show MeTeOR tends to be more reliable. We also show that many predictions based on the literature prior to 2014 were published subsequently. In a practical application, we experimentally validated a surprising new association found by MeTeOR between Epidermal Growth Factor Receptor (EGFR) and CDK2. We conclude that MeTeOR generates useful hypotheses from the literature (http://meteor.lichtargelab.org/).
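As a rough illustration of the idea in the abstract above (tallying term associations across articles, then proposing new ones), here is a minimal Python sketch. It is not MeTeOR's actual algorithm, which the abstract does not specify: the function names `tally_cooccurrence` and `predict_links` are hypothetical, the article annotations are invented toy data, and the prediction step uses a simple shared-neighbor heuristic chosen only for brevity.

```python
from collections import Counter
from itertools import combinations


def tally_cooccurrence(articles):
    """Count how many articles mention each unordered pair of terms."""
    counts = Counter()
    for terms in articles:
        # Sort the de-duplicated terms so each pair has one canonical order.
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


def predict_links(counts):
    """Score term pairs that never co-occur by their number of shared neighbors.

    This shared-neighbor heuristic is an assumption made for illustration,
    not the prediction method used by MeTeOR.
    """
    neighbors = {}
    for a, b in counts:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    scores = {}
    for a, b in combinations(sorted(neighbors), 2):
        if (a, b) not in counts:
            shared = len(neighbors[a] & neighbors[b])
            if shared:
                scores[(a, b)] = shared
    return scores


# Invented toy annotations (NOT real PubMed data): one term list per article.
articles = [
    ["EGFR", "Lung Neoplasms", "Gefitinib"],
    ["EGFR", "Lung Neoplasms"],
    ["CDK2", "Lung Neoplasms"],
    ["CDK2", "Gefitinib"],
]

counts = tally_cooccurrence(articles)
predictions = predict_links(counts)
print(counts[("EGFR", "Lung Neoplasms")])  # articles mentioning both terms
print(predictions)                         # candidate associations with scores
```

In this toy network the only pair that never co-occurs is the one proposed as a candidate, because both terms share neighbors. A real system would need to weight associations and handle far larger, noisier term sets.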

Download data

  • Downloaded 821 times
  • Download rankings, all-time:
    • Site-wide: 34,928
    • In bioinformatics: 3,799
  • Year to date:
    • Site-wide: 59,123
  • Since beginning of last month:
    • Site-wide: 73,325
