
Uncovering Medical Insights from Vast Amounts of Biomedical Data in Clinical Case Reports

By Yijiang Zhou, David A. Liem, Jessica M. Lee, Quan Cao, Brian Bleakley, J. Harry Caufield, Sanjana Murali, Wei Wang, Li Zhang, Alex Bui, Yizhou Sun, Karol E. Watson, Jiawei Han, Peipei Ping

Posted 04 Aug 2017
bioRxiv DOI: 10.1101/172460

Clinical case reports (CCRs) have a time-honored tradition of serving as an important means of sharing clinical experiences with patients who present with atypical disease phenotypes or receive new therapies. However, the large body of accumulated case reports consists of isolated, unstructured, and heterogeneous clinical data, which makes it difficult for clinicians and researchers to mine relevant information with existing indexing tools. In this investigation, to render CCRs more findable, accessible, interoperable, and reusable (FAIR) for the biomedical community, we created a resource platform comprising a test dataset of 1000 CCRs spanning 14 disease phenotypes, a standardized metadata template and metrics, and a set of computational tools that automatically retrieve relevant medical information and analyze all published PubMed clinical case reports with respect to trends in publication journals, citation impact, MeSH terms, drug use, distributions of patient demographics, and relationships with other case reports and databases. Our standardized metadata template and CCR test dataset may be valuable resources for researchers who need a high-quality dataset to train and validate machine learning algorithms, and may thereby help advance medical science and improve patient care. In the future, our analytical tools may also be applied to other large clinical data sources.

Download data

  • Downloaded 326 times
  • Download rankings, all-time:
    • Site-wide: 49,299 out of 88,882
    • In bioinformatics: 5,627 out of 8,401
  • Year to date:
    • Site-wide: 68,926 out of 88,882
  • Since beginning of last month:
    • Site-wide: 52,384 out of 88,882


