Knowledge synthesis from 100 million biomedical documents augments the deep expression profiling of coronavirus receptors
Dariusz K Murakowski,
Posted 29 Mar 2020
bioRxiv DOI: 10.1101/2020.03.24.005702 (published DOI: 10.7554/eLife.58040)
Posted 29 Mar 2020
The COVID-19 pandemic demands assimilation of all available biomedical knowledge to decode its mechanisms of pathogenicity and transmission. Despite the recent renaissance in unsupervised neural networks for decoding unstructured natural languages, a platform for the real-time synthesis of the exponentially growing biomedical literature and its comprehensive triangulation with deep omic insights is not available. Here, we present the nferX platform for dynamic inference from over 45 quadrillion possible conceptual associations extracted from unstructured biomedical text, and their triangulation with Single Cell RNA-sequencing based insights from over 25 tissues (https://academia.nferx.com/). Using this platform, we identify intersections between the pathologic manifestations of COVID-19 and the comprehensive expression profile of the SARS-CoV-2 receptor ACE2. We find that tongue keratinocytes, airway club cells, and ciliated cells are likely under-appreciated targets of SARS-CoV-2 infection, in addition to type II pneumocytes and olfactory epithelial cells. We further identify mature small intestinal enterocytes as a possible hotspot of COVID-19 fecal-oral transmission, where an intriguing maturation-correlated transcriptional signature is shared between ACE2 and the other coronavirus receptors DPP4 (MERS-CoV) and ANPEP (α-coronavirus). This study demonstrates how a holistic data science platform can leverage unprecedented quantities of structured and unstructured publicly available data to accelerate the generation of impactful biological insights and hypotheses.
- Downloaded 1,761 times
- Download rankings, all-time:
- Site-wide: 10,706
- In genomics: 1,099
- Year to date:
- Site-wide: 62,452
- Since beginning of last month:
- Site-wide: 26,120
Downloads over time
Distribution of downloads per paper, site-wide
- 27 Nov 2020: The website and API now include results pulled from medRxiv as well as bioRxiv.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!