
Developing and Deploying a Scalable Computing Platform to Support MOOC Education in Clinical Data Science

By David Mayer, Seth Russell, Melissa P Wilson, Michael G Kahn, Laura K Wiley

Posted 27 Aug 2020
bioRxiv DOI: 10.1101/2020.08.27.270009

One of the challenges of teaching applied data science courses is managing each student's local computing environment. This is especially challenging when teaching massive open online courses (MOOCs), where students come from across the globe and have varying levels of access to, and types of, computing systems. There are additional challenges with using sensitive health information for clinical data science education. Here we describe the development and performance of a computing platform developed to support a series of MOOCs in clinical data science. This platform was designed to restrict and log all access to health datasets while also being scalable, accessible, secure, privacy preserving, and easy to access. Over the 19 months the platform has been live, it has supported computation for more than 2,300 students from 101 countries.

Competing Interest Statement: Computing platform use and development costs were supported by our partnership with Google Cloud Healthcare. Outside of the choice to use Google Cloud Platform for hosting the platform, these resources informed, but did not dictate, the final computing platform created.

Download data

  • Downloaded 126 times
  • Download rankings, all-time:
    • Site-wide: 89,220 out of 100,745
    • In scientific communication and education: 627 out of 670
  • Year to date:
    • Site-wide: 48,265 out of 100,745
  • Since beginning of last month:
    • Site-wide: 6,125 out of 100,745

