Emergency department admissions during COVID-19: explainable machine learning to characterise data drift and detect emergent health risks

By Christopher Duckworth, Francis P Chmiel, Dan K Burns, Zlatko D Zlatev, Neil M White, Michael Kiuber, Thomas W V Daniels, Michael J Boniface

Posted 29 May 2021
medRxiv DOI: 10.1101/2021.05.27.21257713

Supervised machine learning algorithms deployed in acute healthcare settings use data describing historical episodes to predict clinical outcomes. Clinical settings are dynamic environments: the underlying data distributions characterising episodes can change with time (a phenomenon known as data drift), and so can the relationship between episode characteristics and associated clinical outcomes (so-called concept drift). We demonstrate how explainable machine learning can be used to monitor data drift in a predictive model deployed within a hospital emergency department. We use the COVID-19 pandemic, which brought a severe change in operational circumstances, as an exemplar cause of data drift. We present a machine learning classifier, trained using pre-COVID-19 data, that identifies patients at high risk of admission to hospital during an emergency department attendance. We evaluate our model's performance on attendances occurring pre-pandemic (AUROC 0.856, 95% CI [0.852, 0.859]) and during the COVID-19 pandemic (AUROC 0.826, 95% CI [0.814, 0.837]). We demonstrate two benefits of explainable machine learning (SHAP) for models deployed in healthcare settings: (1) by tracking the variation in a feature's SHAP value relative to its global importance, a complementary measure of data drift is found, which highlights the need to retrain a predictive model; (2) by observing the relative changes in feature importance, emergent health risks can be identified.
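
The idea of tracking a feature's SHAP value relative to its global importance can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the classifier or the exact drift statistic, so the sketch assumes a linear model (for which SHAP values under feature independence are simply `w_i * (x_i - E[x_i])`), uses mean absolute SHAP as the global importance, and runs on synthetic data. The weights, the function names, and the simulated shift in the second feature are all hypothetical.

```python
import random
from statistics import mean


def linear_shap(weights, background_means, x):
    """SHAP values for a linear model with independent features:
    phi_i = w_i * (x_i - E[x_i]), with E[x_i] taken from the reference data."""
    return [w * (xi - mu) for w, xi, mu in zip(weights, x, background_means)]


def mean_abs_shap(weights, background_means, rows):
    """Mean |SHAP| per feature over a set of rows (a common global-importance summary)."""
    shap_rows = [linear_shap(weights, background_means, r) for r in rows]
    return [mean(abs(s[j]) for s in shap_rows) for j in range(len(weights))]


def drift_ratios(weights, reference_rows, current_rows):
    """Ratio of current-period importance to reference-period importance, per feature.
    A ratio far from 1 suggests the feature's contribution has drifted."""
    background_means = [mean(col) for col in zip(*reference_rows)]
    ref_imp = mean_abs_shap(weights, background_means, reference_rows)
    cur_imp = mean_abs_shap(weights, background_means, current_rows)
    return [c / r if r > 0 else float("inf") for c, r in zip(cur_imp, ref_imp)]


rng = random.Random(0)
weights = [0.8, -0.5]  # hypothetical model coefficients

# Synthetic "pre-pandemic" reference period: both features centred on 0.
reference = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(2000)]
# Synthetic "pandemic" period: the second feature's distribution has shifted.
current = [[rng.gauss(0, 1), rng.gauss(1.5, 1)] for _ in range(2000)]

ratios = drift_ratios(weights, reference, current)
print(ratios)
```

In this toy setting the first feature's ratio stays close to 1 while the shifted feature's ratio rises well above 1, which is the kind of signal that could prompt retraining. With a tree-based or other non-linear model, the per-row SHAP values would come from an explainer library rather than the closed form used here.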

Download data

  • Downloaded 236 times
  • Download rankings, all-time:
    • Site-wide: 113,057
    • In emergency medicine: 128
  • Year to date:
    • Site-wide: 24,581
  • Since beginning of last month:
    • Site-wide: 18,431
