An Objective Search for Unrecognized Bias in Validated COVID-19 Prediction Models

By Hossein Estiri, Zachary Strasser, Sina Rashidian, Jeffrey Klann, Kavishwar Wagholikar, Thomas McCoy, Shawn Murphy

Posted 30 Oct 2021
medRxiv DOI: 10.1101/2021.10.28.21265629

The growing recognition of algorithmic bias has spurred discussions about fairness in artificial intelligence (AI) / machine learning (ML) algorithms. The increasing translation of predictive models into clinical practice brings an increased risk of direct harm from algorithmic bias; however, bias remains incompletely measured in many medical AI applications. Using data from over 56,000 Mass General Brigham (MGB) patients with confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, we evaluate unrecognized bias in four AI models developed during the early months of the pandemic in Boston, Massachusetts. The models predict the risks of hospital admission, ICU admission, mechanical ventilation, and death after a SARS-CoV-2 infection, based purely on patients' pre-infection longitudinal medical records. We discuss how a model can be biased against certain protected groups (i.e., perform worse) in certain tasks while at the same time being biased towards another protected group (i.e., perform better). As such, current bias evaluation studies may lack a full depiction of the variable effects of a model on its subpopulations. If the goal is positive change, the underlying roots of bias in medical AI need to be fully explored. Only a holistic evaluation, a diligent search for unrecognized bias, can provide enough information for an unbiased judgment of AI bias, one that can invigorate follow-up investigations into the underlying roots of bias and ultimately bring about change.
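The idea of a model performing worse for one subgroup and better for another can be illustrated with a per-group comparison of discrimination. The following is a minimal sketch, not the authors' evaluation pipeline: the choice of AUROC as the metric, the function names `auroc` and `auroc_gap_by_group`, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC as the probability that a random positive case is scored
    above a random negative case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auroc_gap_by_group(labels, scores, groups):
    """Return the overall AUROC and, per group, the group AUROC minus
    the overall AUROC. A negative gap means the model discriminates
    worse for that group; a positive gap means it discriminates better."""
    overall = auroc(labels, scores)
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    gaps = {}
    for g, (ys, ss) in by_group.items():
        group_auc = auroc(ys, ss)
        if group_auc is None or overall is None:
            gaps[g] = None
        else:
            gaps[g] = group_auc - overall
    return overall, gaps


# Hypothetical toy data: two groups, four patients each.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.2, 0.4, 0.6, 0.3, 0.7]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall, gaps = auroc_gap_by_group(labels, scores, groups)
print(overall, gaps)
```

In this toy example the same model has a positive gap for group A and a negative gap for group B, which is why a single aggregate metric can hide opposite effects across subpopulations. A fuller evaluation would repeat the comparison for each outcome and would likely look at calibration and error rates as well.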

Download data

  • Downloaded 293 times
  • Download rankings, all-time:
    • Site-wide: 122,527
    • In health informatics: 519
  • Year to date:
    • Site-wide: 57,497
  • Since beginning of last month:
    • Site-wide: 19,568
