Modeling physician variability to prioritize relevant medical record information
Gregory F Cooper,
Andrew J King,
Dean F Sittig,
Posted 20 Sep 2020
medRxiv DOI: 10.1101/2020.09.18.20197434
Objective: Patient information can be retrieved more efficiently in electronic medical record (EMR) systems by using machine learning models that predict which information a physician will seek in a clinical context. However, information-seeking behavior varies across EMR users. To explicitly account for this variability, we derived hierarchical models and compared their performance to non-hierarchical models in identifying relevant patient information in intensive care unit (ICU) cases.

Materials and Methods: Critical care physicians reviewed ICU patient cases and selected data items relevant for presenting at morning rounds. Using patient EMR data as predictors, we derived hierarchical logistic regression (HLR) and standard logistic regression (LR) models to predict their relevance.

Results: In 73 pairs of HLR and LR models, the HLR models achieved an area under the ROC curve of 0.81, 95% CI [0.80, 0.82], which was statistically significantly higher than that of LR models (0.75, 95% CI [0.74, 0.76]). Further, the HLR models achieved statistically significantly lower expected calibration error (0.07, 95% CI [0.06, 0.08]) than LR models (0.16, 95% CI [0.14, 0.17]).

Discussion: The physician reviewers demonstrated variability in selecting relevant data. Our results show that HLR models perform significantly better than LR models with respect to both discrimination and calibration. This is likely due to explicitly modeling physician-related variability.

Conclusion: Hierarchical models can yield better performance when there is physician-related variability as in the case of identifying relevant information in the EMR.
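The abstract compares models on two metrics: area under the ROC curve (discrimination) and expected calibration error (calibration). The sketch below shows one common way to compute each from binary relevance labels and predicted probabilities. It is an illustration only: the function names are hypothetical, and the choice of 10 equal-width probability bins for the calibration error is an assumption, since the abstract does not say how the authors binned predictions.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive item is scored
    higher than a randomly chosen negative item (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def expected_calibration_error(
    labels: Sequence[int], probs: Sequence[float], n_bins: int = 10
) -> float:
    """Expected calibration error with equal-width bins (assumed binning).

    For each bin, take the gap between the observed positive rate and the
    mean predicted probability, then average the gaps weighted by bin size.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    total = len(labels)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += (len(members) / total) * abs(observed - predicted)
    return ece


# Toy data, not from the study: 1 = item selected as relevant, 0 = not selected.
toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_probs)
toy_ece = expected_calibration_error(toy_labels, toy_probs)
```

A higher AUROC means relevant items are ranked above irrelevant ones more often, while a lower calibration error means the predicted probabilities track the observed selection rates more closely.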
- Downloaded 228 times
- Download rankings, all-time:
- Site-wide: 125,538
- In health informatics: 520
- Year to date:
- Site-wide: 97,569
- Since beginning of last month:
- Site-wide: 94,850