Early prediction of high risk gestational diabetes mellitus via machine learning models.
Aims: Gestational diabetes mellitus (GDM) is a pregnancy-specific disorder that is usually diagnosed after 24 gestational weeks. So far, there is no accurate method to predict GDM in early pregnancy.

Methods: We collected data from the hospital's electronic medical record system, comprising 73 features recorded in the first trimester. We also recorded the occurrence of GDM, diagnosed at 24-28 weeks of pregnancy. We applied a feature selection method to select a panel of the most discriminative features. We then developed machine learning models based on these features, using a Deep Neural Network (DNN), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Logistic Regression (LR).

Results: We studied 16,819 women (2,696 with GDM) in the training group and 14,992 women (1,837 with GDM) in the validation group. The DNN, SVM, KNN, and LR models based on the 73-feature set demonstrated the best discriminative power, with corresponding area under the curve (AUC) values of 0.92 (95% CI 0.91, 0.93), 0.82 (95% CI 0.81, 0.83), 0.63 (95% CI 0.62, 0.64), and 0.85 (95% CI 0.84, 0.85), respectively. The 7-feature DNN, SVM, KNN, and LR models (features selected from the 73-feature set) had the best discriminative power, with corresponding AUCs of 0.84 (95% CI 0.83, 0.84), 0.69 (95% CI 0.68, 0.70), 0.68 (95% CI 0.67, 0.69), and 0.84 (95% CI 0.83, 0.85), respectively. The 7-feature LR model had the best Hosmer-Lemeshow test outcome. Notably, the AUCs of the existing prediction models did not exceed 0.75.

Conclusions: Our feature selection and machine learning models showed greater predictive power for early GDM detection than previous methods; these improved models will better serve clinical practice in preventing GDM.

Research in Context

Evidence before this study
- A delayed diagnosis of GDM in the third trimester comes too late to prevent exposure of the embryo or fetus to an intrauterine hyperglycemic environment during early pregnancy.
- Prediction models for gestational diabetes are not uncommon in the literature, but they rarely include laboratory indicators among their predictors.
- The growing use of AI in medicine motivated us to introduce it into GDM prediction models.

What is the key question?
Can a GDM prediction model built with machine learning surpass the traditional LR model?

Added value of this study
- Using machine learning to select features is an effective method.
- The DNN prediction model has effective discriminative power for predicting GDM in early pregnancy, but it cannot completely replace LR. KNN and SVM performed worse than LR in this study.

Implications of all the available evidence
The main significance of our research is not only that it builds a prediction model that surpasses previous ones, but also that it demonstrates the advantages and disadvantages of different machine learning methods through a practical case.
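
The abstract reports AUCs with 95% confidence intervals and a Hosmer-Lemeshow calibration result, but it does not say how these were computed. The following is a minimal sketch of one common way to obtain such numbers, not the authors' implementation. It assumes a percentile bootstrap for the interval and equal-size risk groups for the Hosmer-Lemeshow statistic; the function names `auc`, `bootstrap_auc_ci`, and `hosmer_lemeshow` are hypothetical, and the data are synthetic.

```python
import random


def auc(y_true, y_score):
    """AUC via the Mann-Whitney U statistic; tied scores share an average rank."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow chi-square statistic over equal-size risk groups."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        denom = expected * (1 - expected / len(chunk))
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


if __name__ == "__main__":
    # Synthetic cohort: each outcome is drawn from its own predicted risk.
    rng = random.Random(42)
    probs = [rng.random() for _ in range(500)]
    labels = [1 if rng.random() < p else 0 for p in probs]
    low, high = bootstrap_auc_ci(labels, probs, n_boot=200)
    print(f"AUC = {auc(labels, probs):.2f} (95% CI {low:.2f}, {high:.2f})")
    print(f"Hosmer-Lemeshow statistic = {hosmer_lemeshow(labels, probs):.2f}")
```

A smaller Hosmer-Lemeshow statistic means the predicted risks track the observed event rates more closely. This is how a model with a modest AUC, such as the 7-feature LR model, can still be the best calibrated.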