Nationwide prediction of type 2 diabetes comorbidities

By Piotr Dworzynski, Martin Aasbrenn, Klaus Rostgaard, Mads Melbye, Thomas Alexander Gerds, Henrik Hjalgrim, Tune H Pers

Posted 14 Jun 2019
bioRxiv DOI: 10.1101/664722 (published DOI: 10.1038/s41598-020-58601-7)

Identifying individuals at risk of developing disease comorbidities is an important task in tackling the growing personal and societal burdens associated with chronic diseases. We employed machine learning techniques to investigate to what extent data from longitudinal, nationwide Danish health registers can be used to identify individuals at high risk of developing type 2 diabetes (T2D) comorbidities. Based on register data spanning hospitalizations, drug prescriptions and contacts with primary health contractors from more than 200,000 individuals newly diagnosed with T2D, we used logistic regression, random forest and gradient boosting models to predict five-year risk of heart failure (HF), myocardial infarction (MI), stroke (ST), cardiovascular disease (CVD) and chronic kidney disease (CKD). For HF, MI, CVD and CKD, register-based models outperformed a reference model based on canonical individual characteristics, improving the area under the receiver operating characteristic curve (AUROC) by 0.06, 0.03, 0.06 and 0.07, respectively. The 1,000 patients predicted to be at highest risk exhibited observed incidence ratios exceeding 4.99, 3.52, 2.92 and 4.71, respectively. In summary, prediction of T2D comorbidities using Danish registers led to consistent, albeit modest, performance improvements over reference models, suggesting that register data could be leveraged to systematically identify individuals at risk of developing disease comorbidities.
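
The two evaluation quantities named in the abstract, AUROC improvement over a reference model and the incidence ratio among the highest-risk patients, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `auroc` and `top_k_incidence_ratio` are hypothetical, the toy labels and scores are invented for illustration, and the incidence ratio is assumed here to mean the observed incidence in the top-k predicted group divided by the incidence in the whole cohort, which the abstract does not spell out.

```python
def auroc(y_true, scores):
    """AUROC via the rank-sum (Mann-Whitney) formulation; tied scores get average ranks."""
    pairs = sorted(zip(scores, y_true))
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def top_k_incidence_ratio(y_true, scores, k):
    """Incidence among the k highest-scored individuals divided by overall incidence.

    Assumed definition; the abstract reports the ratio without defining it.
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the cohort size")
    overall = sum(y_true) / len(y_true)
    if overall == 0:
        raise ValueError("overall incidence is zero")
    order = sorted(range(len(scores)), key=lambda idx: scores[idx], reverse=True)
    top_incidence = sum(y_true[idx] for idx in order[:k]) / k
    return top_incidence / overall


# Toy data, invented for illustration: 1 = developed the comorbidity within five years.
outcome = [0, 0, 1, 1]
reference_scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical reference-model risks
register_scores = [0.1, 0.2, 0.35, 0.8]   # hypothetical register-model risks

improvement = auroc(outcome, register_scores) - auroc(outcome, reference_scores)
print("AUROC improvement:", improvement)
print("Top-1 incidence ratio:", top_k_incidence_ratio(outcome, register_scores, 1))
```

In the study's setting, `k` would be 1,000 and the scores would come from the fitted logistic regression, random forest or gradient boosting models.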

Download data

  • Downloaded 338 times
  • Download rankings, all-time:
    • Site-wide: 86,819
    • In bioinformatics: 7,789
  • Year to date:
    • Site-wide: 117,933
  • Since beginning of last month:
    • Site-wide: 138,672