Machine Learning to Predict Osteoporotic Fracture Risk from Genotypes
John A. Morris,
John A Kanis,
Douglas P Kiel,
Eugene V McCloskey,
David M Evans,
William D. Leslie,
Celia MT Greenwood,
J Brent Richards
Posted 11 Sep 2018
bioRxiv DOI: 10.1101/413716
Background: Genomics-based prediction could be useful since genome-wide genotyping costs less than many clinical tests. We tested whether machine learning methods could provide a clinically relevant genomic prediction of quantitative ultrasound speed of sound (SOS), a risk factor for osteoporotic fracture.
Methods: We used 341,449 individuals from UK Biobank with SOS measures to develop genomically predicted SOS (gSOS) using machine learning algorithms. We selected the optimal algorithm in 5,335 independent individuals, then validated it, and its ability to predict incident fracture, in an independent test dataset (N = 80,027). Finally, we explored whether genomic prescreening could complement a UK-based osteoporosis screening strategy based on the validated tool FRAX.
Results: gSOS explained 4.8-fold more variance in SOS than FRAX clinical risk factors (CRF) alone (r^2 = 23% vs. 4.8%). After adjustment for the CRF-FRAX score, a standard deviation decrease in gSOS was associated with greater odds of incident major osteoporotic fracture (1,491 cases / 78,536 controls, OR = 1.91 [1.70-2.14], P = 10^-28) than a standard deviation decrease in measured SOS (OR = 1.60 [1.50-1.69], P = 10^-52) or in femoral neck bone mineral density (BMD) (147 cases / 4,594 controls, OR = 1.53 [1.27-1.83], P = 10^-6). Individuals in the bottom decile of the gSOS distribution had a 3.25-fold increased risk of major osteoporotic fracture (P = 10^-18) compared to the top decile. A gSOS-based FRAX score identified individuals at high risk for incident major osteoporotic fractures better than the CRF-FRAX score (P = 10^-14). Introducing a genomic prescreening step into osteoporosis screening in 4,741 individuals reduced the number of required clinical visits from 2,455 to 1,273 and the number of BMD tests from 1,013 to 473, while reducing the sensitivity to identify individuals eligible for therapy only from 99% to 95%.
Interpretation: The use of genotypes in a machine learning algorithm resulted in a clinically relevant prediction of SOS and fracture, with potential to impact healthcare resource utilization.
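The ratios quoted in the abstract can be checked directly from the reported figures. The following is a minimal Python sketch that uses only the numbers stated in the abstract; the helper name `relative_reduction` and the variable names are illustrative assumptions, not part of the authors' analysis.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a count falls when going from `before` to `after`."""
    return (before - after) / before


# Figures as quoted in the abstract (variable names are hypothetical).
r2_gsos, r2_crf_frax = 0.23, 0.048        # variance in SOS explained
visits_before, visits_after = 2455, 1273  # required clinical visits
bmd_before, bmd_after = 1013, 473         # BMD tests

fold_variance = r2_gsos / r2_crf_frax
visit_reduction = relative_reduction(visits_before, visits_after)
bmd_reduction = relative_reduction(bmd_before, bmd_after)

print(f"Variance explained, gSOS vs. CRF-FRAX: {fold_variance:.1f}-fold")
print(f"Clinical visits avoided: {visit_reduction:.0%}")
print(f"BMD tests avoided: {bmd_reduction:.0%}")
```

Under these figures, prescreening removes roughly half of the clinical visits and BMD tests, in exchange for the drop in sensitivity from 99% to 95% reported in the abstract.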
- Downloaded 1,640 times
- Download rankings, all-time:
- Site-wide: 14,012
- In genomics: 1,354
- Year to date:
- Site-wide: 17,933
- Since beginning of last month:
- Site-wide: 47,667