Identification and prediction of ALS subgroups using machine learning
Michael A. Nalls, Roy H. Campbell, Bryan J. Traynor
Posted 07 Apr 2021
medRxiv DOI: 10.1101/2021.04.02.21254844
Background: The disease entity known as amyotrophic lateral sclerosis (ALS) is now known to represent a collection of overlapping syndromes. A better understanding of this heterogeneity and the ability to distinguish ALS subtypes would improve the clinical care of patients and enhance our understanding of the disease. Subtype profiles could be incorporated into the clinical trial design to improve our ability to detect a therapeutic effect. A variety of classification systems have been proposed over the years based on empirical observations, but it is unclear to what extent they genuinely reflect ALS population substructure.

Methods: We applied machine learning algorithms to a prospective, population-based cohort consisting of 2,858 Italian patients diagnosed with ALS for whom detailed clinical phenotype data were available. We replicated our findings in an independent population-based cohort of 1,097 Italian ALS patients.

Findings: We found that semi-supervised machine learning based on UMAP applied to the output of a multi-layered perceptron neural network produced the optimum clustering of the ALS patients in the discovery cohort. These clusters roughly corresponded to the six clinical subtypes defined by the Chio classification system (bulbar ALS, respiratory ALS, flail arm ALS, classical ALS, pyramidal ALS, and flail leg ALS). The same clusters were identified in the replication cohort. A supervised learning approach based on ensemble learning identified twelve clinical parameters that predicted ALS clinical subtype with high accuracy (area under the curve = 0.94).

Interpretation: Our data-driven study provides insight into the ALS population's substructure and demonstrates that the Chio classification system robustly identifies ALS subtypes. We provide an interactive website (https://share.streamlit.io/anant-dadu/machinelearningforals/main) so that clinical researchers can predict the clinical subtype of an ALS patient based on a small number of clinical parameters.

Funding: National Institute on Aging and the Italian Ministry of Health.
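For readers unfamiliar with the metric reported in the Findings, the area under the ROC curve (AUC) can be read as the probability that a classifier scores a randomly chosen positive case above a randomly chosen negative one. The sketch below computes it that way for a single subtype treated as "this subtype versus all others". It is only an illustration: the function name `roc_auc` and the toy labels and scores are hypothetical, and the abstract does not state how the authors computed or averaged AUC across the six subtypes.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: 1 for the subtype of interest, 0 for any other subtype.
    scores: the classifier's predicted score for the subtype of interest.
    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value of 0.94 indicates that the predicted scores rank cases of a subtype above other cases most of the time.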