Application of a Novel Machine Learning Method to Big Data Infers a Relationship Between Asthma and the Development of Neoplasia

By Abbas Shojaee, Jose L. Gomez, Xiaochen Wang, Naftali Kaminski, Jonathan M. Siner, Seyedtaghi Takyar, Hongyu Zhao, Geoffrey Chupp

Posted 11 Oct 2018
bioRxiv DOI: 10.1101/439117

Background: A relationship between asthma and the risk of having cancer has been identified in several studies. However, these studies have used different methodologies and have been primarily cross-sectional, and their results have been contradictory. Population-level analyses are required to determine whether a relationship truly exists.
Methods: We developed a novel machine learning tool to infer associations, Causal Inference using the Composition of Transactions (CICT). Two all-payer claims datasets comprising over two hundred million hospitalization encounters from the US-based Healthcare Cost and Utilization Project (HCUP) were used for discovery and validation. Associations between asthma and neoplasms were discovered in data from the State of Florida. Validation was conducted on eight cohorts of patients with asthma, and seven subtypes of asthma and COPD, using datasets from the State of California. Control groups were matched by gender, age, race, and history of tobacco use. Odds ratio analysis with Bonferroni-Holm correction measured the association of asthma and COPD with 26 different benign and malignant neoplasms. ICD-9-CM codes were used to identify exposures and outcomes.
Findings: CICT identified 17 associations between asthma and the risk of neoplasia in the discovery dataset. In the validation studies, 208 case-control analyses were conducted between subtypes of asthma (N = 999,370, male = 33%, age = 50) and COPD (N = 715,971, male = 50%, age = 69) and the corresponding matched control groups (N = 8,400,004, male = 42%, age = 47). Allergic asthma was associated with benign neoplasms of the meninges and of the salivary, pituitary, parathyroid, and thyroid glands (OR: 1.52 to 2.52), and with malignant neoplasms of the breast, intrahepatic biliary system, and hematopoietic and lymphatic systems (OR: 1.45 to 2.05). COPD was associated with malignant neoplasms in the lung, bladder, and hematopoietic systems.
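
The validation step described above (an odds ratio per case-control comparison, followed by Bonferroni-Holm correction across the comparisons) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the Wald confidence interval, and all counts and p-values below are hypothetical and chosen only to show the arithmetic. The abstract does not state which significance test produced the p-values, so the sketch takes them as given.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, control_cases, control_noncases):
    """Odds ratio with a Wald 95% confidence interval from a 2x2 table.

    Hypothetical helper: assumes all four cell counts are non-zero.
    """
    a, b, c, d = exposed_cases, exposed_noncases, control_cases, control_noncases
    or_value = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - 1.96 * se_log_or)
    upper = math.exp(math.log(or_value) + 1.96 * se_log_or)
    return or_value, lower, upper


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure.

    Returns a list of booleans aligned with p_values; True means the
    null hypothesis for that comparison is rejected at family-wise level alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values also fail
    return reject


# Made-up 2x2 table: 20 of 100 exposed and 10 of 100 matched controls have the outcome.
or_value, ci_low, ci_high = odds_ratio(20, 80, 10, 90)

# Made-up p-values for three comparisons.
decisions = holm_reject([0.01, 0.04, 0.03])
```

In a study of this shape, `holm_reject` would be applied once to the p-values of all 208 comparisons, so that the family-wise error rate is controlled across the whole set rather than per neoplasm.
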
Interpretation: The combined use of machine learning methods for knowledge discovery and epidemiological methods shows that allergic asthma is associated with the development of neoplasia, including in glandular organs, ductal tissues, and hematopoietic systems. Also, our findings differentiate the pattern of neoplasms between allergic asthma and obstructive asthma. This suggests that inflammatory pathways that are active in asthma also contribute to neoplastic transformation in specific organ systems such as secretory organs.

### Competing Interest Statement

The authors have declared no competing interest.

