Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 52,258 bioRxiv papers from 242,323 authors.

Most downloaded bioRxiv papers, all time

in category epidemiology

1,474 results found.

1: Phenotypic Age: a novel signature of mortality and morbidity risk
Posted to bioRxiv 05 Jul 2018
6,710 downloads epidemiology

Zuyun Liu, Pei-Lun Kuo, Steve Horvath, Eileen Crimmins, Luigi Ferrucci, Morgan Levine

Background: A person's rate of aging has important implications for his/her risk of death and disease; thus, quantifying aging using observable characteristics has important applications for clinical, basic, and observational research. We aimed to validate a novel aging measure, 'Phenotypic Age', constructed based on routine clinical chemistry measures, by assessing its applicability for differentiating risk for morbidity and mortality in both healthy and unhealthy populations of various ages. Methods: A nationally representative US sample, NHANES III, was used to derive 'Phenotypic Age' based on a linear combination of chronological age and nine multi-system clinical chemistry measures, selected via Cox proportional elastic net. Mortality predictions were validated using an independent sample (NHANES IV), consisting of 11,432 participants, for whom we observed a total of 871 deaths, ascertained over 12.6 years of follow-up. Proportional hazard models and ROC curves were used to evaluate predictions. Results: Phenotypic Age was significantly associated with all-cause mortality and cause-specific mortality. These results were robust to age and sex stratification, and remained even when excluding short-term mortality. Similarly, Phenotypic Age was associated with mortality among seemingly 'healthy' participants, defined as those who were disease-free and had normal BMI at baseline, as well as the oldest-old (aged 85+), a group with high disease burden. Conclusions: Phenotypic Age is a reliable predictor of all-cause and cause-specific mortality in multiple subgroups of the population. Risk stratification by this composite measure is far superior to that of the individual measures that go into it, as well as traditional measures of health. It is able to differentiate individuals who appear healthy, who may have otherwise been missed using traditional health assessments. Further, it can differentiate risk among persons with shared disease burden. Overall, this easily measured metric may be useful in the clinical setting and facilitate secondary and tertiary prevention strategies.

2: Clustering of adult-onset diabetes into novel subgroups guides therapy and improves prediction of outcome
Posted to bioRxiv 08 Sep 2017
5,348 downloads epidemiology

Emma Ahlqvist, Petter Storm, Annemari Karajamaki, Mats Martinell, Mozhgan Dorkhan, Annelie Carlsson, Petter Vikman, Rashmi B. Prasad, Dina Mansour Aly, Peter Almgren, Ylva Wessman, Nael Shaat, Peter Spegel, Hindrik Mulder, Eero Lindholm, Olle Melander, Ola Hansson, Ulf Malmqvist, Ake Lernmark, Kaj Lahti, Tom Forsen, Tiinamaija Tuomi, Anders H. Rosengren, Leif Groop

Background: Diabetes is presently classified into two main forms, type 1 (T1D) and type 2 diabetes (T2D), but especially T2D is highly heterogeneous. A refined classification could provide a powerful tool to individualize treatment regimes and identify individuals with increased risk of complications already at diagnosis. Methods: We applied data-driven cluster analysis (k-means and hierarchical clustering) in newly diagnosed diabetic patients (N=8,980) from the Swedish ANDIS (All New Diabetics in Scania) cohort, using five variables (GAD-antibodies, BMI, HbA1c, HOMA2-B and HOMA2-IR), and related to prospective data on development of complications and prescription of medication from patient records. Replication was performed in three independent cohorts: the Scania Diabetes Registry (SDR, N=1466), ANDIU (All New Diabetics in Uppsala, N=844) and DIREVA (Diabetes Registry Vaasa, N=3485). Cox regression and logistic regression were used to compare time to medication, time to reaching the treatment goal and risk of diabetic complications and genetic associations. Findings: We identified 5 replicable clusters of diabetes patients, with significantly different patient characteristics and risk of diabetic complications. Particularly, individuals in the most insulin-resistant cluster 3 had significantly higher risk of diabetic kidney disease, but had been prescribed similar diabetes treatment compared to the less susceptible individuals in clusters 4 and 5. The insulin deficient cluster 2 had the highest risk of retinopathy. In support of the clustering, genetic associations to the clusters differed from those seen in traditional T2D. Interpretation: We could stratify patients into five subgroups predicting disease progression and development of diabetic complications more precisely than the current classification. This new substratification may help to tailor and target early treatment to patients who would benefit most, thereby representing a first step towards precision medicine in diabetes.

3: Increased risk of many early-life diseases after surgical removal of adenoids and tonsils in childhood
Posted to bioRxiv 05 Jul 2017
4,534 downloads epidemiology

Sean G. Byars, Stephen C. Stearns, Jacobus J. Boomsma

BACKGROUND: Surgical removal of the adenoids and tonsils are common pediatric procedures, with conventional wisdom suggesting their absence has little impact on health or disease. However, little is known about long-term health consequences beyond the perioperative risks. Such ignorance is significant, for these lymphatic organs play important roles in both the development and the function of the immune system. METHODS: We tested the long-term consequences of surgery in the population of Denmark by examining risk for 28 diseases with ~1 million individuals followed from birth up to 30 years of age depending on whether any of three common surgeries (adenoidectomy, tonsillectomy, adenotonsillectomy) occurred in the first 9 years of life. To weigh costs and benefits, we also compared the absolute risks for these diseases to the risks for the conditions that these surgeries aimed to treat. We obtained robust results by using stratified Cox regressions with statistically well-powered samples of cases (with surgery) and controls (without surgery) whose general health was no different prior to surgery. We adjusted our estimates of risk for diseases occurring before surgery, stratified for sex (and other effects) and for 18 covariates, including parental disease history and birth metrics. RESULTS: We found significantly elevated relative risks for many diseases, with effects on respiratory, allergic and infectious disorders after removal of adenoids and tonsils being most pronounced. For some of these diseases, absolute risk increases were considerable. In comparison, many risks for conditions that surgeries aimed to treat were either not significantly different or significantly higher following surgery up to 30 years of age. This suggests that any immediate benefits of these surgeries may not continue longer-term, while resulting in slightly compromised early adult health due to significantly increased risk of many non-target diseases. 
CONCLUSIONS: Our results indicate that surgical removal of tonsils and adenoids early in life is associated with longer-term health risks. They underline the importance of these organs and tissues for normal immune functioning and early immune development, and suggest that these longer-term disease risks may outweigh the short-term benefits of these surgeries.

4: Projected spread of Zika virus in the Americas
Posted to bioRxiv 28 Jul 2016
3,788 downloads epidemiology

Qian Zhang, Kaiyuan Sun, Matteo Chinazzi, Ana Pastore-Piontti, Natalie E. Dean, Diana P Rojas, Stefano Merler, Dina Mistry, Piero Poletti, Luca Rossi, Margaret Bray, M Elizabeth Halloran, Ira M Longini, Alessandro Vespignani

We use a data-driven global stochastic epidemic model to project past and future spread of the Zika virus (ZIKV) in the Americas. The model has high spatial and temporal resolution, and integrates real-world demographic, human mobility, socioeconomic, temperature, and vector density data. We estimate that the first introduction of ZIKV to Brazil likely occurred between August 2013 and April 2014 (90% credible interval). We provide simulated epidemic profiles of incident ZIKV infections for several countries in the Americas through February 2017. The ZIKV epidemic is characterized by slow growth and high spatial and seasonal heterogeneity, attributable to the dynamics of the mosquito vector and to the characteristics and mobility of the human populations. We project the expected timing and number of pregnancies infected with ZIKV during the first trimester, and provide estimates of microcephaly cases assuming different levels of risk as reported in empirical retrospective studies. Our approach represents an early modeling effort aimed at projecting the potential magnitude and timing of the ZIKV epidemic that might be refined as new and more accurate data from the region become available.

5: MR-Base: a platform for systematic causal inference across the phenome using billions of genetic associations
Posted to bioRxiv 16 Dec 2016
3,361 downloads epidemiology

Gibran Hemani, Jie Zheng, Kaitlin H Wade, Charles Laurin, Benjamin Elsworth, Stephen Burgess, Jack Bowden, Ryan Langdon, Vanessa Tan, James Yarmolinsky, Hashem A. Shihab, Nicholas Timpson, David M Evans, Caroline Relton, Richard M Martin, George Davey Smith, Tom R Gaunt, Philip C. Haycock, The MR-Base Collaboration

Published genetic associations can be used to infer causal relationships between phenotypes, bypassing the need for individual-level genotype or phenotype data. We have curated complete summary data from 1094 genome-wide association studies (GWAS) on diseases and other complex traits into a centralised database, and developed an analytical platform that uses these data to perform Mendelian randomization (MR) tests and sensitivity analyses (MR-Base, http://www.mrbase.org). Combined with curated data of published GWAS hits for phenomic measures, the MR-Base platform enables millions of potential causal relationships to be evaluated. We use the platform to predict the impact of lipid lowering on human health. While our analysis provides evidence that reducing LDL-cholesterol, lipoprotein(a) or triglyceride levels reduces coronary disease risk, it also suggests causal effects on a number of other non-vascular outcomes, indicating potential for adverse effects or drug repositioning of lipid-lowering therapies.
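
To make the idea of a summary-data MR test concrete, here is a minimal sketch of the widely used fixed-effect inverse-variance weighted (IVW) estimator, which combines per-SNP Wald ratios. The SNP effect sizes are invented for illustration, and `ivw_estimate` is a hypothetical helper name, not a function of the MR-Base platform.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate from summary statistics.

    Each SNP j contributes a Wald ratio beta_outcome[j] / beta_exposure[j],
    weighted by beta_exposure[j]**2 / se_outcome[j]**2 (first-order weights).
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    estimate = num / den
    std_error = math.sqrt(1.0 / den)
    return estimate, std_error

# Invented SNP effects: every outcome effect is exactly 0.5 x the exposure effect,
# so the causal estimate should come out as 0.5.
bx = [0.10, 0.20, 0.15, 0.30]
by = [0.05, 0.10, 0.075, 0.15]
se = [0.02, 0.02, 0.03, 0.04]

estimate, std_error = ivw_estimate(bx, by, se)
print(f"IVW estimate = {estimate:.3f} (SE {std_error:.3f})")
```

The estimator assumes every SNP is a valid instrument; sensitivity analyses of the kind the abstract mentions exist because that assumption often fails.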

6: Genomic and epidemiological monitoring of yellow fever virus transmission potential
Posted to bioRxiv 16 Apr 2018
2,556 downloads epidemiology

Nuno R. Faria, Kraemer M. U. G., Hill S. C., Goes de Jesus J., de Aguiar R. S., Iani F. C. M., Xavier J., Quick J., du Plessis L., Dellicour S., Thézé J., Carvalho R. D. O., Baele G., Wu C.-H., Silveira P. P., Arruda M. B., Pereira M. A., Pereira G. C., Lourenço J., Obolski U., Abade L., Vasylyeva T. I., Giovanetti M., Yi D., Weiss D.J., Wint G. R. W., Shearer F. M., Funk S., Nikolai B., Adelino T. E. R., Oliveira M. A. A., Silva M. V. F., Sacchetto L., Figueiredo P. O., Rezende I. M., Mello E. M., Said R. F. C., Santos D. A., Ferraz M. L., Brito M. G., Santana L. F., Menezes M. T., Brindeiro R. M., Tanuri A., dos Santos F. C. P., Cunha M. S., Nogueira J. S., Rocco I. M., da Costa A. C., Komninakis S. C. V., Azevedo V., Chieppe A. O., Araujo E. S. M., Mendonça M. C. L., dos Santos C. C., dos Santos C. D., Mares-Guia A. M., Nogueira R. M. R., Sequeira P. C., Abreu R. G., Garcia M. H. O., Alves R. V., Abreu A. L., Okumoto O., Kroon E. G., de Albuquerque C. F. C., Lewandowski K., Pullan S. T., Carroll M., Sabino E. C., Souza R. P., Suchard M. A., Lemey P., Trindade G. S., Drumond B. P., Filippis A. M. B., Loman N. J., Cauchemez S., Alcantara L. C. J., Pybus O. G.

The yellow fever virus (YFV) epidemic that began in Dec 2016 in Brazil is the largest in decades. The recent discovery of YFV in Brazilian Aedes sp. vectors highlights the urgent need to monitor the risk of re-establishment of domestic YFV transmission in the Americas. We use a suite of epidemiological, spatial and genomic approaches to characterize YFV transmission. We show that the age- and sex-distribution of human cases in Brazil is characteristic of sylvatic transmission. Analysis of YFV cases combined with genomes generated locally using a new protocol reveals an early phase of sylvatic YFV transmission restricted to Minas Gerais, followed in late 2016 by a rise in viral spillover to humans, and the southwards spatial expansion of the epidemic towards previously YFV-free areas. Our results establish a framework for monitoring YFV transmission in real-time, contributing to the global strategy of eliminating future yellow fever epidemics.

7: Stacked Generalization: An Introduction to Super Learning
Posted to bioRxiv 18 Aug 2017
2,402 downloads epidemiology

Ashley I. Naimi, Laura B. Balzer

Stacked generalization is an ensemble method that allows researchers to combine several different prediction algorithms into one. Since its introduction in the early 1990s, the method has evolved several times into what is now known as "Super Learner". Super Learner uses V-fold cross-validation to build the optimal weighted combination of predictions from a library of candidate algorithms. Optimality is defined by a user-specified objective function, such as minimizing mean squared error or maximizing the area under the receiver operating characteristic curve. Although relatively simple in nature, use of the Super Learner by epidemiologists has been hampered by limitations in understanding conceptual and technical details. We work step-by-step through two examples to illustrate concepts and address common concerns.
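
The cross-validate-then-weight procedure described in this abstract can be sketched in a few lines. The example below is a simplification under stated assumptions: it uses only two toy candidate learners, simulated data, squared-error loss, and a grid search over a single convex weight, whereas Super Learner implementations typically solve a constrained regression over a larger library. All function names are hypothetical.

```python
import random

def fit_mean(xs, ys):
    """Candidate 1: ignore x and always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Candidate 2: ordinary least-squares straight line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def cross_validated_predictions(xs, ys, learners, v=5):
    """For each learner, predict every point from a model that never saw it."""
    n = len(xs)
    preds = [[0.0] * n for _ in learners]
    for fold in range(v):
        test_idx = [i for i in range(n) if i % v == fold]
        train_idx = [i for i in range(n) if i % v != fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        for k, learner in enumerate(learners):
            model = learner(train_x, train_y)
            for i in test_idx:
                preds[k][i] = model(xs[i])
    return preds

def mse(pred, ys):
    return sum((p - y) ** 2 for p, y in zip(pred, ys)) / len(ys)

def best_convex_weight(preds, ys, steps=100):
    """Grid-search the weight on learner 0 (learner 1 gets 1 - weight)."""
    best = None
    for s in range(steps + 1):
        w = s / steps
        blended = [w * a + (1 - w) * b for a, b in zip(preds[0], preds[1])]
        score = mse(blended, ys)
        if best is None or score < best[1]:
            best = (w, score)
    return best

# Simulated data: a linear signal plus noise.
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [2.0 * x + random.gauss(0, 1) for x in xs]

learners = [fit_mean, fit_line]
cv_preds = cross_validated_predictions(xs, ys, learners)
weight_on_mean, cv_mse = best_convex_weight(cv_preds, ys)
print(f"weight on mean learner = {weight_on_mean:.2f}, CV MSE = {cv_mse:.3f}")
```

Because the grid includes weights 0 and 1, the blended cross-validated error can never be worse than that of either candidate on its own, which is the practical appeal of the approach.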

8: PHESANT: a tool for performing automated phenome scans in UK Biobank
Posted to bioRxiv 26 Feb 2017
2,385 downloads epidemiology

Louise A C Millard, Neil M Davies, Tom R Gaunt, George Davey Smith, Kate Tilling

Motivation: Epidemiological cohorts typically contain a diverse set of phenotypes such that automation of phenome scans is non-trivial, because they require highly heterogeneous models. For this reason, phenome scans have to date tended to use a smaller homogeneous set of phenotypes that can be analysed in a consistent fashion. We present PHESANT (PHEnome Scan ANalysis Tool), a software package for performing comprehensive phenome scans in UK Biobank. General features: PHESANT tests the association of a specified trait with all continuous, integer and categorical variables in UK Biobank, or a specified subset. PHESANT uses a novel rule-based algorithm to determine how to appropriately test each trait, then performs the analyses and produces plots and summary tables. Implementation: The PHESANT phenome scan is implemented in R. PHESANT includes a novel JavaScript D3.js visualization, and accompanying Java code that converts the phenome scan results to the required JavaScript Object Notation (JSON) format. Availability: PHESANT is available on GitHub at https://github.com/MRCIEU/PHESANT. Git tag v0.2 corresponds to the version presented here.

9: An epigenetic biomarker of aging for lifespan and healthspan
Posted to bioRxiv 05 Mar 2018
2,341 downloads epidemiology

Morgan E Levine, Ake T Lu, Austin Quach, Brian H. Chen, Themistocles L Assimes, Stefania Bandinelli, Lifang Hou, Andrea A Baccarelli, James D Stewart, Yun Li, Eric A Whitsel, James G Wilson, Alex P Reiner, Abraham Aviv, Kurt Lohman, Yongmei Liu, Luigi Ferrucci, Steve Horvath

Identifying reliable biomarkers of aging is a major goal in geroscience. While the first generation of epigenetic biomarkers of aging were developed using chronological age as a surrogate for biological age, we hypothesized that incorporation of composite clinical measures of phenotypic age that capture differences in lifespan and healthspan may identify novel CpGs and facilitate the development of a more powerful epigenetic biomarker of aging. Using an innovative two-step process, we develop a new epigenetic biomarker of aging, DNAm PhenoAge, that strongly outperforms previous measures with regard to predictions for a variety of aging outcomes, including all-cause mortality, cancers, healthspan, physical functioning, and Alzheimer's disease. While this biomarker was developed using data from whole blood, it correlates strongly with age in every tissue and cell tested. Based on an in-depth transcriptional analysis in sorted cells, we find that increased epigenetic, relative to chronological age, is associated with increased activation of pro-inflammatory and interferon pathways, and decreased activation of transcriptional/translational machinery, DNA damage response, and mitochondrial signatures. Overall, this single epigenetic biomarker of aging is able to capture risks for an array of diverse outcomes across multiple tissues and cells, and provide insight into important pathways in aging.

10: Automating Mendelian randomization through machine learning to construct a putative causal map of the human phenome
Posted to bioRxiv 10 Aug 2017
2,318 downloads epidemiology

Gibran Hemani, Jack Bowden, Philip C. Haycock, Jie Zheng, Oliver Davis, Peter Flach, Tom Gaunt, George Davey Smith

A major application for genome-wide association studies (GWAS) has been the emerging field of causal inference using Mendelian randomization (MR), where the causal effect between a pair of traits can be estimated using only summary level data. MR depends on SNPs exhibiting vertical pleiotropy, where the SNP influences an outcome phenotype only through an exposure phenotype. Issues arise when this assumption is violated due to SNPs exhibiting horizontal pleiotropy. We demonstrate that across a range of pleiotropy models, instrument selection will be increasingly liable to selecting invalid instruments as GWAS sample sizes continue to grow. Methods have been developed in an attempt to protect MR from different patterns of horizontal pleiotropy, and here we have designed a mixture-of-experts machine learning framework (MR-MoE 1.0) that predicts the most appropriate model to use for any specific causal analysis, improving on both power and false discovery rates. Using the approach, we systematically estimated the causal effects amongst 2407 phenotypes. Almost 90% of causal estimates indicated some level of horizontal pleiotropy. The causal estimates are organised into a publicly available graph database (http://eve.mrbase.org), and we use it here to highlight the numerous challenges that remain in automated causal inference.

11: Magnitude of road traffic accident related injuries and fatalities in Ethiopia
Posted to bioRxiv 01 Aug 2018
2,316 downloads epidemiology

Teferi Abegaz, Samson Gebremedhin

Background: In many developing countries there is a paucity of evidence regarding the epidemiology of road traffic accidents (RTAs). The study determines the rates of injuries and fatalities associated with RTAs in Ethiopia based on the data of a recent national survey. Methods: The study is based on the secondary data of the Ethiopian Demographic and Health Survey conducted in 2016. The survey collected information about the occurrence of injuries and accidents, including RTAs, in the past 12 months among 75,271 members of 16,650 households. Households were selected from nine regions and two city administrations of Ethiopia using a stratified cluster sampling procedure. Results: Of the 75,271 household members enumerated, 123 encountered RTAs in the reference period and the rate of RTA-related injury was 163 (95% confidence interval (CI): 136-195) per 100,000 population. Of the 123 casualties, 28 were fatal, making the fatality rate 37 (95% CI: 25-54) per 100,000 population. The RTA-related injuries and fatalities per 100,000 motor vehicles were estimated as 21,681 (95% CI: 18,090-25,938) and 4,922 (95% CI: 3,325-7,183), respectively. Next to accidental falls, RTAs were the second most common form of accidents and injuries, accounting for 22.8% of all such incidents. RTAs contributed to 43.8% of all fatalities secondary to accidents and injuries. Among RTA casualties, 21.9% were drivers, 35.0% were passenger vehicle occupants and 36.0% were vulnerable road users including: motorcyclists (21.0%), pedestrians (12.1%) and cyclists (2.9%). Approximately half (47.1%) of the casualties were between 15-29 years of age and 15.3% were either minors younger than 15 years or seniors older than 64 years of age. Nearly two-thirds (65.0%) of the victims were males. Conclusion: RTA-related casualties are extremely high in Ethiopia. Male young adults and vulnerable road users are at increased risk of RTAs. There is an urgent need for bringing road safety to the country's public health agenda.
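
The two headline rates in this abstract follow from the reported counts by simple arithmetic. The sketch below reproduces the point estimates as crude, unweighted rates; it does not attempt the confidence intervals, which presumably depend on the survey design. `rate_per_100k` is a hypothetical helper name.

```python
def rate_per_100k(events, population):
    """Crude rate per 100,000 population."""
    return events / population * 100_000

# Counts taken from the abstract: 123 RTA injuries, 28 of them fatal,
# among 75,271 enumerated household members.
injury_rate = rate_per_100k(123, 75_271)
fatality_rate = rate_per_100k(28, 75_271)

print(round(injury_rate), round(fatality_rate))  # prints: 163 37
```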

12: Machine learning models in electronic health records can outperform conventional survival models for predicting patient mortality in coronary artery disease
Posted to bioRxiv 30 Jan 2018
2,171 downloads epidemiology

Andrew J. Steele, S. Aylin Cakiroglu, Anoop D. Shah, Spiros Denaxas, Harry Hemingway, Nicholas Luscombe

Prognostic modelling is important in clinical practice and epidemiology for patient management and research. Electronic health records (EHR) provide large quantities of data for such models, but conventional epidemiological approaches require significant researcher time to implement. Expert selection of variables, fine-tuning of variable transformations and interactions, and imputing missing values in datasets are time-consuming and could bias subsequent analysis, particularly given that missingness in EHR is both high and may carry meaning. Using a cohort of over 80,000 patients from the CALIBER programme, we performed a systematic comparison of several machine-learning approaches in EHR. We used Cox models and random survival forests with and without imputation on 27 expert-selected variables to predict all-cause mortality. We also used Cox models, random forests and elastic net regression on an extended dataset with 586 variables to build prognostic models and identify novel prognostic factors without prior expert input. We observed that data-driven models used on an extended dataset can outperform conventional models for prognosis, without data preprocessing or imputing missing values, and with no need to scale or transform continuous data. An elastic net Cox regression based on 586 unimputed variables with continuous values discretised achieved a C-index of 0.801 (bootstrapped 95% CI 0.799 to 0.802), compared to 0.793 (0.791 to 0.794) for a traditional Cox model comprising 27 expert-selected variables with imputation for missing values. We also found that data-driven models allow identification of novel prognostic variables; that the absence of values for particular variables carries meaning, and can have significant implications for prognosis; and that variables often have a nonlinear association with mortality, which discretised Cox models and random forests can elucidate. This demonstrates that machine-learning approaches applied to raw EHR data can be used to build reliable models for use in research and clinical practice, and identify novel predictive variables and their effects to inform future research.
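
For readers unfamiliar with the C-index used to compare these models, the following is a minimal sketch of Harrell's concordance index for right-censored data. The five-patient dataset is invented purely for illustration, tied follow-up times are simply skipped, and `concordance_index` is a hypothetical helper rather than the code used in the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk score.

    A pair (i, j) is comparable when the subject with the shorter follow-up
    time had an observed event. Ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Invented toy data: follow-up years, event flag (1 = died, 0 = censored),
# and a model's predicted risk for each patient.
times = [2.0, 4.0, 5.0, 7.0, 9.0]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.6, 0.3, 0.1]

c_index = concordance_index(times, events, risks)
print(c_index)  # prints: 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported difference between 0.801 and 0.793 in context.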

13: Effect modification of FADS2 polymorphisms on the association between breastfeeding and intelligence: results from a collaborative meta-analysis
Posted to bioRxiv 07 Sep 2017
2,143 downloads epidemiology

Fernando Pires Hartwig, Neil Martin Davies, Bernardo Lessa Horta, Tarunveer S. Ahluwalia, Hans Bisgaard, Klaus Bønnelykke, Avshalom Caspi, Terrie E Moffitt, Richie Poulton, Ayesha Sajjad, Henning W. Tiemeier, Albert Dalmau Bueno, Mònica Guxens, Mariona Bustamante Pineda, Loreto Santa-Marina, Nadine Parker, Tomáš Paus, Zdenka Pausova, Lotte Lauritzen, Theresia M. Schnurr, Kim F. Michaelsen, Torben Hansen, Wendy Oddy, Craig E. Pennell, Nicole M. Warrington, George Davey Smith, Cesar Gomes Victora

Background: Accumulating evidence suggests that breastfeeding benefits the children's intelligence. Long-chain polyunsaturated fatty acids (LC-PUFAs) present in breast milk may explain part of this association. Under a nutritional adequacy hypothesis, an interaction between breastfeeding and genetic variants associated with endogenous LC-PUFAs synthesis might be expected. However, the literature on this topic is controversial. Methods and Findings: We investigated this Gene×Environment interaction in a de novo meta-analysis involving >12,000 individuals in the primary analysis, and >45,000 individuals in a secondary analysis using relaxed inclusion criteria. Our primary analysis used ever breastfeeding, FADS2 polymorphisms rs174575 and rs1535 coded assuming a recessive effect of the G allele, and intelligence quotient (IQ) in Z scores. Using random effects meta-analysis, ever breastfeeding was associated with 0.17 (95% CI: 0.03; 0.32) higher Z scores in IQ, or about 2.1 points. There was no strong evidence of interaction, with pooled covariate-adjusted interaction coefficients (i.e., difference between genetic groups of the difference in IQ Z scores comparing ever with never breastfed individuals) of 0.12 (95% CI: -0.19; 0.43) and 0.06 (95% CI: -0.16; 0.27) for the rs174575 and rs1535 variants, respectively. Secondary analyses corroborated these results. In studies with ≥5.85 and <5.85 months of breastfeeding duration, pooled estimates for the rs174575 variant were 0.50 (95% CI: -0.06; 1.06) and 0.14 (95% CI: -0.10; 0.38), respectively, and 0.27 (95% CI: -0.28; 0.82) and -0.01 (95% CI: -0.19; 0.16) for the rs1535 variant. However, between-group comparisons were underpowered. Conclusions: Our findings do not support an interaction between ever breastfeeding and FADS2 polymorphisms. However, our subgroup analysis raises the possibility that breastfeeding supplies LC-PUFAs requirements for cognitive development (if such threshold exists) if it lasts for some (currently unknown) time. Future studies in large individual-level datasets would allow properly powered subgroup analyses and would improve our understanding of the role of breastfeeding duration in the breastfeeding×FADS2 interaction.

14: Transfer entropy as a tool for inferring causality from observational studies in epidemiology
Posted to bioRxiv 14 Jun 2017
2,141 downloads epidemiology

Nasir Ahmad Aziz

Recently Wiener's causality theorem, which states that one variable could be regarded as the cause of another if the ability to predict the future of the second variable is enhanced by implementing information about the preceding values of the first variable, was linked to information theory through the development of a novel metric called “transfer entropy”. Intuitively, transfer entropy can be conceptualized as a model-free measure of directed information flow from one variable to another. In contrast, directionality of information flow is not reflected in traditional measures of association which are completely symmetric by design. Although information theoretic approaches have been applied before in epidemiology, their value for inferring causality from observational studies is still unknown. Therefore, in the present study we use a set of simulation experiments, reflecting the most classical and widely used epidemiological observational study design, to validate the application of transfer entropy in epidemiological research. Moreover, we illustrate the practical applicability of this information theoretic approach to real-world epidemiological data by demonstrating that transfer entropy is able to extract the correct direction of information flow from longitudinal data concerning two well-known associations, i.e. that between smoking and lung cancer and that between obesity and diabetes risk. In conclusion, our results provide proof-of-concept that the recently developed transfer entropy method could be a welcome addition to the epidemiological armamentarium, especially to dissect those situations in which there is a well-described association between two variables but no clear-cut inclination as to the directionality of the association.
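
The directional asymmetry that distinguishes transfer entropy from symmetric association measures can be seen in a small simulation. The sketch below is not the author's code: it uses a plug-in estimator with a history length of one on simulated binary series, and `transfer_entropy` is a hypothetical helper name.

```python
import math
import random
from collections import Counter

def transfer_entropy(source, target):
    """Plug-in estimate (in bits) of transfer entropy source -> target, history length 1.

    TE = sum p(y1, y0, x0) * log2( p(y1 | y0, x0) / p(y1 | y0) )
    where y1 is the target's next value, y0 its current value,
    and x0 the source's current value.
    """
    triples = Counter()   # (y1, y0, x0)
    pairs_yx = Counter()  # (y0, x0)
    pairs_yy = Counter()  # (y1, y0)
    singles = Counter()   # y0
    n = len(target) - 1
    for t in range(n):
        y1, y0, x0 = target[t + 1], target[t], source[t]
        triples[(y1, y0, x0)] += 1
        pairs_yx[(y0, x0)] += 1
        pairs_yy[(y1, y0)] += 1
        singles[y0] += 1
    te = 0.0
    for (y1, y0, x0), count in triples.items():
        p_joint = count / n
        p_cond_full = count / pairs_yx[(y0, x0)]
        p_cond_self = pairs_yy[(y1, y0)] / singles[y0]
        te += p_joint * math.log2(p_cond_full / p_cond_self)
    return te

# Simulated binary series: y copies x with a one-step lag, flipped 10% of the time,
# so information flows from x to y but not from y to x.
random.seed(1)
x = [random.randint(0, 1) for _ in range(5000)]
y = [0] + [xt if random.random() > 0.1 else 1 - xt for xt in x[:-1]]

te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
print(f"TE x->y = {te_x_to_y:.3f} bits, TE y->x = {te_y_to_x:.3f} bits")
```

In this setup the estimate from x to y should be clearly positive while the reverse direction stays near zero, whereas a correlation between the lagged series would look the same whichever variable is called the cause.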

15: Collider Scope: When selection bias can substantially influence observed associations
Posted to bioRxiv 07 Oct 2016
2,140 downloads epidemiology

Marcus R. Munafò, Kate Tilling, Amy E Taylor, David M Evans, George Davey Smith

Large-scale cross-sectional and cohort studies have transformed our understanding of the genetic and environmental determinants of health outcomes. However, the representativeness of these samples may be limited - either through selection into studies, or by attrition from studies over time. Here we explore the potential impact of this selection bias on results obtained from these studies, from the perspective that this amounts to conditioning on a collider (i.e., a form of collider bias). While it is acknowledged that selection bias will have a strong effect on representativeness and prevalence estimates, it is often assumed that it should not have a strong impact on estimates of associations. We argue that because selection can induce collider bias (which occurs when two variables independently influence a third variable, and that third variable is conditioned upon), selection can lead to substantially biased estimates of associations. In particular, selection related to phenotypes can bias associations with genetic variants associated with those phenotypes. In simulations, we show that even modest influences on selection into, or attrition from, a study can generate biased and potentially misleading estimates of both phenotypic and genotypic associations. Our results highlight the value of knowing which population your study sample is representative of. If the factors influencing selection and attrition are known, they can be adjusted for. For example, having DNA available on most participants in a birth cohort study offers the possibility of investigating the extent to which polygenic scores predict subsequent participation, which in turn would enable sensitivity analyses of the extent to which bias might distort estimates.
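
The mechanism described here, selection on a collider inducing an association between otherwise independent variables, is easy to reproduce in a toy simulation. The sketch below is an illustration under simple assumptions (two independent standard-normal traits and a hard selection threshold on their sum), not a reproduction of the authors' simulations.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

random.seed(42)
n = 20000
# Two traits that are independent in the full population.
trait_a = [random.gauss(0, 1) for _ in range(n)]
trait_b = [random.gauss(0, 1) for _ in range(n)]

# Selection into the study depends on both traits (the collider):
# only people whose combined score is above zero take part.
selected = [i for i in range(n) if trait_a[i] + trait_b[i] > 0]

r_population = pearson(trait_a, trait_b)
r_sample = pearson([trait_a[i] for i in selected],
                   [trait_b[i] for i in selected])
print(f"population r = {r_population:.3f}, selected-sample r = {r_sample:.3f}")
```

The population correlation stays close to zero, while the selected sample shows a clearly negative correlation even though neither trait influences the other.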

16: Virus genomes reveal the factors that spread and sustained the West African Ebola epidemic.

Posted to bioRxiv 02 Sep 2016

Virus genomes reveal the factors that spread and sustained the West African Ebola epidemic.
2,006 downloads epidemiology

Gytis Dudas, Luiz Max Carvalho, Trevor Bedford, Andrew J. Tatem, Guy Baele, Nuno Faria, Daniel J Park, Jason Ladner, Armando Arias, Danny Asogun, Filip Bielejec, Sarah Caddy, Matt Cotten, Jonathan Dambrozio, Simon Dellicour, Antonino Di Caro, Joseph W. Diclaro, Sophie Duraffour, Mike Elmore, Lawrence Fakoli, Merle Gilbert, Sahr M Gevao, Stephen Gire, Adrianne Gladden-Young, Andreas Gnirke, Augustine Goba, Donald S. Grant, Bart Haagmans, Julian A Hiscox, Umaru Jah, Brima Kargbo, Jeffrey Kugelman, Di Liu, Jia Lu, Christine M. Malboeuf, Suzanne Mate, David A. Matthews, Christian B Matranga, Luke Meredith, James Qu, Joshua Quick, Susan D. Pas, My V.T. Phan, Georgios Poliakis, Chantal Reusken, Mariano Sanchez-Lockhart, Stephen F Schaffner, John S. Schieffelin, Rachel S. Sealfon, Etienne Simon-Loriere, Saskia L. Smits, Kilian Stoecker, Lucy Thorne, Ekaete A. Tobin, Mohamed A. Vandi, Simon J. Watson, Kendra West, Shannon Whitmer, Michael R Wiley, Sarah M Winnicki, Shirlee Wohl, Roman Wölfel, Nathan L Yozwiak, Kristian G Andersen, Sylvia Blyden, Fatorma Bolay, Miles Carroll, Bernice Dahn, Boubacar Diallo, Pierre Formenty, Christophe Fraser, George F. Gao, Robert F Garry, Ian Goodfellow, Stephan Günther, Christian Happi, Edward C Holmes

The 2013-2016 epidemic of Ebola virus disease in West Africa was of unprecedented magnitude, duration and impact. Extensive collaborative sequencing projects have produced a large collection of over 1600 Ebola virus genomes, representing over 5% of known cases, unmatched for any single human epidemic. In this comprehensive analysis of this entire dataset, we reconstruct in detail the history of migration, proliferation and decline of Ebola virus throughout the region. We test the association of geography, climate, administrative boundaries, demography and culture with viral movement among 56 administrative regions. Our results show that during the outbreak viral lineages moved according to a classic 'gravity' model, with more intense migration between larger and more proximate population centers. Notably, we find that despite a strong attenuation of international dispersal after border closures, localized cross-border transmission beforehand had already set the seeds for an international epidemic, rendering these measures relatively ineffective in curbing the epidemic. We use this empirical evidence to address why the epidemic did not spread into neighboring countries, showing that although these regions were susceptible to developing significant outbreaks, they were also at lower risk of viral introductions. Finally, viral genome sequence data uniquely reveals this large epidemic to be a heterogeneous and spatially dissociated collection of transmission clusters of varying size, duration and connectivity. These insights will help inform approaches to intervention in such epidemics in the future.

17: Cannabis use and risk of schizophrenia: a Mendelian randomization study

Posted to bioRxiv 07 Dec 2016

Cannabis use and risk of schizophrenia: a Mendelian randomization study
1,994 downloads epidemiology

Julien Vaucher, Brendan J Keating, Aurélie M. Lasserre, Wei Gan, Donald M Lyall, Joey Ward, Daniel J Smith, Jill P Pell, Naveed Sattar, Guillaume Paré, Michael V Holmes

Cannabis use is observationally associated with an increased risk of schizophrenia; however, whether the relationship is causal is not known. To determine the nature of the association between cannabis use and risk of schizophrenia using Mendelian randomization (MR) analysis, we used ten genetic variants previously identified to associate with cannabis use in 32,330 individuals. Genetic variants were used in an MR analysis of the association of genetically determined cannabis use with risk of schizophrenia in 34,241 cases and 45,604 controls of predominantly European descent. Estimates from MR were compared to a meta-analysis of observational studies reporting effect estimates for ever use of cannabis and risk of schizophrenia or related disorders. Genetically determined use of cannabis was associated with increased risk of schizophrenia (OR of schizophrenia for users vs. non-users of cannabis: 1.37; 95% CI, 1.09 to 1.67; P-value=0.007). The corresponding estimate from observational analysis was 1.50 (95% CI, 1.10 to 2.00; P-value for heterogeneity = 0.88). The genetic instrument did not show evidence of pleiotropy on MR-Egger (Egger test, P-value=0.292) nor on multivariable MR accounting for tobacco exposure (OR of schizophrenia for users vs. non-users of cannabis, adjusted for ever vs. never smoker: 1.41; 95% CI, 1.09-1.83). Furthermore, the causal estimate remained robust to sensitivity analyses. These findings strongly support a causal association between genetically determined use of cannabis and risk of schizophrenia. Such robust evidence may inform public health messages about the risks of cannabis use, especially regarding its potential mental health consequences.

18: Assessment of Menstrual Hygiene Management and Its Determinants among Adolescent Girls: A Cross-Sectional Study in School adolescent girls in Addis Ababa, Ethiopia.

Posted to bioRxiv 22 Oct 2018

Assessment of Menstrual Hygiene Management and Its Determinants among Adolescent Girls: A Cross-Sectional Study in School adolescent girls in Addis Ababa, Ethiopia.
1,888 downloads epidemiology

Ephrem Biruk Gashaw, Ephrem Biruk, Worku Tefera, Nardos Tadesse, Ashagre Sisay

Managing menstruation essentially means dealing with menstrual flow while continuing regular activities such as going to school and working. However, menstruation can place significant obstacles in the way of girls’ access to health, education and future prospects if they are not equipped for effective menstrual hygiene management. The objective of this study was to assess menstrual hygiene management and its determinants among school girls in Addis Ababa, Ethiopia. A cross-sectional study using quantitative methods was carried out among 770 systematically selected adolescent school girls in Addis Ababa from April 1 to May 5, 2017. A self-administered, pre-tested, closed-ended Amharic questionnaire was used for data collection in the school setting. Coding was done using the original English version, and the data were entered into EPI-7 software. The quantitative file was exported to Statistical Package for the Social Sciences (SPSS) version 25.0 software for analysis. Total mean score was used to categorize individuals as good or poor, while AOR with 95% CI and p < 0.05 was used to determine factors associated with menstrual hygiene management practice. This study had a 98% response rate. In total, 530 (70.1%) and 388 (51.3%) respondents had good knowledge and good practice of menstrual hygiene, respectively. The findings also showed a significant positive association between good knowledge of menstruation and having a mother with secondary education (AOR = 10.012, 95% CI = 3.628-27.629). Wealth index quantile five (AOR = 9.038, 95% CI = 3.728-21.909) showed a significant positive association with good practice of menstrual hygiene. The majority of participants had good knowledge and practice of menstrual hygiene, and the majority of them were from private schools. Although knowledge was better than practice, girls should be educated about the process, the use of proper pads or absorbents, and their proper disposal. Keywords: practices of menstrual hygiene, Menstrual knowledge, adolescent girl, Sanitary napkins, Menarche, school health.

19: A systematic review of Hepatitis B virus (HBV) drug and vaccine escape mutations in Africa: a call for urgent action

Posted to bioRxiv 07 Feb 2018

A systematic review of Hepatitis B virus (HBV) drug and vaccine escape mutations in Africa: a call for urgent action
1,811 downloads epidemiology

Jolynne Mokaya, Anna L McNaughton, Martin J Hadley, Apostolos Beloukas, Anna-Maria Geretti, Dominique Goedhals, Philippa C. Matthews

International sustainable development goals for the elimination of viral hepatitis as a public health problem by 2030 highlight the pressing need to optimize strategies for prevention, diagnosis and treatment. Selected or transmitted resistance associated mutations (RAMs) and vaccine escape mutations (VEMs) in hepatitis B virus (HBV) may reduce the success of existing treatment and prevention strategies. These issues are particularly pressing for many settings in Africa where there is high HBV prevalence and co-endemic HIV infection, but lack of robust epidemiological data and limited education, diagnostics and clinical care. The prevalence, distribution and impact of RAMs and VEMs in these populations are neglected in the current literature. We therefore set out to assimilate data for sub-Saharan Africa through a systematic literature review and analysis of published sequence data, and present these in an on-line database (https://livedataoxford.shinyapps.io/1510659619-3Xkoe2NKkKJ7Drg/). The majority of the data were from HIV/HBV coinfected cohorts. The commonest RAM was rtM204I/V, either alone or in combination with compensatory mutations, and reported in both treatment-naive and treatment-experienced adults. We also identified the suite of mutations rtM204V/I + rtL180M + rtV173L that is reported in association with vaccine escape, in over 1/3 of cohorts. Although tenofovir has a high genetic barrier to drug resistance, it is of concern that emerging data suggest polymorphisms that may be associated with resistance, although the precise impact of these is unknown. Overall, there is an urgent need for improved diagnostic screening, enhanced laboratory assessment of HBV before and during therapy, and sustained roll out of tenofovir in preference to lamivudine. Further data are urgently needed in order to inform population and individual approaches to HBV diagnosis, monitoring and therapy in these highly vulnerable settings.

20: Zika virus infection as a cause of congenital brain abnormalities and Guillain-Barre syndrome: systematic review

Posted to bioRxiv 02 Sep 2016

Zika virus infection as a cause of congenital brain abnormalities and Guillain-Barre syndrome: systematic review
1,802 downloads epidemiology

Fabienne Krauer, Maurane Riesen, Ludovic Reveiz, Olufemi T Oladapo, Ruth Martinez-Vega, Teegwende V Porgo, Anina Haefliger, Nathalie J Broutet, Nicola Low, WHO Zika Causality Working Group

Background: The World Health Organization stated in March 2016 that there was scientific consensus that the mosquito-borne Zika virus was a cause of the neurological disorder Guillain-Barre syndrome and of microcephaly and other congenital brain abnormalities, based on rapid evidence assessments. Decisions about causality require systematic assessment to guide public health actions. The objectives of this study were: to update and re-assess the evidence for causality through a rapid and systematic review about links between Zika virus infection and a) congenital brain abnormalities, including microcephaly, in the foetuses and offspring of pregnant women and b) Guillain-Barre syndrome in any population; and to describe the process and outcomes of an expert assessment of the evidence about causality. Methods and findings: The study had three linked components. First, in February 2016, we developed a causality framework that defined questions about the relationship between Zika virus infection and each of the two clinical outcomes in 10 dimensions; temporality, biological plausibility, strength of association, alternative explanations, cessation, dose-response, animal experiments, analogy, specificity and consistency. Second, we did a systematic review (protocol number CRD42016036693). We searched multiple online sources up to May 30, 2016 to find studies that directly addressed either outcome and any causality dimension, used methods to expedite study selection, data extraction and quality assessment, and summarised evidence descriptively. Third, a multidisciplinary panel of experts assessed the review findings and reached consensus on causality. We found 1091 unique items up to May 30, 2016. 
For congenital brain abnormalities, including microcephaly, we included 72 items; for eight of 10 causality dimensions (all except dose-response relationship and specificity) we found that more than half the relevant studies supported a causal association with Zika virus infection. For Guillain-Barre syndrome, we included 36 items, of which more than half the relevant studies supported a causal association in seven of ten dimensions (all except dose-response relationship, specificity and animal experimental evidence). Articles identified non-systematically from May 30-July 29, 2016 strengthened the review findings. The expert panel concluded that: a) the most likely explanation of available evidence from outbreaks of Zika virus infection and clusters of microcephaly is that Zika virus infection during pregnancy is a cause of congenital brain abnormalities including microcephaly; and b) the most likely explanation of available evidence from outbreaks of Zika virus infection and Guillain-Barre syndrome is that Zika virus infection is a trigger of Guillain-Barre syndrome. The expert panel recognised that Zika virus alone may not be sufficient to cause either congenital brain abnormalities or Guillain-Barre syndrome but agreed that the evidence was sufficient to recommend increased public health measures. Weaknesses are the limited assessment of the role of dengue virus and other possible co-factors, the small number of comparative epidemiological studies, and the difficulty in keeping the review up to date with the pace of publication of new research. Conclusions: Rapid and systematic reviews with frequent updating and open dissemination are now needed, both for appraisal of the evidence about Zika virus infection and for the next public health threats that will emerge. This rapid systematic review found sufficient evidence to say that Zika virus is a cause of congenital abnormalities and is a trigger of Guillain-Barre syndrome.
