Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 57,915 bioRxiv papers from 266,490 authors.
Most downloaded bioRxiv papers, all time
in category epidemiology
1,553 results found.
6,850 downloads epidemiology
Background: A person's rate of aging has important implications for his/her risk of death and disease; thus, quantifying aging using observable characteristics has important applications for clinical, basic, and observational research. We aimed to validate a novel aging measure, 'Phenotypic Age', constructed based on routine clinical chemistry measures, by assessing its applicability for differentiating risk for morbidity and mortality in both healthy and unhealthy populations of various ages. Methods: A nationally representative US sample, NHANES III, was used to derive 'Phenotypic Age' based on a linear combination of chronological age and nine multi-system clinical chemistry measures, selected via Cox proportional elastic net. Mortality predictions were validated using an independent sample (NHANES IV), consisting of 11,432 participants, for whom we observed a total of 871 deaths, ascertained over 12.6 years of follow-up. Proportional hazard models and ROC curves were used to evaluate predictions. Results: Phenotypic Age was significantly associated with all-cause mortality and cause-specific mortality. These results were robust to age and sex stratification, and remained even when excluding short-term mortality. Similarly, Phenotypic Age was associated with mortality among seemingly 'healthy' participants, defined as those who were disease-free and had normal BMI at baseline, as well as the oldest-old (aged 85+), a group with high disease burden. Conclusions: Phenotypic Age is a reliable predictor of all-cause and cause-specific mortality in multiple subgroups of the population. Risk stratification by this composite measure is far superior to that of the individual measures that go into it, as well as traditional measures of health. It is able to differentiate individuals who appear healthy, who may have otherwise been missed using traditional health assessments. Further, it can differentiate risk among persons with shared disease burden.
Overall, this easily measured metric may be useful in the clinical setting and facilitate secondary and tertiary prevention strategies.
5,482 downloads epidemiology
Emma Ahlqvist, Petter Storm, Annemari Karajamaki, Mats Martinell, Mozhgan Dorkhan, Annelie Carlsson, Petter Vikman, Rashmi B. Prasad, Dina Mansour Aly, Peter Almgren, Ylva Wessman, Nael Shaat, Peter Spegel, Hindrik Mulder, Eero Lindholm, Olle Melander, Ola Hansson, Ulf Malmqvist, Ake Lernmark, Kaj Lahti, Tom Forsen, Tiinamaija Tuomi, Anders H. Rosengren, Leif Groop
Background: Diabetes is presently classified into two main forms, type 1 (T1D) and type 2 diabetes (T2D), but T2D in particular is highly heterogeneous. A refined classification could provide a powerful tool to individualize treatment regimes and identify individuals with increased risk of complications already at diagnosis. Methods: We applied data-driven cluster analysis (k-means and hierarchical clustering) in newly diagnosed diabetic patients (N=8,980) from the Swedish ANDIS (All New Diabetics in Scania) cohort, using five variables (GAD-antibodies, BMI, HbA1c, HOMA2-B and HOMA2-IR), and related to prospective data on development of complications and prescription of medication from patient records. Replication was performed in three independent cohorts: the Scania Diabetes Registry (SDR, N=1466), ANDIU (All New Diabetics in Uppsala, N=844) and DIREVA (Diabetes Registry Vaasa, N=3485). Cox regression and logistic regression were used to compare time to medication, time to reaching the treatment goal and risk of diabetic complications and genetic associations. Findings: We identified 5 replicable clusters of diabetes patients, with significantly different patient characteristics and risk of diabetic complications. Particularly, individuals in the most insulin-resistant cluster 3 had significantly higher risk of diabetic kidney disease, but had been prescribed similar diabetes treatment compared to the less susceptible individuals in clusters 4 and 5. The insulin deficient cluster 2 had the highest risk of retinopathy. In support of the clustering, genetic associations to the clusters differed from those seen in traditional T2D. Interpretation: We could stratify patients into five subgroups predicting disease progression and development of diabetic complications more precisely than the current classification.
This new substratification may help to tailor and target early treatment to patients who would benefit most, thereby representing a first step towards precision medicine in diabetes.
4,722 downloads epidemiology
BACKGROUND: Surgical removals of the adenoids and tonsils are common pediatric procedures, with conventional wisdom suggesting their absence has little impact on health or disease. However, little is known about long-term health consequences beyond the perioperative risks. Such ignorance is significant, for these lymphatic organs play important roles in both the development and the function of the immune system. METHODS: We tested the long-term consequences of surgery in the population of Denmark by examining risk for 28 diseases with ~1 million individuals followed from birth up to 30 years of age depending on whether any of three common surgeries (adenoidectomy, tonsillectomy, adenotonsillectomy) occurred in the first 9 years of life. To weigh costs and benefits, we also compared the absolute risks for these diseases to the risks for the conditions that these surgeries aimed to treat. We obtained robust results by using stratified Cox regressions with statistically well-powered samples of cases (with surgery) and controls (without surgery) whose general health was no different prior to surgery. We adjusted our estimates of risk for diseases occurring before surgery, stratified for sex (and other effects) and for 18 covariates, including parental disease history and birth metrics. RESULTS: We found significantly elevated relative risks for many diseases, with effects on respiratory, allergic and infectious disorders after removal of adenoids and tonsils being most pronounced. For some of these diseases, absolute risk increases were considerable. In comparison, many risks for conditions that surgeries aimed to treat were either not significantly different or significantly higher following surgery up to 30 years of age. This suggests that any immediate benefits of these surgeries may not continue longer-term, while resulting in slightly compromised early adult health due to significantly increased risk of many non-target diseases.
CONCLUSIONS: Our results indicate that surgical removal of tonsils and adenoids early in life is associated with longer-term health risks. They underline the importance of these organs and tissues for normal immune functioning and early immune development, and suggest that these longer-term disease risks may outweigh the short-term benefits of these surgeries.
3,824 downloads epidemiology
Qian Zhang, Kaiyuan Sun, Matteo Chinazzi, Ana Pastore-Piontti, Natalie E. Dean, Diana P Rojas, Stefano Merler, Dina Mistry, Piero Poletti, Luca Rossi, Margaret Bray, M Elizabeth Halloran, Ira M Longini, Alessandro Vespignani
We use a data-driven global stochastic epidemic model to project past and future spread of the Zika virus (ZIKV) in the Americas. The model has high spatial and temporal resolution, and integrates real-world demographic, human mobility, socioeconomic, temperature, and vector density data. We estimate that the first introduction of ZIKV to Brazil likely occurred between August 2013 and April 2014 (90% credible interval). We provide simulated epidemic profiles of incident ZIKV infections for several countries in the Americas through February 2017. The ZIKV epidemic is characterized by slow growth and high spatial and seasonal heterogeneity, attributable to the dynamics of the mosquito vector and to the characteristics and mobility of the human populations. We project the expected timing and number of pregnancies infected with ZIKV during the first trimester, and provide estimates of microcephaly cases assuming different levels of risk as reported in empirical retrospective studies. Our approach represents an early modeling effort aimed at projecting the potential magnitude and timing of the ZIKV epidemic that might be refined as new and more accurate data from the region become available.
3,433 downloads epidemiology
Gibran Hemani, Jie Zheng, Kaitlin H Wade, Charles Laurin, Benjamin Elsworth, Stephen Burgess, Jack Bowden, Ryan Langdon, Vanessa Tan, James Yarmolinsky, Hashem A. Shihab, Nicholas Timpson, David M Evans, Caroline Relton, Richard M Martin, George Davey Smith, Tom R Gaunt, Philip C. Haycock, The MR-Base Collaboration
Published genetic associations can be used to infer causal relationships between phenotypes, bypassing the need for individual-level genotype or phenotype data. We have curated complete summary data from 1094 genome-wide association studies (GWAS) on diseases and other complex traits into a centralised database, and developed an analytical platform that uses these data to perform Mendelian randomization (MR) tests and sensitivity analyses (MR-Base, http://www.mrbase.org). Combined with curated data of published GWAS hits for phenomic measures, the MR-Base platform enables millions of potential causal relationships to be evaluated. We use the platform to predict the impact of lipid lowering on human health. While our analysis provides evidence that reducing LDL-cholesterol, lipoprotein(a) or triglyceride levels reduces coronary disease risk, it also suggests causal effects on a number of other non-vascular outcomes, indicating potential for adverse effects or drug repositioning of lipid-lowering therapies.
2,675 downloads epidemiology
Nuno R. Faria, Kraemer M. U. G., Hill S. C., Goes de Jesus J., de Aguiar R. S., Iani F. C. M., Xavier J., Quick J., du Plessis L., Dellicour S., Thézé J., Carvalho R. D. O., Baele G., Wu C.-H., Silveira P. P., Arruda M. B., Pereira M. A., Pereira G. C., Lourenço J., Obolski U., Abade L., Vasylyeva T. I., Giovanetti M., Yi D., Weiss D.J., Wint G. R. W., Shearer F. M., Funk S., Nikolai B., Adelino T. E. R., Oliveira M. A. A., Silva M. V. F., Sacchetto L., Figueiredo P. O., Rezende I. M., Mello E. M., Said R. F. C., Santos D. A., Ferraz M. L., Brito M. G., Santana L. F., Menezes M. T., Brindeiro R. M., Tanuri A., dos Santos F. C. P., Cunha M. S., Nogueira J. S., Rocco I. M., da Costa A. C., Komninakis S. C. V., Azevedo V., Chieppe A. O., Araujo E. S. M., Mendonça M. C. L., dos Santos C. C., dos Santos C. D., Mares-Guia A. M., Nogueira R. M. R., Sequeira P. C., Abreu R. G., Garcia M. H. O., Alves R. V., Abreu A. L., Okumoto O., Kroon E. G., de Albuquerque C. F. C., Lewandowski K., Pullan S. T., Carroll M., Sabino E. C., Souza R. P., Suchard M. A., Lemey P., Trindade G. S., Drumond B. P., Filippis A. M. B., Loman N. J., Cauchemez S., Alcantara L. C. J., Pybus O. G.
The yellow fever virus (YFV) epidemic that began in Dec 2016 in Brazil is the largest in decades. The recent discovery of YFV in Brazilian Aedes sp. vectors highlights the urgent need to monitor the risk of re-establishment of domestic YFV transmission in the Americas. We use a suite of epidemiological, spatial and genomic approaches to characterize YFV transmission. We show that the age- and sex-distribution of human cases in Brazil is characteristic of sylvatic transmission. Analysis of YFV cases combined with genomes generated locally using a new protocol reveals an early phase of sylvatic YFV transmission restricted to Minas Gerais, followed in late 2016 by a rise in viral spillover to humans, and the southwards spatial expansion of the epidemic towards previously YFV-free areas. Our results establish a framework for monitoring YFV transmission in real-time, contributing to the global strategy of eliminating future yellow fever epidemics.
2,539 downloads epidemiology
2,527 downloads epidemiology
Stacked generalization is an ensemble method that allows researchers to combine several different prediction algorithms into one. Since its introduction in the early 1990s, the method has evolved several times into what is now known as "Super Learner". Super Learner uses V-fold cross-validation to build the optimal weighted combination of predictions from a library of candidate algorithms. Optimality is defined by a user-specified objective function, such as minimizing mean squared error or maximizing the area under the receiver operating characteristic curve. Although relatively simple in nature, use of the Super Learner by epidemiologists has been hampered by limitations in understanding conceptual and technical details. We work step-by-step through two examples to illustrate concepts and address common concerns.
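The cross-validated weighting described in this abstract can be illustrated with a minimal sketch. It is not the authors' worked example: the two candidate learners (a mean predictor and a one-variable least-squares line), the simulated data, and the grid search over a single weight are all assumptions chosen to keep the code dependency-free. Real Super Learner implementations typically fit the weights with constrained regression rather than a grid.

```python
import random

def fit_mean(xs, ys):
    """Candidate 1: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Candidate 2: ordinary least-squares line in one variable."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def cv_predictions(fit, xs, ys, v):
    """Out-of-fold predictions from V-fold cross-validation."""
    preds = [0.0] * len(xs)
    for fold in range(v):
        train = [i for i in range(len(xs)) if i % v != fold]
        held_out = [i for i in range(len(xs)) if i % v == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = model(xs[i])
    return preds

def super_learner_weight(xs, ys, v=5, grid=101):
    """Pick the convex weight on candidate 1 that minimizes CV mean squared error."""
    p_mean = cv_predictions(fit_mean, xs, ys, v)
    p_line = cv_predictions(fit_line, xs, ys, v)

    def cv_mse(w):
        return sum((w * a + (1 - w) * b - y) ** 2
                   for a, b, y in zip(p_mean, p_line, ys)) / len(ys)

    candidates = [k / (grid - 1) for k in range(grid)]
    best = min(candidates, key=cv_mse)
    return best, cv_mse(best), cv_mse(1.0), cv_mse(0.0)

# Simulated (hypothetical) data with a linear signal.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0, 1) for x in xs]
w_mean, mse_combo, mse_mean, mse_line = super_learner_weight(xs, ys)
print(f"weight on mean learner: {w_mean:.2f}, combined CV MSE: {mse_combo:.3f}")
```

Because the weight grid includes 0 and 1, the combined cross-validated error can never be worse than either candidate's own cross-validated error, which is the core guarantee the method relies on.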
2,471 downloads epidemiology
Background: In many developing countries there is a paucity of evidence regarding the epidemiology of road traffic accidents (RTAs). The study determines the rates of injuries and fatalities associated with RTAs in Ethiopia based on the data of a recent national survey. Methods: The study is based on the secondary data of the Ethiopian Demographic and Health Survey conducted in 2016. The survey collected information about the occurrence of injuries and accidents, including RTAs, in the past 12 months among 75,271 members of 16,650 households. Households were selected from nine regions and two city administrations of Ethiopia using a stratified cluster sampling procedure. Results: Of the 75,271 household members enumerated, 123 encountered RTAs in the reference period and the rate of RTA-related injury was 163 (95% confidence interval (CI): 136-195) per 100,000 population. Of the 123 casualties, 28 were fatal, making the fatality rate 37 (95% CI: 25-54) per 100,000 population. The RTA-related injuries and fatalities per 100,000 motor vehicles were estimated as 21,681 (95% CI: 18,090-25,938) and 4,922 (95% CI: 3,325-7,183), respectively. Next to accidental falls, RTAs were the second most common form of accidents and injuries, accounting for 22.8% of all such incidents. RTAs contributed to 43.8% of all fatalities secondary to accidents and injuries. Among RTA casualties, 21.9% were drivers, 35.0% were passenger vehicle occupants and 36.0% were vulnerable road users including: motorcyclists (21.0%), pedestrians (12.1%) and cyclists (2.9%). Approximately half (47.1%) of the casualties were between 15-29 years of age and 15.3% were either minors younger than 15 years or seniors older than 64 years of age. Nearly two-thirds (65.0%) of the victims were males. Conclusion: RTA-related casualties are extremely high in Ethiopia. Male young adults and vulnerable road users are at increased risk of RTAs. There is an urgent need for bringing road safety to the country's public health agenda.
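The per-100,000 rates quoted in this abstract can be checked from the raw counts it reports. The sketch below computes crude (unweighted) rates only; the helper name is hypothetical, and the published confidence intervals are not reproduced because they depend on the survey design, which the abstract does not detail.

```python
def rate_per_100k(events, population):
    """Crude rate per 100,000 population."""
    return events / population * 100_000

# Counts taken from the abstract: 123 RTA casualties, 28 of them fatal,
# among 75,271 enumerated household members.
injury_rate = rate_per_100k(123, 75_271)
fatality_rate = rate_per_100k(28, 75_271)

print(f"RTA-related injury rate: {injury_rate:.1f} per 100,000")
print(f"RTA fatality rate: {fatality_rate:.1f} per 100,000")
```

Rounded to whole numbers, these crude rates agree with the 163 and 37 per 100,000 stated in the abstract.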
2,442 downloads epidemiology
A major application for genome-wide association studies (GWAS) has been the emerging field of causal inference using Mendelian randomization (MR), where the causal effect between a pair of traits can be estimated using only summary level data. MR depends on SNPs exhibiting vertical pleiotropy, where the SNP influences an outcome phenotype only through an exposure phenotype. Issues arise when this assumption is violated due to SNPs exhibiting horizontal pleiotropy. We demonstrate that across a range of pleiotropy models, instrument selection will be increasingly liable to selecting invalid instruments as GWAS sample sizes continue to grow. Methods have been developed in an attempt to protect MR from different patterns of horizontal pleiotropy, and here we have designed a mixture-of-experts machine learning framework (MR-MoE 1.0) that predicts the most appropriate model to use for any specific causal analysis, improving on both power and false discovery rates. Using the approach, we systematically estimated the causal effects amongst 2407 phenotypes. Almost 90% of causal estimates indicated some level of horizontal pleiotropy. The causal estimates are organised into a publicly available graph database (http://eve.mrbase.org), and we use it here to highlight the numerous challenges that remain in automated causal inference.
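To make concrete how a causal effect can be estimated from summary-level data alone, here is a minimal sketch of the inverse-variance weighted (IVW) estimator, one of the standard MR methods. It is not MR-MoE, and the SNP effect sizes below are invented for illustration. IVW assumes that the instruments are valid, which is exactly the assumption horizontal pleiotropy violates.

```python
from math import sqrt

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect from per-SNP summary statistics.

    Equivalent to a weighted regression of SNP-outcome effects on SNP-exposure
    effects through the origin, with weights 1 / se_outcome**2.
    """
    weights = [1.0 / s ** 2 for s in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    return numerator / denominator, sqrt(1.0 / denominator)

# Hypothetical summary statistics for four SNPs (not from any real GWAS).
# Each outcome effect is exactly half the exposure effect, so the true ratio is 0.5.
beta_x = [0.10, 0.20, 0.15, 0.05]
beta_y = [0.05, 0.10, 0.075, 0.025]
se_y = [0.01, 0.02, 0.015, 0.01]

estimate, std_error = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW causal estimate: {estimate:.3f} (SE {std_error:.3f})")
```

When some SNPs affect the outcome through pathways other than the exposure, their ratios drift away from the common value and this single pooled estimate becomes biased, which motivates the alternative estimators the abstract refers to.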
2,412 downloads epidemiology
Morgan E Levine, Ake T Lu, Austin Quach, Brian H. Chen, Themistocles L Assimes, Stefania Bandinelli, Lifang Hou, Andrea A Baccarelli, James D Stewart, Yun Li, Eric A Whitsel, James G Wilson, Alex P Reiner, Abraham Aviv, Kurt Lohman, Yongmei Liu, Luigi Ferrucci, Steve Horvath
Identifying reliable biomarkers of aging is a major goal in geroscience. While the first generation of epigenetic biomarkers of aging were developed using chronological age as a surrogate for biological age, we hypothesized that incorporation of composite clinical measures of phenotypic age that capture differences in lifespan and healthspan may identify novel CpGs and facilitate the development of a more powerful epigenetic biomarker of aging. Using an innovative two-step process, we develop a new epigenetic biomarker of aging, DNAm PhenoAge, that strongly outperforms previous measures in regards to predictions for a variety of aging outcomes, including all-cause mortality, cancers, healthspan, physical functioning, and Alzheimer's disease. While this biomarker was developed using data from whole blood, it correlates strongly with age in every tissue and cell tested. Based on an in-depth transcriptional analysis in sorted cells, we find that increased epigenetic age, relative to chronological age, is associated with increased activation of pro-inflammatory and interferon pathways, and decreased activation of transcriptional/translational machinery, DNA damage response, and mitochondrial signatures. Overall, this single epigenetic biomarker of aging is able to capture risks for an array of diverse outcomes across multiple tissues and cells, and provide insight into important pathways in aging.
2,350 downloads epidemiology
Recently Wiener's causality theorem, which states that one variable could be regarded as the cause of another if the ability to predict the future of the second variable is enhanced by implementing information about the preceding values of the first variable, was linked to information theory through the development of a novel metric called “transfer entropy”. Intuitively, transfer entropy can be conceptualized as a model-free measure of directed information flow from one variable to another. In contrast, directionality of information flow is not reflected in traditional measures of association which are completely symmetric by design. Although information theoretic approaches have been applied before in epidemiology, their value for inferring causality from observational studies is still unknown. Therefore, in the present study we use a set of simulation experiments, reflecting the most classical and widely used epidemiological observational study design, to validate the application of transfer entropy in epidemiological research. Moreover, we illustrate the practical applicability of this information theoretic approach to real-world epidemiological data by demonstrating that transfer entropy is able to extract the correct direction of information flow from longitudinal data concerning two well-known associations, i.e. that between smoking and lung cancer and that between obesity and diabetes risk. In conclusion, our results provide proof-of-concept that the recently developed transfer entropy method could be a welcome addition to the epidemiological armamentarium, especially to dissect those situations in which there is a well-described association between two variables but no clear-cut inclination as to the directionality of the association.
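The asymmetry that distinguishes transfer entropy from symmetric association measures can be shown with a small sketch. It uses a plug-in (frequency-count) estimator with a history length of one on simulated binary series; the data, the function name, and the history length are assumptions for illustration and do not reproduce the study's simulations.

```python
import random
from collections import Counter
from math import log2

def transfer_entropy(source, target):
    """Plug-in transfer entropy (bits) from source to target, history length 1.

    TE = sum over states of p(y_next, y_now, x_now)
         * log2( p(y_next | y_now, x_now) / p(y_next | y_now) )
    """
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_now = Counter(zip(target[:-1], source[:-1]))
    pairs_self = Counter(zip(target[1:], target[:-1]))
    singles = Counter(target[:-1])
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_now[(y_now, x_now)]
        p_given_self = pairs_self[(y_next, y_now)] / singles[y_now]
        te += p_joint * log2(p_given_both / p_given_self)
    return te

# Simulated series: x is a fair coin, y copies x with a one-step delay,
# so information flows from x to y but not from y to x.
rng = random.Random(1)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
print(f"TE x->y: {te_xy:.3f} bits, TE y->x: {te_yx:.3f} bits")
```

A correlation coefficient between x and the lagged y would be identical whichever variable is named first, whereas the two transfer entropy values differ sharply.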
2,340 downloads epidemiology
Prognostic modelling is important in clinical practice and epidemiology for patient management and research. Electronic health records (EHR) provide large quantities of data for such models, but conventional epidemiological approaches require significant researcher time to implement. Expert selection of variables, fine-tuning of variable transformations and interactions, and imputing missing values in datasets are time-consuming and could bias subsequent analysis, particularly given that missingness in EHR is both high, and may carry meaning. Using a cohort of over 80,000 patients from the CALIBER programme, we performed a systematic comparison of several machine-learning approaches in EHR. We used Cox models and random survival forests with and without imputation on 27 expert-selected variables to predict all-cause mortality. We also used Cox models, random forests and elastic net regression on an extended dataset with 586 variables to build prognostic models and identify novel prognostic factors without prior expert input. We observed that data-driven models used on an extended dataset can outperform conventional models for prognosis, without data preprocessing or imputing missing values, and with no need to scale or transform continuous data. An elastic net Cox regression based on 586 unimputed variables with continuous values discretised achieved a C-index of 0.801 (bootstrapped 95% CI 0.799 to 0.802), compared to 0.793 (0.791 to 0.794) for a traditional Cox model comprising 27 expert-selected variables with imputation for missing values. We also found that data-driven models allow identification of novel prognostic variables; that the absence of values for particular variables carries meaning, and can have significant implications for prognosis; and that variables often have a nonlinear association with mortality, which discretised Cox models and random forests can elucidate.
This demonstrates that machine-learning approaches applied to raw EHR data can be used to build reliable models for use in research and clinical practice, and identify novel predictive variables and their effects to inform future research.
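The C-index used above to compare models measures how often a model ranks patients correctly by risk. The following is a minimal sketch of Harrell's concordance index for right-censored data; the tiny dataset is invented, ties in follow-up time are ignored for simplicity, and this is not the study's evaluation code.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs where the higher-risk patient fails first.

    A pair (i, j) is comparable when patient i had an observed event strictly
    before patient j's follow-up time. Tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical follow-up data: time in years, event flag (1 = died, 0 = censored).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
well_ranked = [0.9, 0.7, 0.5, 0.1]   # highest risk assigned to earliest death
badly_ranked = [0.1, 0.5, 0.7, 0.9]  # ranking fully reversed

c_good = concordance_index(times, events, well_ranked)
c_bad = concordance_index(times, events, badly_ranked)
c_flat = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
print(c_good, c_bad, c_flat)
```

A value of 0.5 corresponds to uninformative ranking and 1.0 to perfect ranking, which gives scale to the reported difference between 0.793 and 0.801.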
2,194 downloads epidemiology
Large-scale cross-sectional and cohort studies have transformed our understanding of the genetic and environmental determinants of health outcomes. However, the representativeness of these samples may be limited - either through selection into studies, or by attrition from studies over time. Here we explore the potential impact of this selection bias on results obtained from these studies, from the perspective that this amounts to conditioning on a collider (i.e., a form of collider bias). While it is acknowledged that selection bias will have a strong effect on representativeness and prevalence estimates, it is often assumed that it should not have a strong impact on estimates of associations. We argue that because selection can induce collider bias (which occurs when two variables independently influence a third variable, and that third variable is conditioned upon), selection can lead to substantially biased estimates of associations. In particular, selection related to phenotypes can bias associations with genetic variants associated with those phenotypes. In simulations, we show that even modest influences on selection into, or attrition from, a study can generate biased and potentially misleading estimates of both phenotypic and genotypic associations. Our results highlight the value of knowing which population your study sample is representative of. If the factors influencing selection and attrition are known, they can be adjusted for. For example, having DNA available on most participants in a birth cohort study offers the possibility of investigating the extent to which polygenic scores predict subsequent participation, which in turn would enable sensitivity analyses of the extent to which bias might distort estimates.
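The mechanism described in this abstract can be demonstrated with a short simulation. In this sketch two traits are generated independently, and study participation depends on both; the variable names, the selection rule, and the sample size are assumptions for illustration rather than the paper's simulation design.

```python
import random
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

rng = random.Random(42)
n = 20_000

# Two traits that are independent in the full population.
trait_a = [rng.gauss(0, 1) for _ in range(n)]
trait_b = [rng.gauss(0, 1) for _ in range(n)]

# Participation is the collider: both traits (plus noise) raise the chance of taking part.
selected = [(a, b) for a, b in zip(trait_a, trait_b)
            if a + b + rng.gauss(0, 1) > 0]

r_full = pearson(trait_a, trait_b)
r_selected = pearson([a for a, _ in selected], [b for _, b in selected])
print(f"full population r = {r_full:.3f}, selected sample r = {r_selected:.3f}")
```

The traits are uncorrelated in the full population but negatively correlated among participants, even though neither trait influences the other.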
2,162 downloads epidemiology
Fernando Pires Hartwig, Neil Martin Davies, Bernardo Lessa Horta, Tarunveer S. Ahluwalia, Hans Bisgaard, Klaus Bønnelykke, Avshalom Caspi, Terrie E Moffitt, Richie Poulton, Ayesha Sajjad, Henning W. Tiemeier, Albert Dalmau Bueno, Mònica Guxens, Mariona Bustamante Pineda, Loreto Santa-Marina, Nadine Parker, Tomáš Paus, Zdenka Pausova, Lotte Lauritzen, Theresia M. Schnurr, Kim F. Michaelsen, Torben Hansen, Wendy Oddy, Craig E. Pennell, Nicole M. Warrington, George Davey Smith, Cesar Gomes Victora
Background: Accumulating evidence suggests that breastfeeding benefits children's intelligence. Long-chain polyunsaturated fatty acids (LC-PUFAs) present in breast milk may explain part of this association. Under a nutritional adequacy hypothesis, an interaction between breastfeeding and genetic variants associated with endogenous LC-PUFAs synthesis might be expected. However, the literature on this topic is controversial. Methods and Findings: We investigated this Gene×Environment interaction in a de novo meta-analysis involving >12,000 individuals in the primary analysis, and >45,000 individuals in a secondary analysis using relaxed inclusion criteria. Our primary analysis used ever breastfeeding, FADS2 polymorphisms rs174575 and rs1535 coded assuming a recessive effect of the G allele, and intelligence quotient (IQ) in Z scores. Using random effects meta-analysis, ever breastfeeding was associated with 0.17 (95% CI: 0.03; 0.32) higher Z scores in IQ, or about 2.1 points. There was no strong evidence of interaction, with pooled covariate-adjusted interaction coefficients (i.e., difference between genetic groups of the difference in IQ Z scores comparing ever with never breastfed individuals) of 0.12 (95% CI: -0.19; 0.43) and 0.06 (95% CI: -0.16; 0.27) for the rs174575 and rs1535 variants, respectively. Secondary analyses corroborated these results. In studies with ≥5.85 and <5.85 months of breastfeeding duration, pooled estimates for the rs174575 variant were 0.50 (95% CI: -0.06; 1.06) and 0.14 (95% CI: -0.10; 0.38), respectively, and 0.27 (95% CI: -0.28; 0.82) and -0.01 (95% CI: -0.19; 0.16) for the rs1535 variant. However, between-group comparisons were underpowered. Conclusions: Our findings do not support an interaction between ever breastfeeding and FADS2 polymorphisms.
However, our subgroup analysis raises the possibility that breastfeeding supplies LC-PUFAs requirements for cognitive development (if such a threshold exists) if it lasts for some (currently unknown) time. Future studies in large individual-level datasets would allow properly powered subgroup analyses and would improve our understanding of the role of breastfeeding duration in the breastfeeding×FADS2 interaction.
2,135 downloads epidemiology
Managing menstruation involves dealing with menstrual flow while continuing regular activities such as going to school and working. However, menstruation can place significant obstacles in the way of girls’ access to health, education and future prospects if they are not equipped for effective menstrual hygiene management. The objective of this study was to assess menstrual hygiene management and its determinants among school girls in Addis Ababa, Ethiopia. A cross-sectional study with a quantitative method was carried out among 770 systematically selected adolescent school girls in Addis Ababa from April 1 to May 5, 2017. A self-administered, pre-tested, close-ended Amharic questionnaire was used for data collection in the school setting. Coding was done using the original English version, and the data were entered into EPI-7 software. The quantitative file was exported to Statistical Package for the Social Sciences (SPSS) version 25.0 software for analysis. Total mean score was used to categorize individuals as good or poor, while adjusted odds ratios (AOR) with 95% CI and p < 0.05 were used to determine factors associated with menstrual hygiene management practice. The study had a 98% response rate. 530 (70.1%) and 388 (51.3%) respondents had good knowledge and good practice of menstrual hygiene, respectively. The findings also showed a significant positive association between good knowledge of menstruation and having a mother with secondary education (AOR = 10.012, 95% CI = 3.628-27.629). Wealth index quantile five (AOR = 9.038, 95% CI = 3.728-21.909) showed a significant positive association with good practice of menstrual hygiene. The majority of participants had good knowledge and practice of menstrual hygiene, and the majority of them were from private schools. Although knowledge was better than practice, girls should be educated about the process, the use of proper pads or absorbents, and their proper disposal. Keywords: practices of menstrual hygiene, Menstrual knowledge, adolescent girl, Sanitary napkins, Menarche, school health.
2,061 downloads epidemiology
Gytis Dudas, Luiz Max Carvalho, Trevor Bedford, Andrew J. Tatem, Guy Baele, Nuno Faria, Daniel J Park, Jason Ladner, Armando Arias, Danny Asogun, Filip Bielejec, Sarah Caddy, Matt Cotten, Jonathan Dambrozio, Simon Dellicour, Antonino Di Caro, Joseph W. Diclaro, Sophie Duraffour, Mike Elmore, Lawrence Fakoli, Merle Gilbert, Sahr M Gevao, Stephen Gire, Adrianne Gladden-Young, Andreas Gnirke, Augustine Goba, Donald S. Grant, Bart Haagmans, Julian A Hiscox, Umaru Jah, Brima Kargbo, Jeffrey Kugelman, Di Liu, Jia Lu, Christine M. Malboeuf, Suzanne Mate, David A. Matthews, Christian B Matranga, Luke Meredith, James Qu, Joshua Quick, Susan D. Pas, My V.T. Phan, Georgios Poliakis, Chantal Reusken, Mariano Sanchez-Lockhart, Stephen F Schaffner, John S. Schieffelin, Rachel S. Sealfon, Etienne Simon-Loriere, Saskia L. Smits, Kilian Stoecker, Lucy Thorne, Ekaete A. Tobin, Mohamed A. Vandi, Simon J. Watson, Kendra West, Shannon Whitmer, Michael R Wiley, Sarah M Winnicki, Shirlee Wohl, Roman Wölfel, Nathan L Yozwiak, Kristian G Andersen, Sylvia Blyden, Fatorma Bolay, Miles Carroll, Bernice Dahn, Boubacar Diallo, Pierre Formenty, Christophe Fraser, George F. Gao, Robert F Garry, Ian Goodfellow, Stephan Günther, Christian Happi, Edward C Holmes
The 2013-2016 epidemic of Ebola virus disease in West Africa was of unprecedented magnitude, duration and impact. Extensive collaborative sequencing projects have produced a large collection of over 1600 Ebola virus genomes, representing over 5% of known cases, unmatched for any single human epidemic. In this comprehensive analysis of this entire dataset, we reconstruct in detail the history of migration, proliferation and decline of Ebola virus throughout the region. We test the association of geography, climate, administrative boundaries, demography and culture with viral movement among 56 administrative regions. Our results show that during the outbreak viral lineages moved according to a classic 'gravity' model, with more intense migration between larger and more proximate population centers. Notably, we find that despite a strong attenuation of international dispersal after border closures, localized cross-border transmission beforehand had already set the seeds for an international epidemic, rendering these measures relatively ineffective in curbing the epidemic. We use this empirical evidence to address why the epidemic did not spread into neighboring countries, showing that although these regions were susceptible to developing significant outbreaks, they were also at lower risk of viral introductions. Finally, viral genome sequence data uniquely reveals this large epidemic to be a heterogeneous and spatially dissociated collection of transmission clusters of varying size, duration and connectivity. These insights will help inform approaches to intervention in such epidemics in the future.
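The 'gravity' behaviour mentioned in this abstract, with more movement between larger and closer population centers, can be written as a simple formula. The sketch below uses the generic textbook form of a gravity model; the function name, the exponents, the scaling constant, and the example populations and distances are all assumptions, and this is not the fitted phylogeographic model from the study.

```python
def gravity_flow(pop_origin, pop_destination, distance_km,
                 k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Generic gravity model: flow grows with both populations and decays with distance.

    flow = k * pop_origin**alpha * pop_destination**beta / distance_km**gamma
    All parameter defaults are illustrative, not estimated values.
    """
    return k * pop_origin ** alpha * pop_destination ** beta / distance_km ** gamma

# Hypothetical regions: same origin, destinations differing in size or distance.
baseline = gravity_flow(500_000, 200_000, 100)
bigger_destination = gravity_flow(500_000, 400_000, 100)
farther_destination = gravity_flow(500_000, 200_000, 200)

print(baseline, bigger_destination, farther_destination)
```

With these illustrative exponents, doubling the destination population doubles the expected flow, while doubling the distance cuts it to a quarter.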
2,028 downloads epidemiology
Cannabis use is observationally associated with an increased risk of schizophrenia; however, whether the relationship is causal is not known. To determine the nature of the association between cannabis use and risk of schizophrenia using Mendelian randomization (MR) analysis, we used ten genetic variants previously identified to associate with cannabis use in 32,330 individuals. Genetic variants were used in an MR analysis of the association of genetically determined cannabis use with risk of schizophrenia in 34,241 cases and 45,604 controls of predominantly European descent. Estimates from MR were compared to a meta-analysis of observational studies reporting effect estimates for ever use of cannabis and risk of schizophrenia or related disorders. Genetically determined use of cannabis was associated with increased risk of schizophrenia (OR of schizophrenia for users vs. non-users of cannabis: 1.37; 95% CI, 1.09 to 1.67; P-value=0.007). The corresponding estimate from observational analysis was 1.50 (95% CI, 1.10 to 2.00; P-value for heterogeneity = 0.88). The genetic instrument did not show evidence of pleiotropy on MR-Egger (Egger test, P-value=0.292) nor on multivariable MR accounting for tobacco exposure (OR of schizophrenia for users vs. non-users of cannabis, adjusted for ever vs. never smoker: 1.41; 95% CI, 1.09-1.83). Furthermore, the causal estimate remained robust to sensitivity analyses. These findings strongly support a causal association between genetically determined use of cannabis and risk of schizophrenia. Such robust evidence may inform public health messaging about the risks of cannabis use, especially regarding its potential mental health consequences.
2,012 downloads epidemiology
Bacillus anthracis is a spore-forming, Gram-positive bacterium responsible for anthrax, an acute and commonly lethal infection that most significantly affects grazing livestock, wild ungulates and other herbivorous mammals, but also poses a serious threat to human health. The geographic extent of B. anthracis endemism is still poorly understood, despite multi-decade research on anthrax epizootic and epidemic dynamics around the world. Several biogeographic studies have focused on modeling environmental suitability for anthrax at local or national scales, but many countries have limited or inadequate surveillance systems, even within known endemic regions. Here we compile an extensive global occurrence dataset for B. anthracis, drawing on confirmed human, livestock, and wildlife anthrax outbreaks. With these records, we use boosted regression trees to produce the first map of the global distribution of B. anthracis as a proxy for anthrax risk. Variable contributions to the model support pre-existing hypotheses that environmental suitability for B. anthracis depends most strongly on soil characteristics such as pH that affect spore persistence, and the extent of seasonal fluctuations in vegetation, which plays a key role in transmission for herbivores. We apply the global model to estimate that 1.83 billion people (95% credible interval: 0.59 - 4.16 billion) live within regions of anthrax risk, but most of that population faces little occupational exposure to anthrax. More informatively, a global total of 63.8 million rural poor livestock keepers (95% CI: 17.5 - 168.6 million) and 1.1 billion livestock (95% CI: 0.4 - 2.3 billion) live within vulnerable regions. Human risk is concentrated in rural areas, and human and livestock vulnerability are both concentrated in rainfed systems throughout arid and temperate land across Eurasia, Africa, and North America. 
We conclude by mapping where anthrax risk overlaps with vulnerable wild ungulate populations, and therefore could disrupt sensitive conservation efforts for species like bison, pronghorn, and saiga that coincide with anthrax-prone, mixed-agricultural landscapes.
1,953 downloads epidemiology
International sustainable development goals for the elimination of viral hepatitis as a public health problem by 2030 highlight the pressing need to optimize strategies for prevention, diagnosis and treatment. Selected or transmitted resistance associated mutations (RAMs) and vaccine escape mutations (VEMs) in hepatitis B virus (HBV) may reduce the success of existing treatment and prevention strategies. These issues are particularly pressing for many settings in Africa where there is high HBV prevalence and co-endemic HIV infection, but a lack of robust epidemiological data and limited education, diagnostics and clinical care. The prevalence, distribution and impact of RAMs and VEMs in these populations are neglected in the current literature. We therefore set out to assimilate data for sub-Saharan Africa through a systematic literature review and analysis of published sequence data, and present these in an on-line database (https://livedataoxford.shinyapps.io/1510659619-3Xkoe2NKkKJ7Drg/). The majority of the data were from HIV/HBV coinfected cohorts. The commonest RAM was rtM204I/V, either alone or in combination with compensatory mutations, and reported in both treatment-naive and treatment-experienced adults. We also identified the suite of mutations rtM204V/I + rtL180M + rtV173L that is reported in association with vaccine escape, in over 1/3 of cohorts. Although tenofovir has a high genetic barrier to drug resistance, it is of concern that emerging data suggest polymorphisms that may be associated with resistance; the precise impact of these is unknown. Overall, there is an urgent need for improved diagnostic screening, enhanced laboratory assessment of HBV before and during therapy, and sustained roll out of tenofovir in preference to lamivudine. Further data are urgently needed in order to inform population and individual approaches to HBV diagnosis, monitoring and therapy in these highly vulnerable settings.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!