Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 57,789 bioRxiv papers from 265,997 authors.
Most downloaded bioRxiv papers, since beginning of last month
56,389 results found.
394 downloads systems biology
In order to understand changes in gene expression that occur as a result of age, which might create a permissive or causal environment for age-related diseases, we produced a multi-timepoint Age-related Gene Expression Signature (AGES) from liver, kidney, skeletal muscle and hippocampus of rats, comparing 6, 9, 12, 18, 21, 24 and 27-month old animals. We focused on genes that changed in one direction throughout the lifespan of the animal, either early in life (early logistic changes); at mid-age (mid-logistic); late in life (late-logistic); or linearly, throughout the lifespan. The pathways perturbed as a result of chronological age demonstrate organ-specific and more global effects of aging, and point to mechanisms that might be counter-regulated pharmacologically in order to treat age-associated diseases. A small number of genes were regulated by aging in the same manner in every tissue, suggesting they may be more universal markers of aging.
394 downloads evolutionary biology
Previous genome-scale studies of populations living today in Ethiopia have found evidence of recent gene flow from a Eurasian source, dating to the last 3,000 years. Haplotype and genotype data based analyses of modern and ancient data (aDNA) have considered Sardinia-like proxy, broadly Levantine or Neolithic Levantine populations as a range of possible sources for this gene flow. Given the ancient nature of this gene flow and the extent of population movements and replacements that affected West Asia in the last 3,000 years, aDNA evidence would seem to be the best proxy for determining the putative population source. We demonstrate, however, that the deeply divergent, autochthonous African component, which accounts for ~50% of most contemporary Ethiopian genomes, affects the overall allele frequency spectrum to an extent that makes it hard to control for it and, at once, to discern between subtly different, yet important, Eurasian sources (such as Anatolian or Levant Neolithic ones). Here we re-assess patterns of allele sharing between the Eurasian component of Ethiopians (here called NAF for Non African) and ancient and modern proxies after having extracted NAF from Ethiopians through ancestry deconvolution, and unveil a genomic signature compatible with population movements that affected the Mediterranean area and the Levant after the fall of the Minoan civilization.
392 downloads biophysics
Transcription-factor (TF) proteins recognize specific genomic sequences, despite an overwhelming excess of non-specific DNA, to regulate complex gene expression programs. While there have been significant advances in understanding how DNA sequence and shape contribute to recognition, some fundamental aspects of protein-DNA binding remain poorly understood. Many DNA-binding proteins induce changes in the DNA structure outside the intrinsic B-DNA envelope. How the energetic cost associated with distorting DNA contributes to recognition has proven difficult to study and measure experimentally because the distorted DNA structures exist as low-abundance conformations in the naked B-DNA ensemble. Here, we use a novel high-throughput assay called SaMBA (Saturation Mismatch-Binding Assay) to investigate the role of DNA conformational penalties in TF-DNA recognition. The approach introduces mismatched base-pairs (i.e. mispairs) within TF binding sites to pre-induce a variety of DNA structural distortions much larger than those induced by changes in Watson-Crick sequence. Strikingly, while most mismatches either weakened TF binding (~70%) or had negligible effects (~20%), approximately 10% of mismatches increased binding and at least one mismatch was found that increased the binding affinity for each of 21 examined TFs. Mismatches also converted sites from the non-specific affinity range into specific sites, and high-affinity sites into 'super-sites' stronger than any known canonical binding site. These findings reveal a complex binding landscape that cannot be explained based on DNA sequence alone. Analysis of crystal structures together with NMR and molecular dynamics simulations revealed that many of the mismatches that increase binding induce distortions similar to those induced by TF binding, thus pre-paying some of the energetic cost to deform the DNA. 
Our work indicates that conformational penalties are a major determinant of protein-DNA recognition, and reveals mechanisms by which mismatches can recruit TFs and thus modulate replication and repair activities in the cell.
391 downloads neuroscience
The cortex sends a direct projection to the superior colliculus. What is largely unknown is whether (and if so how) the superior colliculus modulates activity in the cortex. Here, we directly investigate this issue, showing that optogenetic activation of superior colliculus changes the input-output relationship of neurons in somatosensory cortex during whisker movement, enhancing responses to low amplitude whisker deflections. While there is no direct pathway from superior colliculus to somatosensory cortex, we found that activation of superior colliculus drives spiking in the posterior medial (POm) nucleus of the thalamus via a powerful monosynaptic pathway. Furthermore, POm neurons receiving input from superior colliculus provide excitatory input to somatosensory cortex. Silencing POm abolished the capacity of superior colliculus to modulate cortical whisker responses. Our findings indicate that the superior colliculus, which plays a key role in attention, modulates sensory processing in somatosensory cortex via a powerful disynaptic pathway through the thalamus.
391 downloads neuroscience
Our nervous system is organized into circuits with specifically matched and tuned cell-to-cell connections that are essential for proper function. The mechanisms by which presynaptic axon terminals and postsynaptic dendrites recognize each other and establish the correct number of connections are still incompletely understood. Sperry's chemoaffinity hypothesis proposes that pre- and postsynaptic partners express specific combinations of molecules that enable them to recognize each other. Alternatively, Peters' rule proposes that presynaptic axons and postsynaptic dendrites use non-partner-derived global positional cues to independently reach their target area, and once there they randomly connect with any available neuron. These connections can then be further refined by additional mechanisms based on synaptic activity. We used the tractable genetic model system, the Drosophila embryo and larva, to test these hypotheses and elucidate the roles of 1) global positional cues, 2) partner-derived cues and 3) synaptic activity in the establishment of selective connections in the developing nerve cord. We altered the position or activity of presynaptic partners and analyzed the effect of these manipulations on the number of synapses with specific postsynaptic partners, strength of functional connections, and behavior controlled by these neurons. For this purpose, we combined developmental live imaging, electron microscopy reconstruction of circuits, functional imaging of neuronal activity, and behavioral experiments in wildtype and experimental animals. We found that postsynaptic dendrites are able to find, recognize, and connect to their presynaptic partners even when these have been shifted to ectopic locations through the overexpression of receptors for midline guidance cues. This suggests that neurons use partner-derived cues that allow them to identify and connect to each other. 
However, while partner-derived cues are sufficient for recognition between specific partners and establishment of connections, without orderly positioning of axon terminals by positional cues and without synaptic activity during embryonic development, the numbers of functional connections are altered, with significant consequences for behavior. Thus, multiple mechanisms including global positional cues, partner-derived cues, and synaptic activity contribute to proper circuit assembly in the developing Drosophila nerve cord.
389 downloads bioinformatics
Proteus is a package for downstream analysis of MaxQuant evidence data in the R environment. It provides tools for peptide and protein aggregation, quality checks, data exploration and visualisation. Interactive analysis is implemented in the Shiny framework, where individual peptides or proteins may be examined in the context of a volcano plot. Proteus performs differential expression analysis with the well-established tool limma, which offers robust treatment of missing data, frequently encountered in label-free mass-spectrometry experiments. We demonstrate on real and simulated data that limma results in improved sensitivity over random imputation combined with a t-test as implemented in the popular package Perseus. Embedding Proteus in R provides access to a wide selection of statistical and graphical tools for further analysis and reproducibility by scripting. Availability and implementation: The open-source R package, including example data and tutorials, is available to install from GitHub (https://github.com/bartongroup/proteus).
389 downloads genomics
Dave T Gerrard, Andrew A Berry, Rachel E Jennings, Matthew J Birket, Sarah J Withey, Patrick Short, Sandra Jimenez-Gancedo, Panos Firbas, Ian Donaldson, Andrew D. Sharrocks, Karen Piper Hanley, Matthew E Hurles, Jose Luis Gomez-Skarmeta, Nicoletta Bobola, Neil A Hanley
How the genome activates or silences transcriptional programmes governs organ formation. Little is known in human embryos, undermining our ability to benchmark the fidelity of in vitro stem cell differentiation or cell programming, or to interpret the pathogenicity of noncoding variation. Here, we studied histone modifications across thirteen tissues during human organogenesis. We integrated the data with transcription to build the first overview of how the human genome differentially regulates alternative organ fates, including by repression. Promoters from nearly 20,000 genes partitioned into discrete states without showing bivalency. Key developmental gene sets were actively repressed outside of the appropriate organ. Candidate enhancers, functional in zebrafish, allowed imputation of tissue-specific and shared patterns of transcription factor binding. Overlaying more than 700 noncoding mutations from patients with developmental disorders allowed correlation to unanticipated target genes. Taken together, the data provide a new, comprehensive genomic framework for investigating normal and abnormal human development.
388 downloads cell biology
Labeling and tracking biomolecules with fluorescent probes on the single molecule level enables quantitative insights into their dynamics in living cells. We previously developed Riboglow, a platform to label RNAs in live mammalian cells, consisting of a short RNA tag and a small organic probe that increases fluorescence upon binding RNA. Here, we demonstrate that Riboglow is capable of detecting and tracking single RNA molecules. We benchmark RNA tracking by comparing results with the established MS2 RNA tagging system. To demonstrate versatility of Riboglow, we assay translation on the single molecule level, where the translated mRNA is tagged with Riboglow and the nascent polypeptide is labeled with a fluorescent antibody. The growing effort to investigate RNA biology on the single molecule level requires sophisticated and diverse fluorescent probes for multiplexed, multi-color labeling of biomolecules of interest, and we present Riboglow as a new member in this toolbox.
388 downloads genetics
Wolfgang Haak, Iosif Lazaridis, Nick Patterson, Nadin Rohland, Swapan Mallick, Bastien Llamas, Guido Brandt, Susanne Nordenfelt, Eadaoin Harney, Kristin Stewardson, Qiaomei Fu, Alissa Mittnik, Eszter Bánffy, Christos Economou, Michael Francken, Susanne Friederich, Rafael Garrido Pena, Fredrik Hallgren, Valery Khartanovich, Aleksandr Khokhlov, Michael Kunst, Pavel Kuznetsov, Harald Meller, Oleg Mochalov, Vayacheslav Moiseyev, Nicole Nicklisch, Sandra L. Pichler, Roberto Risch, Manuel A. Rojo Guerra, Christina Roth, Anna Szécsényi-Nagy, Joachim Wahl, Matthias Meyer, Johannes Krause, Dorcas Brown, David Anthony, Alan Cooper, Kurt Werner Alt, David Reich
We generated genome-wide data from 69 Europeans who lived between 8,000-3,000 years ago by enriching ancient DNA libraries for a target set of almost four hundred thousand polymorphisms. Enrichment of these positions decreases the sequencing required for genome-wide ancient DNA analysis by a median of around 250-fold, allowing us to study an order of magnitude more individuals than previous studies and to obtain new insights about the past. We show that the populations of western and far eastern Europe followed opposite trajectories between 8,000-5,000 years ago. At the beginning of the Neolithic period in Europe, ~8,000-7,000 years ago, closely related groups of early farmers appeared in Germany, Hungary, and Spain, different from indigenous hunter-gatherers, whereas Russia was inhabited by a distinctive population of hunter-gatherers with high affinity to a ~24,000 year old Siberian. By ~6,000-5,000 years ago, a resurgence of hunter-gatherer ancestry had occurred throughout much of Europe, but in Russia, the Yamnaya steppe herders of this time were descended not only from the preceding eastern European hunter-gatherers, but also from a population of Near Eastern ancestry. Western and Eastern Europe came into contact ~4,500 years ago, as the Late Neolithic Corded Ware people from Germany traced ~3/4 of their ancestry to the Yamnaya, documenting a massive migration into the heartland of Europe from its eastern periphery. This steppe ancestry persisted in all sampled central Europeans until at least ~3,000 years ago, and is ubiquitous in present-day Europeans. These results provide support for the theory of a steppe origin of at least some of the Indo-European languages of Europe.
386 downloads genomics
Recent genome-wide association studies in stroke have enabled the generation of genomic risk scores (GRS) but the predictive power of these GRS has been modest in comparison to established stroke risk factors. Here, using a meta-scoring approach, we developed a metaGRS for ischaemic stroke (IS) and analysed this score in the UK Biobank (n=395,393; 3075 IS events by age 75). The metaGRS hazard ratio for IS (1.26, 95% CI 1.22-1.31 per standard deviation increase of the score) doubled that of previous GRS, enabling the identification of a subset of individuals at monogenic levels of risk: individuals in the top 0.25% of metaGRS had a three-fold increased risk of IS. The metaGRS was similarly or more predictive when compared to established risk factors, such as family history, blood pressure, body mass index and smoking status. For participants within accepted guideline levels for established stroke risk factors, we found substantial variation in incident stroke rates across genomic risk backgrounds. We further estimated combinations of reductions needed in modifiable risk factors for individuals with different levels of genomic risk and suggest that, for individuals with high metaGRS, achieving currently recommended risk factor levels may be insufficient to mitigate risk.
385 downloads scientific communication and education
Good scientific writing is essential to career development and to the progress of science. A well-structured manuscript allows readers and reviewers to get excited about the subject matter, to understand and verify the paper's contributions, and to integrate these contributions into a broader context. However, many scientists struggle with producing high-quality manuscripts and typically get little training in paper writing. Focusing on how readers consume information, we present a set of 10 simple rules to help you get across the main idea of your paper. These rules are designed to make your paper more influential and the process of writing more efficient and pleasurable.
385 downloads animal behavior and cognition
Nervous systems have evolved to combine environmental information with internal state to select and generate adaptive behavioral sequences. To better understand these computations and their implementation in neural circuits, natural behavior must be carefully measured and quantified. Here, we collect high spatial resolution video of single zebrafish larvae swimming in a naturalistic environment and develop models of their action selection across exploration and hunting. Zebrafish larvae swim in punctuated bouts separated by longer periods of rest called interbout intervals. We take advantage of this structure by categorizing bouts into discrete types and representing their behavior as labeled sequences of bout-types emitted over time. We then construct probabilistic models - specifically, marked renewal processes - to evaluate how bout-types and interbout intervals are selected by the fish as a function of its internal hunger state, behavioral history, and the locations and properties of nearby prey. Finally, we evaluate the models by their predictive likelihood and their ability to generate realistic trajectories of virtual fish swimming through simulated environments. Our simulations capture multiple timescales of structure in larval zebrafish behavior and expose many ways in which hunger state influences their action selection to promote food seeking during hunger and safety during satiety.
384 downloads biochemistry
Bacteria are continually challenged by foreign invaders including bacteriophages, and have evolved a variety of defenses against these invaders. Here, we describe the structural and biochemical mechanisms of a bacteriophage immunity pathway found in a broad array of bacteria, including pathogenic E. coli and Pseudomonas aeruginosa. This pathway employs eukaryotic-like HORMA domain proteins that recognize specific peptides, then bind and activate a cGAS/DncV-like nucleotidyltransferase (CD-NTase) to generate a cyclic tri-AMP (cAAA) second messenger; cAAA in turn activates an endonuclease effector, NucC. Signaling is attenuated by a homolog of the AAA+ ATPase Pch2/TRIP13, which binds and likely disassembles the active HORMA-CD-NTase complex. When expressed in non-pathogenic E. coli, this pathway confers immunity against bacteriophage λ infection. Our findings reveal the molecular mechanisms of a bacterial defense pathway integrating a cGAS-like nucleotidyltransferase with HORMA domain proteins for threat sensing through protein detection, and negative regulation by a Pch2-like ATPase.
383 downloads immunology
The human naive T-cell receptor (TCR) repertoire is extremely diverse and accurately estimating its distribution is challenging. We address this challenge by combining a quantitative sequencing protocol of TCRA and TCRB sequences with computational modelling. We observed the vast majority of TCR chains only once in our samples, confirming the enormous diversity of the naive repertoire. However, a substantial number of sequences were observed multiple times within samples, and we demonstrated that this is due to expression by many cells in the naive pool. We reason that α and β chains are frequently observed due to a combination of selective processes and summation over multiple clones expressing these chains. We test the contribution of both mechanisms by predicting samples from phenomenological and mechanistically modelled repertoire distributions. By comparing these with sequencing data, we show that frequently observed chains are likely to be derived from multiple clones. Still, a neutral model of T-cell homeostasis cannot account for the observed distributions. We conclude that the data are only compatible with distributions of many small clones in combination with a sufficient number of very large naive T-cell clones, the latter most likely as a result of peripheral selection.
383 downloads genomics
Understanding the relationship between the genome of a cell and its phenotype is a central problem in precision medicine. Nonetheless, genotype-to-phenotype prediction comes with great challenges for machine learning algorithms that limit their use in this setting. The high dimensionality of the data tends to hinder generalization and challenges the scalability of most learning algorithms. Additionally, most algorithms produce models that are complex and difficult to interpret. We alleviate these limitations by proposing strong performance guarantees, based on sample compression theory, for rule-based learning algorithms that produce highly interpretable models. We show that these guarantees can be leveraged to accelerate learning and improve model interpretability. Our approach is validated through an application to the genomic prediction of antimicrobial resistance, an important public health concern. Highly accurate models were obtained for 12 species and 56 antibiotics, and their interpretation revealed known resistance mechanisms, as well as some potentially new ones. An open-source disk-based implementation that is both memory and computationally efficient is provided with this work. The implementation is turnkey, requires no prior knowledge of machine learning, and is complemented by comprehensive tutorials.
383 downloads microbiology
Pyrazinamide is one of four first-line antibiotics used to treat tuberculosis. While phenotypic antibiotic susceptibility testing for pyrazinamide is problematic, genetic variation in pncA drives pyrazinamide resistance in clinical isolates. Using a derivation dataset of 291 non-redundant, missense pncA mutations with high-confidence phenotypes, we trained machine learning models to predict pyrazinamide resistance based on sequence- and structure-based features. The models were further benchmarked by predicting the pyrazinamide resistance phenotype of 2,292 clinical isolates harboring pncA missense mutations. The probabilities of resistance predicted by the model were compared with in vitro pyrazinamide minimum inhibitory concentrations of 71 isolates to determine whether the machine learning model could predict the degree of resistance. The capacity of this approach to predict the effects of all pncA missense mutations improves the sensitivity and specificity of pyrazinamide resistance prediction in genetics-based clinical microbiology workflows for tuberculosis and provides a proof-of-concept for other drugs.
380 downloads genomics
The sexually transmitted pathogen Neisseria gonorrhoeae is regarded as being on the way to becoming an untreatable superbug. Despite its clinical importance, little is known about its emergence and evolution, and how this corresponds with the introduction of antimicrobials. We present a genome-based phylogeographic analysis of 419 gonococcal isolates from across the globe. Results indicate that modern gonococci originated in Europe or Africa as late as the 16th century and subsequently disseminated globally. We provide evidence that the modern gonococcal population has been shaped by antimicrobial treatment of sexually transmitted and other infections, leading to the emergence of two major lineages with different evolutionary strategies. The well-described multi-resistant lineage is associated with high rates of homologous recombination and infection in high-risk sexual networks where antimicrobial treatment is frequent. A second, multi-susceptible lineage associated with heterosexual networks, where asymptomatic infection is more common, was also identified, with potential implications for infection control.
380 downloads neuroscience
Neuronal inactivation is commonly used to assess the involvement of groups of neurons in specific brain functions. Optogenetic tools allow manipulations of genetically and spatially defined neuronal populations with excellent temporal resolution. However, the targeted neurons are coupled with other neural populations over multiple length scales. As a result, the effects of localized optogenetic manipulations are not limited to the targeted neurons, but produce spatially extended excitation and inhibition with rich dynamics. Here we benchmarked several optogenetic silencers in transgenic mice and with viral gene transduction, with the goal of inactivating excitatory neurons in small regions of neocortex. We analyzed the effects of the perturbations in vivo using electrophysiology. Channelrhodopsin activation of GABAergic neurons produced more effective photoinhibition of pyramidal neurons than direct photoinhibition using light-gated ion pumps. We made transgenic mice expressing the light-dependent chloride channel GtACR under the control of Cre-recombinase. Activation of GtACR produced the most potent photoinhibition. For all methods, localized photostimuli produced photoinhibition that extended substantially beyond the spread of light in tissue, although different methods had slightly different resolution limits (radius of inactivation, 0.5 mm to 1 mm). The spatial profile of photoinhibition was likely shaped by strong coupling between cortical neurons. Over some range of photostimulation, circuits produced the "paradoxical effect", where excitation of inhibitory neurons reduced activity in these neurons, together with pyramidal neurons, a signature of inhibition-stabilized neural networks. The offset of optogenetic inactivation was followed by rebound excitation in a light dose-dependent manner, which can be mitigated by slowly varying photostimuli, but at the expense of time resolution.
Our data offer guidance for the design of in vivo optogenetics experiments and suggest how these experiments can reveal operating principles of neural circuits.
380 downloads bioengineering
Analyzing the spatial organization of molecules in cells and tissues is a cornerstone of biological research and clinical practice. However, despite enormous progress in profiling the molecular constituents of cells, spatially mapping these constituents remains a disjointed and machinery-intensive process, relying on either light microscopy or direct physical registration and capture. Here, we demonstrate DNA microscopy, a new imaging modality for scalable, optics-free mapping of relative biomolecule positions. In DNA microscopy of transcripts, transcript molecules are tagged in situ with randomized nucleotides, labeling each molecule uniquely. A second in situ reaction then amplifies the tagged molecules, concatenates the resulting copies, and adds new randomized nucleotides to uniquely label each concatenation event. An algorithm decodes molecular proximities from these concatenated sequences, and infers physical images of the original transcripts at cellular resolution. Because its imaging power derives entirely from diffusive molecular dynamics, DNA microscopy constitutes a chemically encoded microscopy system.
379 downloads cancer biology
Russell C. Rockne, Sergio Branciamore, Jing Qi, David Frankhouser, Denis O'Meally, Wei-Kai Hua, Guerry J. Cook, Lianjun Zhang, Emily Carnahan, Ayelet Marom, Herman Wu, Davide Maestrini, Xiwei Wu, Yate-Ching Yuan, Zheng Liu, Leo D. Wang, Stephen Forman, Nadia Carlesso, Ya-Huei Kuo, Guido Marcucci
Temporal dynamics of gene expression are informative of changes associated with disease development and evolution. Given the complexity of high-dimensional temporal datasets, an analytical framework guided by a robust theory is needed to interpret time-sequential changes and to predict system dynamics. Herein, we use acute myeloid leukemia as a proof-of-principle to model gene expression dynamics in a transcriptome state-space constructed based on time-sequential RNA-sequencing data. We describe the construction of a state-transition model to identify state-transition critical points which accurately predicts leukemia development. We show that an analytical approach based on state-transition critical points identified step-wise transcriptomic perturbations driving leukemia progression. Furthermore, the gene(s) trajectory and geometry of the transcriptome state-space provide biologically relevant gene expression signals that are not synchronized in time, and allow quantification of gene(s) contribution to leukemia development. Therefore, our state-transition model can synthesize information, identify critical points to guide interpretation of transcriptome trajectories and predict disease development.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!