Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 57,294 bioRxiv papers from 263,837 authors.
Most downloaded bioRxiv papers, since beginning of last month
55,718 results found.
293 downloads neuroscience
Neuronal ensembles that hold specific memory (memory engrams) have been identified in the hippocampus, amygdala, and cortex. It has been hypothesized that engrams for a specific memory are distributed among multiple brain regions that are functionally connected. Here, we report the hitherto most extensive engram map for contextual fear memory by characterizing activity-tagged neurons in 409 regions using SHIELD-based tissue phenotyping. The mapping was aided by a novel engram index, which identified cFos+ brain regions holding engrams with a high probability. Optogenetic manipulations confirmed previously known engrams and revealed new engrams. Many of these engram-holding regions were functionally connected to the CA1 or amygdala engrams. Simultaneous chemogenetic reactivation of multiple engrams, which mimics natural memory recall, conferred a greater level of memory recall than reactivation of a single engram ensemble. Overall, our study supports the hypothesis that a memory is stored in functionally connected engrams distributed across multiple brain regions.
293 downloads pathology
A quantitative model to genetically interpret the histology in whole microscopy slide images is desirable to guide downstream immunohistochemistry, genomics, and precision medicine. We constructed a statistical model that predicts whether or not SPOP is mutated in prostate cancer, given only the digital whole slide after standard hematoxylin and eosin [H&E] staining. Using a TCGA cohort of 177 prostate cancer patients where 20 had mutant SPOP, we trained multiple ensembles of residual networks, accurately distinguishing SPOP mutant from SPOP non-mutant patients (test AUROC=0.74, p=0.0007 Fisher's Exact Test). We further validated our full metaensemble classifier on an independent test cohort from MSK-IMPACT of 152 patients where 19 had mutant SPOP. Mutants and non-mutants were accurately distinguished despite TCGA slides being frozen sections and MSK-IMPACT slides being formalin-fixed paraffin-embedded sections (AUROC=0.86, p=0.0038). Moreover, we scanned an additional 36 MSK-IMPACT patients having mutant SPOP, trained on this expanded MSK-IMPACT cohort (test AUROC=0.75, p=0.0002), tested on the TCGA cohort (AUROC=0.64, p=0.0306), and again accurately distinguished mutants from non-mutants using the same pipeline. Importantly, our method demonstrates tractable deep learning in this "small data" setting of 20-55 positive examples and quantifies each prediction's uncertainty with confidence intervals. To our knowledge, this is the first statistical model to predict a genetic mutation in cancer directly from the patient's digitized H&E-stained whole microscopy slide. Moreover, this is the first time quantitative features learned from patient genetics and histology have been used for content-based image retrieval, finding similar patients for a given patient where the histology appears to share the same genetic driver of disease, i.e. SPOP mutation (p=0.0241 Kost's Method), and finding similar patients for a given patient that does not have that driver mutation (p=0.0170 Kost's Method).
292 downloads genomics
Ziyi Xiong, Gabriela Dankova, Laurence Howe, Myoung Lee, Pirro Hysi, Markus de Jong, Gu Zhu, Kaustubh Adhikari, Dan Li, Yi Li, Bo Pan, Eleanor Feingold, Mary Marazita, John Shaffer, Kerrie McAloney, Shuhua Xu, Li Jin, Sijia Wang, Femke Vrij, Bas Lendemeijer, Stephen Richmond, Alexei Zhurov, Sarah Lewis, Gemma Sharp, Lavinia Paternoster, Holly Thompson, Rolando Gonzales-Jose, Maria Catira Bortolini, Samuel Canizales-Quinteros, Carla Gallo, Giovanni Poletti, Gabriel Bedoya, Francisco Rothhammer, Andre Uitterlinden, M Arfan Ikram, Eppo Wolvius, Steven Kushner, Tamar Nijsten, Robert-Jan Palstra, Stefan Boehringer, Sarah Medland, Kun Tang, Andres Ruiz-Linares, Nicholas Martin, Timothy Spector, Evie Stergiakouli, Seth Weinberg, Fan Liu, Manfred Kayser
The human face represents a combined set of highly heritable phenotypes, but knowledge on its genetic architecture remains limited despite the relevance for various fields of science and application. A series of genome-wide association studies on 78 facial shape phenotypes quantified from 3-dimensional facial images of 10,115 Europeans identified 24 genetic loci reaching genome-wide significant association, among which 17 were previously unreported. A multi-ethnic study in an additional 7,917 individuals confirmed 13 loci including 8 unreported ones. A global map of polygenic face scores assembled facial features in major continental groups consistent with anthropological knowledge. Analyses of epigenomic datasets from cranial neural crest cells revealed abundant cis-regulatory activities at the face-associated genetic loci. Luciferase reporter assays in neural crest progenitor cells highlighted enhancer activities of several face-associated DNA variants. These results substantially advance our understanding of the genetic basis underlying human facial variation and provide candidates for future in-vivo functional studies.
292 downloads bioinformatics
One common task in Computational Biology is the prediction of aspects of protein function and structure from their amino acid sequence. For 26 years, most state-of-the-art approaches toward this end have been marrying machine learning and evolutionary information, resulting in the need to retrieve related proteins at increasing cost from ever growing sequence databases. This search is so time-consuming that it often renders the analysis of entire proteomes infeasible. On top of that, evolutionary information is less powerful for small families, e.g. for proteins from the Dark Proteome. Here, we introduced a novel way to represent protein sequences as continuous vectors (embeddings) by using the deep bidirectional language model ELMo. The model effectively captured the biophysical properties of protein sequences from unlabeled big data (UniRef50). We showed how, after training, this knowledge was transferred to single protein sequences by predicting relevant sequence features. We referred to these new embeddings as SeqVec (Sequence-to-Vector) and demonstrated their effectiveness by training simple neural networks on existing data sets for two completely different prediction tasks. At the per-residue level, we improved secondary structure (for NetSurfP-2.0 data set: Q3=79%±1, Q8=68%±1) and disorder predictions (MCC=0.59±0.03) that use only single protein sequences by a large margin. At the per-protein level, we predicted subcellular localization in ten classes (for DeepLoc dataset: Q10=68%±1) and distinguished membrane-bound from water-soluble proteins (Q2=87%±1). All results built upon embeddings gained from the new tool SeqVec. These are derived from the target protein’s sequence alone. Where the lightning-fast HHblits needed on average several minutes to generate the evolutionary information for a target protein, SeqVec created the vector representation on average in 0.027 seconds.
Availability: SeqVec: <https://github.com/Rostlab/SeqVec>; prediction server: <https://embed.protein.properties>
Abbreviations: 1D: one-dimensional, information representable in a string such as secondary structure or solvent accessibility; 3D: three-dimensional; 3D structure: three-dimensional coordinates of protein structure; MCC: Matthews correlation coefficient; RSA: relative solvent accessibility
292 downloads bioinformatics
We introduce a simple new approach to variable selection in linear regression, and to quantifying uncertainty in selected variables. The approach is based on a new model -- the "Sum of Single Effects" (SuSiE) model -- which comes from writing the sparse vector of regression coefficients as a sum of "single-effect" vectors, each with one non-zero element. We also introduce a corresponding new fitting procedure -- Iterative Bayesian Stepwise Selection (IBSS) -- which is a Bayesian analogue of traditional stepwise selection methods. IBSS shares the computational simplicity and speed of traditional stepwise methods, but instead of selecting a single variable at each step, IBSS computes a distribution on variables that captures uncertainty in which variable to select. We show that the IBSS algorithm computes a variational approximation to the posterior distribution under the SuSiE model. Further, this approximate posterior distribution naturally leads to a convenient, novel way to summarize uncertainty in variable selection, and provides a Credible Set for each selected variable. Our methods are particularly well suited to settings where variables are highly correlated and true effects are very sparse, both of which are characteristics of genetic fine-mapping applications. We demonstrate through numerical experiments that our methods outperform existing methods for this task, and illustrate the methods by fine-mapping genetic variants that influence alternative splicing in human cell-lines. We also discuss both the potential and the challenges for applying these methods to generic variable selection problems.
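As an illustration of the "sum of single effects" construction described in this abstract, the following minimal Python sketch builds a sparse coefficient vector as a sum of single-effect vectors, each with one non-zero element. The function name, the dimensions, and the normal distribution for effect sizes are assumptions made for illustration and are not taken from the preprint. The sketch only shows the model's structure and does not implement IBSS.

```python
import random


def sum_of_single_effects(p, n_effects, seed=0):
    """Build a length-p coefficient vector as a sum of single-effect vectors.

    Hypothetical helper for illustration: each single-effect vector has
    exactly one non-zero element, placed at a randomly chosen variable.
    """
    rng = random.Random(seed)
    single_effects = []
    for _ in range(n_effects):
        vec = [0.0] * p
        idx = rng.randrange(p)          # variable that carries this effect
        vec[idx] = rng.gauss(0.0, 1.0)  # assumed effect-size distribution
        single_effects.append(vec)
    # The overall coefficient vector is the element-wise sum.
    b = [sum(vec[j] for vec in single_effects) for j in range(p)]
    return single_effects, b


single_effects, b = sum_of_single_effects(p=10, n_effects=3)
print(b)
```

Because each component contributes at most one non-zero entry, the summed vector has at most `n_effects` non-zero coefficients, which is the sparsity the abstract refers to.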
292 downloads pharmacology and toxicology
Psychiatric and neurodegenerative illnesses are characterized by cognitive impairments, in particular deficits in working memory, decision making, and executive functions including cognitive flexibility. However, the neuropharmacology of these cognitive functions is poorly understood. The serotonin (5-HT) 2A receptor might be a promising candidate for the modulation of cognitive processes. However, pharmacological studies investigating the role of this receptor system in humans are rare. Recent evidence demonstrates that the effects of Lysergic acid diethylamide (LSD) are mediated via agonistic action at the 5-HT2A receptor. Yet, the effects of LSD on specific cognitive domains using standardized neuropsychological tests have not been studied. Therefore, we examined the acute effects of LSD (100 micrograms) alone and in combination with the 5-HT2A antagonist ketanserin (40 mg) on cognition, employing a double-blind, randomized, placebo-controlled, within-subject design in 25 healthy participants. Executive functions, cognitive flexibility, spatial working memory, and risk-based decision-making were examined by the Intra/Extra-Dimensional shift task (IED), Spatial Working Memory task (SWM), and Cambridge Gambling Task (CGT) of the Cambridge Neuropsychological Test Automated Battery. Compared to placebo, LSD significantly impaired executive functions, cognitive flexibility, and working memory on the IED and SWM, but did not influence quality of decision-making and risk-taking on the CGT. Pretreatment with the 5-HT2A antagonist ketanserin normalized all LSD-induced cognitive deficits. The present findings highlight the role of the 5-HT2A receptor system in executive functions and working memory and suggest that specific 5-HT2A antagonists may be relevant for improving cognitive dysfunctions in psychiatric disorders.
292 downloads developmental biology
The evolution of embryological development has long been characterized by deep conservation. Both morphological and transcriptomic surveys have proposed an 'hourglass' model of Evo-Devo. A stage in mid-embryonic development, the phylotypic stage, is highly conserved among species within the same phylum. However, the reason for this phylotypic stage is still elusive. Here we hypothesize that the phylotypic stage might be characterized by selection for robustness to noise and environmental perturbations. This could lead to mutational robustness, thus evolutionary conservation of expression and the hourglass pattern. To test this, we quantified expression variability of single embryo transcriptomes throughout fly Drosophila melanogaster embryogenesis. We found that indeed expression variability is lower at extended germband, the phylotypic stage. We explain this pattern by stronger histone modification mediated transcriptional noise control at this stage. In addition, we find evidence that histone modifications can also contribute to mutational robustness in regulatory elements. Thus, the robustness to noise does indeed contribute to robustness of gene expression to genetic variations, and to the conserved phylotypic stage.
291 downloads immunology
The targeting of metabolic pathways is emerging as an exciting new approach for modulating immune cell function and polarization states. In this study, carbon tracing and systems biology approaches integrating metabolomic and transcriptomic profiling data were used to identify adaptations in human T cell metabolism important for fueling pro-inflammatory T cell function. Results of this study demonstrate that T cell receptor (TCR) stimulation leads to a significant increase in glucose and amino acid metabolism that triggers downstream biosynthetic processes. Specifically, increased expression of several enzymes such as CTPS1, IL4I1, and ASL results in the reprogramming of amino acid metabolism. Additionally, the strength of TCR signaling resulted in different metabolic enzymes utilized by T cells to facilitate similar biochemical endpoints. Furthermore, this study shows that cyclosporine represses the pathways involved in amino acid and glucose metabolism, providing novel insights on the immunosuppressive mechanisms of this drug. To explore the implications of the findings of this study in clinical settings, conventional immunosuppressants were tested in combination with drugs that target metabolic pathways. Results showed that such combinations increased efficacy of conventional immunosuppressants. Overall, the results of this study provide a comprehensive resource for identifying metabolic targets for novel combinatorial regimens in the treatment of intractable immune diseases.
290 downloads developmental biology
Characterization of the morphological structure during hair follicle development has been well documented, while the current understanding of the molecular mechanisms involved in follicle development remains limited. Here, using unbiased single-cell RNA sequencing, we analyzed 15,086 single cell transcriptome profiles from E13.5 and E16.5 fetal mice, and newborn mouse (postnatal day 0, P0) dorsal skin cells. Based on t-distributed Stochastic Neighbor Embedding (tSNE) clustering, we identified 14 cell clusters from skin cells and delineated their cell identity gene expression profiles. Pseudotime ordering analysis successfully constructed epithelium/dermal cell lineage differentiation trajectory and revealed sequential activation of key regulons involved during cell fate decisions. Along with this, intercellular communication between different cell populations was inferred based on a priori knowledge of ligand-receptor pairs. Together, our findings here provide a molecular landscape during hair follicle epithelium/dermal cell lineage fate decisions, and more importantly, recapitulate sequential activation of core regulatory transcriptional factors for different cell populations during hair follicle morphogenesis.
290 downloads genomics
RNA sequencing (RNA-seq) is a sensitive and accurate method for quantifying gene expression. Small samples or those whose RNA is degraded, such as formalin-fixed, paraffin-embedded (FFPE) tissue, remain challenging to study with nonspecialized RNA-seq protocols. Here we present a new method, Smart-3SEQ, that accurately quantifies transcript abundance even with small amounts of total RNA and effectively characterizes small samples extracted by laser-capture microdissection (LCM) from FFPE tissue. We also obtain distinct biological profiles from FFPE single cells, which have been impossible to study with previous RNA-seq protocols, and we use these data to identify possible new macrophage phenotypes associated with the tumor microenvironment. We propose Smart-3SEQ as a highly cost-effective method to enable large gene-expression profiling experiments unconstrained by sample size and tissue availability. In particular, Smart-3SEQ's compatibility with FFPE tissue unlocks an enormous number of archived clinical samples, and combined with LCM it allows unprecedented studies of small cell populations and single cells isolated by their in situ context.
290 downloads neuroscience
CLARITY is a tissue clearing method, which enables immunostaining and imaging of large volumes for 3D-reconstruction. The method was initially time-consuming, expensive and relied on electrophoresis to remove lipids to make the tissue transparent. Since then several improvements and simplifications have emerged, such as passive clearing (PACT) and methods to improve tissue staining. Here, we review advances and compare current applications with the aim of highlighting needed improvements as well as aiding selection of the specific protocol for use in future investigations.
290 downloads microbiology
Understanding the drivers of microbial diversity is a fundamental question in microbial ecology. Extensive literature discusses different methods for describing microbial diversity and documenting its effects on ecosystem function. However, it is widely believed that diversity depends on the number of reads that are sequenced. I discuss a statistical perspective on diversity, framing the diversity of an environment as an unknown parameter, and discussing the bias and variance of plug-in and rarefied estimates. I argue that by failing to account for both bias and variance, we invalidate analysis of alpha diversity. I describe the state of the statistical literature for addressing these problems, and suggest that measurement error modeling can address issues with variance, but bias corrections need to be utilized as well. I encourage microbial ecologists to avoid motivating their investigations with alpha diversity analyses that do not use valid statistical methodology.
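To make the dependence of a plug-in diversity estimate on sequencing depth concrete, here is a small Python simulation, offered as a sketch rather than as the author's method. It samples reads from a community of known richness and counts the distinct taxa observed. The community (200 taxa with skewed abundances), the read depths, and the function name are all hypothetical choices for illustration.

```python
import random


def observed_richness(abundances, n_reads, rng):
    """Plug-in richness: number of distinct taxa seen among n_reads reads."""
    taxa = range(len(abundances))
    reads = rng.choices(taxa, weights=abundances, k=n_reads)
    return len(set(reads))


rng = random.Random(1)
# Hypothetical community: 200 taxa with skewed relative abundances.
abundances = [1.0 / (rank + 1) for rank in range(200)]
true_richness = len(abundances)

for depth in (100, 1000, 10000):
    estimates = [observed_richness(abundances, depth, rng) for _ in range(20)]
    mean_estimate = sum(estimates) / len(estimates)
    # Columns: read depth, mean observed richness, true richness.
    print(depth, round(mean_estimate, 1), true_richness)
```

The observed count can never exceed the true richness, so the plug-in estimate is biased downward, and the spread across the 20 replicates at each depth gives a rough sense of its variance.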
289 downloads neuroscience
A decade after the first successful attempt to decode speech directly from human brain signals, accuracy and speed remain far below that of natural speech or typing. Here we show how to achieve high accuracy from the electrocorticogram at natural-speech rates, even with few data (on the order of half an hour of spoken speech). Taking a cue from recent advances in machine translation and automatic speech recognition, we train a recurrent neural network to map neural signals directly to word sequences (sentences). In particular, the network first encodes a sentence-length sequence of neural activity into an abstract representation, and then decodes this representation, word by word, into an English sentence. For each participant, training data consist of several spoken repeats of a set of some 30-50 sentences, along with the corresponding neural signals at each of about 250 electrodes distributed over peri-Sylvian speech cortices. Average word error rates across a validation (held-out) sentence set are as low as 7% for some participants, as compared to the previous state of the art of greater than 60%. Finally, we show how to use transfer learning to overcome limitations on data availability: Training certain components of the network under multiple participants' data, while keeping other components (e.g., the first hidden layer) "proprietary," can improve decoding performance--despite very different electrode coverage across participants.
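The word error rate quoted in this abstract is conventionally computed as the word-level edit distance between the decoded and reference sentences, divided by the reference length. The sketch below implements that conventional definition in Python. The preprint's exact scoring procedure is not given here, so treat the function and the example sentences as illustrative assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the words of ref processed
    # so far and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hyp
            insertion = curr[j - 1] + 1  # extra word in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up sentences: one substituted word out of six.
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Because insertions count as errors, this measure can exceed 1.0 when the decoded sentence is much longer than the reference.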
289 downloads microbiology
Termite mounds have recently been confirmed to mitigate approximately half of termite methane (CH4) emissions, but the aerobic methane-oxidizing bacteria (methanotrophs) responsible for this consumption have not been resolved. Here we describe the abundance, composition, and kinetics of the methanotroph communities in the mounds of three distinct termite species. We show that methanotrophs are rare members of the termite mound biosphere and have a comparable abundance, but distinct composition, to those of adjoining soil samples. Across all mounds, the most abundant and prevalent particulate methane monooxygenase sequences detected were affiliated with Upland Soil Cluster α (USCα), with sequences homologous to Methylocystis and Tropical Upland Soil Cluster also detected. The Michaelis-Menten kinetics of CH4 oxidation in mounds were estimated from in situ reaction rates. The apparent CH4 affinities of the communities were in the low micromolar range, which is one to two orders of magnitude higher than those of upland soils, but significantly lower than those measured in soils with a large CH4 source such as landfill-cover soils. The rate constant of CH4 oxidation, as well as the porosity of the mound material, were significantly positively correlated with the abundance of methanotroph communities of termite mounds. We conclude that termite-derived CH4 emissions have selected for unique methanotroph communities that are kinetically adapted to elevated CH4 concentrations. However, factors other than substrate concentration appear to limit methanotroph abundance and hence these bacteria only partially mitigate termite-derived CH4 emissions. Our results also highlight the predominant role of USCα in an environment with elevated CH4 concentrations and suggest a higher functional diversity within this group than previously recognised.
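For readers unfamiliar with the kinetic model named in this abstract, the Michaelis-Menten rate law relates the oxidation rate v to the substrate concentration S as v = Vmax * S / (Km + S), where Km is the half-saturation constant often reported as the apparent affinity. The Python sketch below evaluates that formula. The parameter values are hypothetical and are not estimates from the preprint.

```python
def michaelis_menten_rate(substrate, v_max, k_m):
    """Return v = v_max * S / (k_m + S) for substrate concentration S."""
    if substrate < 0 or k_m <= 0:
        raise ValueError("substrate must be >= 0 and k_m must be > 0")
    return v_max * substrate / (k_m + substrate)


# Hypothetical values chosen only to show the shape of the curve.
V_MAX = 10.0  # maximum rate, arbitrary units
K_M = 2.0     # half-saturation constant, same units as the substrate

for s in (0.2, 2.0, 20.0):
    print(s, michaelis_menten_rate(s, V_MAX, K_M))
```

At S equal to Km the rate is exactly half of Vmax, which is why a lower Km is read as a higher affinity for the substrate.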
289 downloads biochemistry
ESCRT-III proteins assemble into ubiquitous membrane-remodeling polymers during many cellular processes. Here we describe the structure of helical membrane tubes that are scaffolded by bundled ESCRT-III filaments. Cryo-ET reveals how the shape of the helical membrane tube arises from the assembly of distinct bundles of protein filaments that bind the membrane with different mean curvatures. Cryo-EM reveals how one of these ESCRT-III filaments engages the membrane tube through a novel interface. Mathematical modeling of the helical membrane tube suggests how its shape emerges from differences in membrane binding energy, positional rigidity, and membrane tension. Altogether, our findings support a model in which increasing the rigidity of ESCRT-III filaments through the assembly of multi-strands triggers buckling of the membrane.
289 downloads neuroscience
Microglia play key roles in regulating synapse development and refinement in the developing brain, but it is unknown whether they are similarly involved during adult neurogenesis. By transiently ablating microglia from the healthy adult mouse brain, we show that microglia are necessary for the normal functional development of adult-born granule cells (abGCs) in the olfactory bulb. Microglia ablation reduces the odor responses of developing, but not preexisting GCs in vivo in both awake and anesthetized mice. Microglia preferentially target their motile processes to interact with mushroom spines on abGCs, and when microglia are absent, abGCs develop smaller spines and receive weaker excitatory synaptic inputs. These results suggest that microglia promote the development of excitatory synapses onto developing abGCs, which may impact the function of these cells in the olfactory circuit.
288 downloads genomics
Autism spectrum disorder (ASD) is a phenotypically and genetically heterogeneous neurodevelopmental disorder. Despite this heterogeneity, previous studies have shown patterns of molecular convergence in post-mortem brain tissue from autistic subjects. Here, we integrate genome-wide measures of mRNA expression, miRNA expression, DNA methylation, and histone acetylation from ASD and control brains to identify a convergent molecular subtype of ASD with shared dysregulation across both the epigenome and transcriptome. Focusing on this convergent subtype, we substantially expand the repertoire of differentially expressed genes in ASD and identify a component of upregulated immune processes that are associated with hypomethylation. We utilize eQTL and chromosome conformation datasets to link differentially acetylated regions with their cognate genes and identify an enrichment of ASD genetic risk variants in hyperacetylated noncoding regulatory regions linked to neuronal genes. These findings help elucidate how diverse genetic risk factors converge onto specific molecular processes in ASD.
288 downloads genomics
Isidro Cortes-Ciriano, June-Koo Lee, Ruibin Xi, Dhawal Jain, Youngsook L Jung, Lixing Yang, Dmitry Gordenin, Leszek J. Klimczak, Cheng-Zhong Zhang, David S Pellman, Peter J Park, PCAWG Structural Variation Working Group, ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Network
Chromothripsis is a newly discovered mutational phenomenon involving massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in cancer suggest that chromothripsis may be far more common than initially inferred from low resolution DNA copy number data. Here, we analyze the patterns of chromothripsis across 2,658 tumors spanning 39 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of >50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy number states, a considerable fraction of the events involves multiple chromosomes as well as additional structural alterations. In addition to non-homologous end-joining, we detect signatures of replicative processes and templated insertions. Chromothripsis contributes to oncogene amplification as well as to inactivation of genes such as mismatch-repair related genes. These findings show that chromothripsis is a major process driving genome evolution in human cancer.
288 downloads plant biology
The circadian clock regulates various physiological responses. To achieve this, both animals and plants have distinct circadian clocks in each tissue that are optimized for that tissue's respective functions. However, if and how the tissue-specific circadian clocks are involved in specification of cell types remains unclear. Here, by implementing a single-cell transcriptome with a new analytics pipeline, we have reconstructed an actual time-series of the cell differentiation process at single-cell resolution, and discovered that the Arabidopsis circadian clock is involved in the process of cell differentiation through transcription factor BRI1-EMS SUPPRESSOR 1 (BES1) signaling. In this pathway, direct repression of LATE ELONGATED HYPOCOTYL (LHY) expression by BES1 triggers reconstruction of the circadian clock in stem cells. The reconstructed circadian clock regulates cell differentiation through fine-tuning of key factors for epigenetic modification, cell-fate determination, and the cell cycle. Thus, the establishment of circadian systems precedes cell differentiation and specifies cell types.
288 downloads bioinformatics
Droplet-based microfluidic devices have become widely used to perform single-cell RNA sequencing (scRNA-seq) and discover novel cellular heterogeneity in complex biological systems. However, ambient RNA present in the cell suspension can be incorporated into these droplets and aberrantly counted along with a cell's native mRNA. This results in cross-contamination of transcripts between different cell populations and can potentially decrease the precision of downstream analyses. We developed a novel hierarchical Bayesian method called DecontX to estimate and remove contamination in individual cells from scRNA-seq data. DecontX accurately predicted the proportion of contaminated counts in a mixture of mouse and human cells. Decontamination of PBMC datasets removed aberrant expression of cell type specific marker genes from other cell types and improved overall separation of cell clusters. In general, DecontX can be incorporated into scRNA-seq workflows to assess quality of dissociation protocols and improve downstream analyses.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!