Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 57,294 bioRxiv papers from 263,837 authors.
Most downloaded bioRxiv papers, since beginning of last month
55,837 results found.
271 downloads genomics
David Jakubosky, Matteo D'Antonio, Marc Jan Bonder, Craig Smail, Margaret K.R. Donovan, William W Young Greenwald, Agnieszka D'Antonio-Chronowska, Hiroko Matsui, i2QTL Consortium, Oliver Stegle, Erin N Smith, Stephen B. Montgomery, Christopher DeBoever, Kelly A Frazer
Structural variants (SVs) and short tandem repeats (STRs) comprise a broad group of diverse DNA variants which vastly differ in their sizes and distributions across the genome. Here, we show that different SV classes and STRs differentially impact gene expression and complex traits. Functional differences between SV classes and STRs include their genomic locations relative to eGenes, likelihood of being associated with multiple eGenes, associated eGene types (e.g., coding, noncoding, level of evolutionary constraint), effect sizes, linkage disequilibrium with tagging single nucleotide variants used in GWAS, and likelihood of being associated with GWAS traits. We also identified a set of high-impact SVs/STRs associated with the expression of three or more eGenes via chromatin loops and showed they are highly enriched for being associated with GWAS traits. Our study provides insights into the genomic properties of structural variant classes and short tandem repeats that impact gene expression and human traits.
271 downloads neuroscience
Sleep is a near universal phenomenon whose function remains controversial. An influential theory of sleep function posits that ecological factors that place animals in harm's way increase sleep as a state of adaptive inactivity. Here we find that manipulations that impair flight in Drosophila increase sleep. Further, we identify a novel neural pathway from peripheral wing sensory neurons to the central brain that mediates the change in sleep. Moreover, we show that flight impairments activate and induce structural plasticity in specific projection neurons to support increases in sleep over days. Thus, chemosensory neurons do not only signal sensory cues but also appear to provide information on wing-integrity to support behavioural adaptability. Together, these data provide mechanistic support of adaptive increases in sleep and highlight the importance of behavioural flexibility for fitness and survival.
271 downloads evolutionary biology
Performance tradeoffs are ubiquitous in both ecological and evolutionary modeling, yet are usually postulated and built into fitness and ecological landscapes. But tradeoffs depend on genetic background and evolutionary history, and can themselves evolve. We present a simple model capable of capturing the key feedback loop: evolutionary history shapes tradeoff strength, which, in turn, shapes evolutionary future. One consequence of this feedback is that genomes with identical fitness can have different evolutionary properties, shaped by prior environmental exposure. Another is that, generically, the best adaptations to one environment may evolve in another. Our minimal model highlights the need for analysis of simple models capable of incorporating explicit dependence on environment, and can serve as a rich playground for investigating evolution in multiple or changing environments.
270 downloads neuroscience
Julie A Harris, Stefan Mihalas, Karla E Hirokawa, Jennifer D Whitesell, Joseph Knox, Amy Bernard, Phillip Bohn, Shiella Caldejon, Linzy Casal, Andrew Cho, David Feng, Nathalie Gaudreault, Charles Gerfen, Nile Graddis, Peter A. Groblewski, Alex Henry, Anh Ho, Robert Howard, Leonard Kuan, Jerome Lecoq, Jennifer Luviano, Stephen McConoghy, Marty Mortrud, Maitham Naeemi, Lydia Ng, Seung W Oh, Benjamin Ouellette, Staci Sorensen, Wayne Wakeman, Quanxin Wang, Ali Williford, John Phillips, Allan Jones, Christof Koch, Hongkui Zeng
The mammalian cortex is a laminar structure composed of many cell types densely interconnected in complex ways. Recent systematic efforts to map the mouse mesoscale connectome provide comprehensive projection data on interareal connections, but not at the level of specific cell classes or layers within cortical areas. We present here a significant expansion of the Allen Mouse Brain Connectivity Atlas, with ~1,000 new axonal projection mapping experiments across nearly all isocortical areas in 49 Cre driver lines. Using 13 lines selective for cortical layer-specific projection neuron classes, we identify the differential contribution of each layer/class to the overall intracortical connectivity patterns. We find layer 5 (L5) projection neurons account for essentially all intracortical outputs. L2/3, L4, and L6 neurons contact a subset of the L5 cortical targets. We also describe the most common axon lamination patterns in cortical targets. Most patterns are consistent with previous anatomical rules used to determine hierarchical position between cortical areas (feedforward, feedback), with notable exceptions. While diverse target lamination patterns arise from every source layer/class, L2/3 and L4 neurons are primarily associated with feedforward type projection patterns and L6 with feedback. L5 has both feedforward and feedback projection patterns. Finally, network analyses revealed a modular organization of the intracortical connectome. By labeling interareal and intermodule connections as feedforward or feedback, we present an integrated view of the intracortical connectome as a hierarchical network.
270 downloads cell biology
Tendon disorders frequently occur and recent evidence has clearly implicated the presence of immune cells and inflammatory events during early tendinopathy. However, the origin and properties of these cells remain poorly defined. Therefore, the aim of this study was to determine the presence of myeloid cells in healthy rodent and human tendon tissue and to characterize them. Using various transgenic reporter mouse models, we demonstrate the presence of tendon cells in the dense matrix of the tendon core expressing the fractalkine (Fkn) receptor CX3CR1 and its cognate ligand CX3CL1/Fkn. Pro-inflammatory stimulation of 3D tendon-like constructs in vitro resulted in a significant increase in the expression of IL-1beta, IL-6, Mmp3, Mmp9, Cx3cl1, and epiregulin, which has been reported to contribute to inflammation, wound healing, and tissue repair. Furthermore, we demonstrate that inhibition of the fractalkine receptor blocked tendon cell migration in vitro and show the presence of CX3CR1/CX3CL1/EREG expressing cells in healthy human tendons. Taken together, we demonstrate the presence of CX3CL1+/CX3CR1+ 'tenophages' within the healthy tendon proper potentially fulfilling surveillance functions in tendons.
270 downloads cancer biology
Livnat Jerby, Cyril Neftel, Marni E. Shore, Matthew J. McBride, Brian Haas, Benjamin Izar, Hannah R. Weissman, Angela Volorio, Gaylor Boulay, Luisa Cironi, Alyssa R. Richman, Liliane C. Broye, Joseph M. Gurski, Christina C. Luo, Ravindra Mylvaganam, Lan Nguyen, Shaolin Mei, Johannes C. Melms, Christophe Georgescu, Ofir Cohen, Jorge E Buendia-Buendia, Michael S Cuoco, Danny Labes, Daniel R. Zollinger, Joseph M. Beechem, Petur Nielsen, Ivan Chebib, Gregory Cote, Edwin Choy, Igor Letovanec, Stephane Cherix, Nikhil Wagle, Peter K Sorger, Alex B. Haynes, John T. Mullen, Ivan Stamenkovic, Miguel N. Rivera, Cigall Kadoch, Orit Rozenblatt-Rosen, Mario L. Suva, Nicolo Riggi, Aviv Regev
Synovial sarcoma is an aggressive mesenchymal neoplasm, driven by the SS18-SSX fusion, and characterized by immunogenic antigens expression and exceptionally low T cell infiltration levels. To study the cancer-immune interplay in this disease, we profiled 16,872 cells from 12 human synovial sarcoma tumors using single-cell RNA-sequencing (scRNA-Seq). Synovial sarcoma manifests antitumor immunity, high cellular plasticity and a core oncogenic program, which is predictive of low immune levels and poor clinical outcomes. Using genetic and pharmacological perturbations, we demonstrate that the program is controlled by the SS18-SSX driver and repressed by cytokines secreted by macrophages and T cells in the tumor microenvironment. Network modeling predicted that SS18-SSX promotes the program through HDAC1 and CDK6. Indeed, the combination of HDAC and CDK4/6 inhibitors represses the program, induces immunogenic cell states, and selectively targets synovial sarcoma cells. Our study demonstrates that immune evasion, cellular plasticity, and cell cycle are co-regulated and can be co-targeted in synovial sarcoma and potentially in other malignancies.
270 downloads biochemistry
Mass spectrometry is a powerful tool for quantifying protein abundance in complex samples. Advances in sample preparation and the development of data independent acquisition (DIA) mass spectrometry approaches have increased the number of peptides and proteins measured per sample. Here we present a series of experiments demonstrating how to assess whether a peptide measurement is quantitative by mass spectrometry. Our results demonstrate that increasing the number of detected peptides in a proteomics experiment does not necessarily result in increased numbers of peptides that can be measured quantitatively.
270 downloads bioinformatics
Single-cell RNA-Seq (scRNA-Seq) enables the systematic molecular characterization of heterogeneous tissues at an unprecedented resolution and scale. However, it is currently unclear how to establish formal cell type definitions, which impedes the systematic analysis of scRNA-Seq data across experiments and studies. To address this challenge, we have developed Moana, a hierarchical machine learning framework that enables the construction of robust cell type classifiers from heterogeneous scRNA-Seq datasets. To demonstrate Moana's capabilities, we construct cell type classifiers for human immune cells that accurately distinguish between closely related cell types in the presence of experimental perturbations and systematic differences between scRNA-Seq protocols. We show that Moana is generally applicable and scales to datasets with more than ten thousand cells, thus enabling the construction of tissue-specific cell type atlases that can be directly applied to analyze new scRNA-Seq datasets. A Python implementation of Moana can be found at https://github.com/yanailab/moana.
270 downloads molecular biology
CRISPR/Cas technologies have transformed our ability to manipulate genomes for research and gene-based therapy. In particular, homology-directed repair after genomic cleavage allows for precise modification of genes using exogenous donor sequences as templates. While both single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) forms of donors have been used as repair templates, a systematic comparison of the performance and specificity of repair using ssDNA versus dsDNA donors is still lacking. Here, we describe an optimized method for the synthesis of long ssDNA templates and demonstrate that ssDNA donors can drive efficient integration of gene-sized reporters in human cell lines. We next define a set of rules to maximize the efficiency of ssDNA-mediated knock-in by optimizing donor design. Finally, by comparing ssDNA donors with equivalent dsDNA sequences (PCR products or plasmids), we demonstrate that ssDNA templates have a unique advantage in terms of repair specificity while dsDNA donors can lead to a high rate of off-target integration. Our results provide a framework for designing high-fidelity CRISPR-based knock-in experiments, in both research and therapeutic settings.
269 downloads genomics
Single-cell RNA sequencing (scRNA-seq) can characterize cell types and states through unsupervised clustering, but the ever increasing number of cells imposes computational challenges. We present an unsupervised deep embedding algorithm for single-cell clustering (DESC) that iteratively learns cluster-specific gene expression signatures and cluster assignment. DESC significantly improves clustering accuracy across various datasets and is capable of removing complex batch effects while maintaining true biological variations.
269 downloads bioinformatics
Allen W Zhang, Ciara O'Flanagan, Elizabeth Chavez, Jamie LP Lim, Andrew McPherson, Matt Wiens, Pascale Walters, Tim Chan, Brittany Hewitson, Daniel Lai, Anja Mottok, Clementine Sarkozy, Lauren Chong, Tomohiro Aoki, Xuehai Wang, Andrew P Weng, Jessica N. McAlpine, Samuel Aparicio, Christian Steidl, Kieran R Campbell, Sohrab P Shah
Single-cell RNA sequencing (scRNA-seq) has transformed biomedical research, enabling decomposition of complex tissues into disaggregated, functionally distinct cell types. For many applications, investigators wish to identify cell types with known marker genes. Typically, such cell type assignments are performed through unsupervised clustering followed by manual annotation based on these marker genes, or via "mapping" procedures to existing data. However, the manual interpretation required in the former case scales poorly to large datasets, which are also often prone to batch effects, while existing data for purified cell types must be available for the latter. Furthermore, unsupervised clustering can be error-prone, leading to under- and over- clustering of the cell types of interest. To overcome these issues we present CellAssign, a probabilistic model that leverages prior knowledge of cell type marker genes to annotate scRNA-seq data into pre-defined and de novo cell types. CellAssign automates the process of assigning cells in a highly scalable manner across large datasets while simultaneously controlling for batch and patient effects. We demonstrate the analytical advantages of CellAssign through extensive simulations and exemplify real-world utility to profile the spatial dynamics of high-grade serous ovarian cancer and the temporal dynamics of follicular lymphoma. Our analysis reveals subclonal malignant phenotypes and points towards an evolutionary interplay between immune and cancer cell populations with cancer cells escaping immune recognition.
269 downloads genomics
Cleavage Under Targets and Release Using Nuclease (CUT&RUN) is an epigenomic profiling strategy in which antibody-targeted controlled cleavage by micrococcal nuclease releases specific protein-DNA complexes into the supernatant for paired-end DNA sequencing. As only the targeted fragments enter into solution, and the vast majority of DNA is left behind, CUT&RUN has exceptionally low background levels. CUT&RUN outperforms the most widely used Chromatin Immunoprecipitation (ChIP) protocols in resolution, signal-to-noise, and depth of sequencing required. In contrast to ChIP, CUT&RUN is free of solubility and DNA accessibility artifacts and can be used to profile insoluble chromatin and to detect long-range 3D contacts without cross-linking. Here we present an improved CUT&RUN protocol that does not require isolation of nuclei and provides high-quality data starting with only 100 cells for a histone modification and 1000 cells for a transcription factor. From cells to purified DNA, CUT&RUN requires less than a day at the lab bench.
269 downloads neuroscience
Alexandra Grubman, Gabriel Chew, John F Ouyang, Guizhi Sun, Xin Yi Choo, Catriona McLean, Rebecca Simmons, Sam Buckberry, Dulce Vargas-Landin, Jahnvi Pflueger, Ryan Lister, Owen Rackham, Enrico Petretto, Jose M Polo
Alzheimer's disease (AD) is a heterogeneous disease that is largely dependent on the complex cellular microenvironment in the brain. This complexity impedes our understanding of how individual cell types contribute to disease progression and outcome. To characterize the molecular and functional cell diversity in the human AD brain we utilized single nuclei RNA-seq in AD and control patient brains in order to map the landscape of cellular heterogeneity in AD. We detail gene expression changes at the level of cells and cell subclusters, highlighting specific cellular contributions to global gene expression patterns between control and Alzheimer's patient brains. We observed distinct cellular regulation of APOE which was repressed in oligodendrocyte progenitor cells (OPCs) and astrocyte AD subclusters, and highly enriched in a microglial AD subcluster. In addition, oligodendrocyte and microglia AD subclusters show discordant expression of APOE. Integration of transcription factor regulatory modules with downstream GWAS gene targets revealed subcluster-specific control of AD cell fate transitions. For example, this analysis uncovered that astrocyte diversity in AD was under the control of transcription factor EB (TFEB), a master regulator of lysosomal function, which initiated a regulatory cascade containing multiple AD GWAS genes. These results establish functional links between specific cellular sub-populations in AD, and provide new insights into the coordinated control of AD GWAS genes and their cell-type specific contribution to disease susceptibility. Finally, we created an interactive reference web resource which will help brain and AD researchers explore the molecular architecture of subtype and AD-specific cell identity, molecular and functional diversity at the single cell level.
269 downloads scientific communication and education
Good scientific writing is essential to career development and to the progress of science. A well-structured manuscript allows readers and reviewers to get excited about the subject matter, to understand and verify the paper's contributions, and to integrate these contributions into a broader context. However, many scientists struggle with producing high-quality manuscripts and typically get little training in paper writing. Focusing on how readers consume information, we present a set of 10 simple rules to help you get across the main idea of your paper. These rules are designed to make your paper more influential and the process of writing more efficient and pleasurable.
268 downloads immunology
The adaptive immune system is a dynamical, self-organized multiscale system that protects vertebrates from both pathogens and internal irregularities, such as tumours. For these reasons it fascinates physicists, yet the multitude of different cells, molecules and sub-systems is often also petrifying. Despite this complexity, as experiments on different scales of the adaptive immune system become more quantitative, many physicists have made both theoretical and experimental contributions that help predict the behaviour of ensembles of cells and molecules that participate in an immune response. Here we review some recent contributions with an emphasis on quantitative questions and methodologies. We also provide a more general methods section that presents some of the wide array of theoretical tools used in the field.
268 downloads systems biology
Limiting post-meal glycemic response is an important factor in reducing the risk of chronic metabolic diseases, and contributes to significant health benefits in people with elevated levels of blood sugar. In this study, we collected gut microbiome activity (i.e., metatranscriptomic) data and measured the glycemic responses of 550 adults who consumed more than 27,000 meals from omnivore or vegetarian/gluten-free diets. We demonstrate that gut microbiome activity makes a statistically significant contribution to individual variation in glycemic response, in addition to anthropometric factors and the nutritional composition of foods. We describe a predictive model (multilevel mixed-effects regression) of variation in glycemic response among individuals ingesting the same foods. We introduce functional features aggregated from microbial activity data as candidates for association with mechanisms of glycemic control. In summary, we demonstrate for the first time that metatranscriptomic activity of the gut microbiome is correlated with glycemic response among adults.
268 downloads cell biology
Transmission of the Hedgehog signal across the plasma membrane by Smoothened is proposed to be triggered by its direct interaction with cholesterol. But how is cholesterol, an abundant lipid, regulated tightly enough to control a signaling system that can cause birth defects and cancer? Using toxin-based sensors that distinguish between distinct pools of cholesterol, we find here that Smoothened activation and Hedgehog signaling are driven by a biochemically defined fraction of membrane cholesterol, termed accessible cholesterol. Increasing accessible cholesterol levels by depletion of sphingomyelin, which sequesters cholesterol in complexes, potentiates Hedgehog signaling. By inactivating the transporter-like protein Patched 1, Hedgehog ligands trigger an increase in cholesterol accessibility in the ciliary membrane, the subcellular location for Smoothened signaling. Thus, compartmentalization of Hedgehog signaling in the primary cilium may allow cholesterol accessibility to be used as a second messenger to mediate the communication between Patched 1 and Smoothened, without causing collateral effects on other cellular processes.
268 downloads bioinformatics
Some forms of mild cognitive impairment (MCI) can be the clinical precursor of severe dementia like Alzheimer's disease (AD), while other types of MCI tend to remain stable over time and do not progress to AD pathology. To choose an effective and personalized treatment for AD, we need to identify which MCI patients are at risk of developing AD and which are not. Here, we present a novel deep learning architecture, based on dual learning and an ad hoc layer for 3D separable convolutions, which aims at identifying those people with MCI who have a high likelihood of developing AD. Our deep learning procedures combine structural magnetic resonance imaging (MRI), demographic, neuropsychological, and APOe4 genotyping data as input measures. The most novel characteristics of our machine learning model compared to previous ones are as follows: 1) multi-tasking, in the sense that our deep learning model jointly learns to simultaneously predict both MCI to AD conversion, and AD vs healthy classification which facilitates the relevant feature extraction for prognostication; 2) the neural network classifier employs relatively few parameters compared to other deep learning architectures (we use ~500,000 network parameters, orders of magnitude lower than other network designs) without compromising network complexity and hence significantly limits data-overfitting; 3) both structural MRI images and warp field characteristics, which quantify the amount of volumetric change compared to the common template, were used as separate input streams to extract as much information as possible from the MRI data. All the analyses were performed on a subset of the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, for a total of n=785 participants (192 AD, 409 MCI, and 184 healthy controls (HC)).
We found that the most predictive combination of inputs included the structural MRI images and the demographic, neuropsychological, and APOe4 data, while the warp field metric added little predictive value. We achieved an area under the ROC curve (AUC) of 0.92 with a 10-fold cross-validated accuracy of 86%, a sensitivity of 87.5% and specificity of 85% in classifying MCI patients who developed AD in three years' time from those individuals showing stable MCI over the same time-period. To the best of our knowledge, this is the highest performance reported on a test set achieved in the literature using similar data. The same network provided an AUC of 1 and 100% accuracy, sensitivity and specificity when classifying NC from AD. We also demonstrated that our classification framework was robust to different co-registration templates and possibly irrelevant features / image sections. Our approach is flexible and can in principle integrate other imaging modalities, such as PET, and a more diverse group of clinical data. The convolutional framework is potentially applicable to any 3D image dataset and gives the flexibility to design a computer-aided diagnosis system targeting the prediction of any medical condition utilizing multi-modal imaging and tabular clinical data.
268 downloads genomics
Alternative splicing is central to metazoan gene regulation but the regulatory mechanisms involved are only partially understood. Here, we show that G-quadruplex (G4) motifs are enriched ~3-fold both upstream and downstream of splice junctions. Analysis of in vitro G4-seq data corroborates their formation potential. G4s display the highest enrichment at weaker splice sites, which are frequently involved in alternative splicing events. The importance of G4s in RNA as opposed to DNA is emphasized by a higher enrichment for the non-template strand. To explore if G4s are involved in dynamic alternative splicing responses, we analyzed RNA-seq data from mouse and human neuronal cells treated with potassium chloride. We find that G4s are enriched at exons which were skipped following potassium ion treatment. We validate the formation of stable G4s for three candidate splice sites by circular dichroism spectroscopy, UV-melting and fluorescence measurements. Finally, we explore G4 motifs across eleven representative species, and we observe that strong enrichment at splice sites is restricted to mammals and birds.
268 downloads bioinformatics
Single-cell sequencing provides a powerful approach for elucidating intratumor heterogeneity by resolving cell-to-cell variability. However, it also poses additional challenges including elevated error rates, allelic dropout and non-uniform coverage. A recently introduced single-cell-specific mutation detection algorithm leverages the evolutionary relationship between cells for denoising the data. However, due to its probabilistic nature, this method does not scale well with the number of cells. Here, we develop a novel combinatorial approach for utilizing the genealogical relationship of cells in detecting mutations from noisy single-cell sequencing data. Our method, called scVILP, jointly detects mutations in individual cells and reconstructs a perfect phylogeny among these cells. We employ a novel Integer Linear Program algorithm for deterministically and efficiently solving the joint inference problem. We show that scVILP achieves similar or better accuracy but significantly better runtime over existing methods on simulated data. We also applied scVILP to an empirical human cancer dataset from a high grade serous ovarian cancer patient.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!