Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 73,584 bioRxiv papers from 320,262 authors.
Most downloaded bioRxiv papers, all time
72,396 results found.
4,703 downloads genetics
Philip R Jansen, Kyoko Watanabe, Sven Stringer, Nathan Skene, Julien Bryois, Anke R Hammerschlag, Christiaan A de Leeuw, Jeroen Benjamins, Ana B Muñoz-Manchado, Mats Nagel, Jeanne E Savage, Henning Tiemeier, Tonya White, The 23andMe Research Team, Joyce Y Tung, David Hinds, Vladimir Vacic, Patrick F Sullivan, Sophie van der Sluis, Tinca J.C. Polderman, August B Smit, Jens Hjerling-Leffler, Eus J.W. Van Someren, Danielle Posthuma
Insomnia is the second-most prevalent mental disorder, with no sufficient treatment available. Despite a substantial role of genetic factors, only a handful of genes have been implicated and insight into the associated neurobiological pathways remains limited. Here, we use an unprecedentedly large genetic association sample (N=1,331,010) to allow detection of a substantial number of genetic variants and gain insight into biological functions, cell types and tissues involved in insomnia. We identify 202 genome-wide significant loci implicating 956 genes through positional, eQTL and chromatin interaction mapping. We show involvement of the axonal part of neurons, of specific cortical and subcortical tissues, and of two specific cell-types in insomnia: striatal medium spiny neurons and hypothalamic neurons. These cell-types have been implicated previously in the regulation of reward processing, sleep and arousal in animal studies, but have never been genetically linked to insomnia in humans. We found weak genetic correlations with other sleep-related traits, but strong genetic correlations with psychiatric and metabolic traits. Mendelian randomization identified causal effects of insomnia on specific psychiatric and metabolic traits. Our findings reveal key brain areas and cells implicated in the neurobiology of insomnia and its related disorders, and provide novel targets for treatment.
4,696 downloads plant biology
Sreya Ghosh, Amy Watson, Oscar E. Gonzalez-Navarro, Ricardo H. Ramirez-Gonzalez, Luis Yanes, Marcela Mendoza-Suárez, James Simmonds, Rachel Wells, Tracey Rayner, Phon Green, Amber Hafeez, Sadiye Hayta, Rachel E. Melton, Andrew Steed, Abhimanyu Sarkar, Jeremy Carter, Lionel Perkins, John Lord, Mark Tester, Anne Osbourn, Matthew J. Moscou, P. Nicholson, Wendy A. Harwood, Cathie Martin, Claire Domoney, Cristobal Uauy, Brittany Hazard, Brande B. H. Wulff, Lee T. Hickey
To meet the challenge of feeding a growing population, breeders and scientists are continuously looking for ways to increase genetic gain in crop breeding. One way this can be achieved is through 'speed breeding' (SB), which shortens the breeding cycle and accelerates research studies through rapid generation advancement. The SB method can be carried out in a number of ways, one of which involves extending the duration of a plant's daily exposure to light (photoperiod) combined with early seed harvest in order to cycle quickly from seed to seed, thereby reducing the generation times for some long-day (LD) or day-neutral crops. Here we present glasshouse and growth chamber-based SB protocols with supporting data from experimentation with several crop species. These protocols describe the growing conditions, including soil media composition, lighting, temperature and spacing, which promote rapid growth of spring and winter bread wheat, durum wheat, barley, oat, various members of the Brassica family, chickpea, pea, grasspea, quinoa and the model grass Brachypodium distachyon. Points of flexibility within the protocols are highlighted, including how plant density can be increased to efficiently scale-up plant numbers for single seed descent (SSD) purposes. Conversely, instructions on how to perform SB on a small-scale by creating a benchtop SB growth cabinet that enables optimization of parameters at a low cost are provided. We also outline the procedure for harvesting and germinating premature wheat, barley and pea seed to reduce generation time. Finally, we provide troubleshooting suggestions to avoid potential pitfalls.
4,696 downloads genomics
Rachel E Gate, Christine S Cheng, Aviva P. Aiden, Atsede Siba, Marcin Tabaka, Dmytro Lituiev, Ido Machol, M. Grace Gordon, Meena Subramaniam, Muhammad Shamim, Kendrick L Hougen, Ivo Wortman, Su-Chen Huang, Neva C. Durand, Ting Feng, Philip L. De Jager, Howard Y. Chang, Erez Lieberman Aiden, Christophe Benoist, Michael A. Beer, Chun J Ye, Aviv Regev
Over 90% of genetic variants associated with complex human traits map to non-coding regions, but little is understood about how they modulate gene regulation in health and disease. One possible mechanism is that genetic variants affect the activity of one or more cis-regulatory elements leading to gene expression variation in specific cell types. To identify such cases, we analyzed Assay for Transposase-Accessible Chromatin sequencing (ATAC-seq) and RNA-seq profiles from activated CD4+ T cells of up to 105 healthy donors. We found that regions of accessible chromatin (ATAC-peaks) are co-accessible at kilobase and megabase resolution, in patterns consistent with the 3D organization of chromosomes measured by in situ Hi-C in T cells. 15% of genetic variants located within ATAC-peaks affected the accessibility of the corresponding peak through disrupting binding sites for transcription factors important for T cell differentiation and activation. These ATAC quantitative trait nucleotides (ATAC-QTNs) have the largest effects on co-accessible peaks, are associated with gene expression from the same aliquot of cells, rarely affect core binding motifs, and are enriched for autoimmune disease variants. Our results provide insights into how natural genetic variants modulate cis-regulatory elements, in isolation or in concert, to influence gene expression in primary immune cells that play a key role in many human diseases.
4,687 downloads genetics
Jeanne E Savage, Philip R Jansen, Sven Stringer, Kyoko Watanabe, Julien Bryois, Christiaan A de Leeuw, Mats Nagel, Swapnil Awasthi, Peter B. Bar, Jonathan R.I. Coleman, Katrina L. Grasby, Anke R Hammerschlag, Jakob Kaminski, Robert Karlsson, Eva Krapohl, Max Lam, Marianne Nygaard, Chandra A. Reynolds, Joey W. Trampush, Hannah Young, Delilah Zabaneh, Sara Hägg, Narelle K. Hansell, Ida K. Karlsson, Sten Linnarsson, Grant W. Montgomery, Ana B Muñoz-Manchado, Erin B. Quinlan, Gunter Schumann, Nathan Skene, Bradley T. Webb, Tonya White, Dan E. Arking, Deborah K. Attix, Dimitrios Avramopoulos, Robert M. Bilder, Panos Bitsios, Katherine E. Burdick, Tyrone D. Cannon, Ornit Chiba-Falek, Andrea Christoforou, Elizabeth T. Cirulli, Eliza Congdon, Aiden Corvin, Gail Davies, Ian J. Deary, Pamela DeRosse, Dwight Dickinson, Srdjan Djurovic, Gary Donohoe, Emily Drabant Conley, Johan G. Eriksson, Thomas Espeseth, Nelson A. Freimer, Stella Giakoumaki, Ina Giegling, Michael Gill, David C. Glahn, Ahmad R Hariri, Alex Hatzimanolis, Matthew C. Keller, Emma Knowles, Bettina Konte, Jari Lahti, Stephanie Le Hellard, Todd Lencz, David C Liewald, Edythe London, A.J. Lundervold, Anil K. Malhotra, Ingrid Melle, Derek Morris, Anna C. Need, William Ollier, Aarno Palotie, Antony Payton, Neil Pendleton, Russell A. Poldrack, Katri Räikkönen, Ivar Reinvang, Panos Roussos, Dan Rujescu, Fred W. Sabb, Matthew A. Scult, Olav B. Smeland, Nikolaos Smyrnis, John M. Starr, Vidar M. Steen, Nikos C. Stefanis, Richard E Straub, Kjetil Sundet, Aristotle N. Voineskos, Daniel R Weinberger, Elisabeth Widen, Jin Yu, Goncalo Abecasis, Ole A. Andreassen, Gerome Breen, Lene Christiansen, Birgit Debrabant, Danielle M. Dick, Andreas Heinz, Jens Hjerling Leffler, M. Arfan Ikram, Kenneth S Kendler, Nicholas G. Martin, Sarah E. Medland, Nancy L. Pedersen, R. Plomin, Tinca J.C. Polderman, Alkes L. Price, Sophie van der Sluis, Patrick F Sullivan, Henning Tiemeier, Scott I. Vrieze, Margaret J Wright, Danielle Posthuma
Intelligence is highly heritable and a major determinant of human health and well-being. Recent genome-wide meta-analyses have identified 24 genomic loci linked to intelligence, but much about its genetic underpinnings remains to be discovered. Here, we present the largest genetic association study of intelligence to date (N=279,930), identifying 206 genomic loci (191 novel) and implicating 1,041 genes (963 novel) via positional mapping, expression quantitative trait locus (eQTL) mapping, chromatin interaction mapping, and gene-based association analysis. We find enrichment of genetic effects in conserved and coding regions and identify 89 nonsynonymous exonic variants. Associated genes are strongly expressed in the brain and specifically in striatal medium spiny neurons and cortical and hippocampal pyramidal neurons. Gene-set analyses implicate pathways related to neurogenesis, neuron differentiation and synaptic structure. We confirm previous strong genetic correlations with several neuropsychiatric disorders, and Mendelian Randomization results suggest protective effects of intelligence for Alzheimer's dementia and ADHD, and bidirectional causation with strong pleiotropy for schizophrenia. These results are a major step forward in understanding the neurobiology of intelligence as well as genetically associated neuropsychiatric traits.
4,681 downloads genomics
The CCCTC-binding factor (CTCF) is widely regarded as a key player in chromosome organization in mammalian cells, yet direct assessment of the impact of loss of CTCF on genome architecture has been difficult due to its essential role in cell proliferation and early embryogenesis. Here, using auxin-inducible degron techniques to acutely deplete CTCF in mouse embryonic stem cells, we show that cell growth is severely slowed yet chromatin organization remains largely intact after loss of CTCF. Depletion of CTCF reduces interactions between chromatin loop anchors, diminishes occupancy of cohesin complex genome-wide, and slightly weakens topologically associating domain (TAD) structure, but the active and inactive chromatin compartments are maintained and the vast majority of TAD boundaries persist. Furthermore, transcriptional regulation and histone marks associated with enhancers are broadly unchanged upon CTCF depletion. Our results suggest CTCF-independent mechanisms in maintenance of chromatin organization.
4,680 downloads genetics
The debate over the ethnogenesis of Ashkenazi Jewry is longstanding, and has been hampered by a lack of Jewish historiographical work between the Biblical and the early Modern eras. Most historians, as well as geneticists, situate them as the descendants of Israelite tribes whose presence in Europe is owed to deportations during the Roman conquest of Palestine, as well as migration from Babylonia, and eventual settlement along the Rhine. By contrast, a few historians and other writers, most famously Arthur Koestler, have looked to migrations following the decline of the little-understood Medieval Jewish kingdom of Khazaria as the main source for Ashkenazi Jewry. A recent study of genetic variation in southeastern European populations (Elhaik 2012) also proposed a Khazarian origin for Ashkenazi Jews, eliciting considerable criticism from other scholars investigating Jewish ancestry who favor a Near Eastern origin of Ashkenazi populations. This paper re-examines the genetic data and analytical approaches used in these studies of Jewish ancestry, and situates them in the context of historical, linguistic, and archaeological evidence from the Caucasus, Europe and the Near East. Based on this reanalysis, it appears not only that the Khazar Hypothesis per se is without serious merit, but also that the veracity of the Rhineland Hypothesis may be questionable.
4,680 downloads genomics
Woolly mammoths and the living elephants are characterized by major phenotypic differences that allowed them to live in very different environments. To identify the genetic changes that underlie the suite of adaptations in woolly mammoths to life in extreme cold, we sequenced the nuclear genome from three Asian elephants and two woolly mammoths, and identified and functionally annotated genetic changes unique to the woolly mammoth lineage. We find that genes with mammoth-specific amino acid changes are enriched in functions related to circadian biology, skin and hair development and physiology, lipid metabolism, adipose development and physiology, and temperature sensation. Finally, we resurrect and functionally test the mammoth and ancestral elephant TRPV3 gene, which encodes a temperature-sensitive transient receptor potential (thermoTRP) channel involved in thermal sensation and hair growth, and show that a single mammoth-specific amino acid substitution in an otherwise highly conserved region of the TRPV3 channel strongly affected its temperature sensitivity. Our results have identified a set of genetic changes that likely played important roles in the adaptation of woolly mammoths to life in the high Arctic.
4,677 downloads biochemistry
The RNA-guided CRISPR-Cas9 nuclease from Streptococcus pyogenes (SpCas9) has been widely repurposed for genome editing. High-fidelity (SpCas9-HF1) and enhanced specificity (eSpCas9(1.1)) variants exhibit substantially reduced off-target cleavage in human cells, but the mechanism of target discrimination and the potential to further improve fidelity were unknown. Using single-molecule Förster resonance energy transfer (smFRET) experiments, we show that both SpCas9-HF1 and eSpCas9(1.1) are trapped in an inactive state when bound to mismatched targets. We find that a non-catalytic domain within Cas9, REC3, recognizes target mismatches and governs the HNH nuclease to regulate overall catalytic competence. Exploiting this observation, we identified residues within REC3 involved in mismatch sensing and designed a new hyper-accurate Cas9 variant (HypaCas9) that retains robust on-target activity in human cells. These results offer a more comprehensive model to rationalize and modify the balance between target recognition and nuclease activation for precision genome editing.
4,671 downloads cell biology
Small molecule fluorophores are important tools for advanced imaging experiments. The development of self-labeling protein tags such as the HaloTag and SNAP-tag has expanded the utility of chemical dyes in live-cell microscopy. We recently described a general method for improving the brightness and photostability of small, cell-permeable fluorophores, resulting in the azetidine-containing 'Janelia Fluor' (JF) dyes. Here, we refine and extend the utility of the JF dyes by synthesizing photoactivatable derivatives that are compatible with established live cell labeling strategies. These compounds retain the superior brightness of the JF dyes but their facile photoactivation enables improved single-particle tracking and localization microscopy experiments.
4,665 downloads bioinformatics
As single-cell transcriptomics becomes a mainstream technology, the natural next step is to integrate the accumulating data in order to achieve a common ontology of cell types and states. However, owing to various nuisance factors of variation, it is not straightforward how to compare gene expression levels across data sets and how to automatically assign cell type labels in a new data set based on existing annotations. In this manuscript, we demonstrate that our previously developed method, scVI, provides an effective and fully probabilistic approach for joint representation and analysis of cohorts of single-cell RNA-seq data sets, while accounting for uncertainty caused by biological and measurement noise. We also introduce single-cell ANnotation using Variational Inference (scANVI), a semi-supervised variant of scVI designed to leverage any available cell state annotations --- for instance when only one data set in a cohort is annotated, or when only a few cells in a single data set can be labeled using marker genes. We demonstrate that scVI and scANVI compare favorably to the existing methods for data integration and cell state annotation in terms of accuracy, scalability, and adaptability to challenging settings such as a hierarchical structure of cell state labels. We further show that different from existing methods, scVI and scANVI represent the integrated datasets with a single generative model that can be directly used for any probabilistic decision making task, using differential expression as our case study. scVI and scANVI are available as open source software and can be readily used to facilitate cell state annotation and help ensure consistency and reproducibility across studies.
4,662 downloads neuroscience
We propose a novel approach based on modern deep artificial neural networks (DNNs) for understanding how the morpho-electrical complexity of neurons shapes their input/output (I/O) properties at the millisecond resolution in response to massive synaptic input. The I/O of an integrate-and-fire point neuron is accurately captured by a DNN with a single unit and one hidden layer. A fully connected DNN with one hidden layer faithfully replicated the I/O relationship of a detailed model of Layer 5 cortical pyramidal cell (L5PC) receiving AMPA and GABAA synapses. However, when adding voltage-gated NMDA-conductances, a temporally-convolutional DNN with seven layers was required. Analysis of the DNN filters provides new insights into dendritic processing shaping the I/O properties of neurons. This work proposes a systematic approach for characterizing the functional "depth" of a biological neuron, suggesting that cortical pyramidal neurons and the networks they form are computationally much more powerful than previously assumed.
4,659 downloads plant biology
Because global climate change has made agricultural supply unstable, plant factories are expected to be a safe and stable means of food production. As the light source of a plant factory or controlled greenhouse, the light emitting diode (LED) is expected to solve cost problems and promote plant growth efficiently. In this study, we examined the light condition created by using monochromatic red and blue LEDs, to provide both simultaneous and alternating irradiation to leaf lettuce. The result was that simultaneous red and blue irradiation promoted plant growth more effectively than monochromatic and fluorescent light irradiation. Moreover, alternating red and blue light accelerated plant growth significantly even when the total light intensity per day was the same as with simultaneous irradiation. The fresh weight in alternating irradiation was almost two times higher than with fluorescent light and about 1.6 times higher than with simultaneous irradiation. The growth-promoting effect of alternating irradiation of red and blue light was observed in different cultivars. From the results of experiments, we offer a novel plant growth method named "Shigyo Method", the core concept of which is the alternating irradiation of red and blue light.
4,658 downloads genomics
A. Raveane, S. Aneli, F. Montinaro, G. Athanasiadis, S. Barlera, G. Birolo, G. Boncoraglio, AM. Di Blasio, C. Di Gaetano, L. Pagani, S. Parolo, P. Paschou, A. Piazza, G. Stamatoyannopoulos, A. Angius, N. Brucato, F. Cucca, G. Hellenthal, A. Mulas, M. Peyret-Guzzon, M. Zoledziewska, A. Baali, C. Bycroft, M. Cherkaoui, C. Dina, JM. Dugoujon, P. Galan, J. Giemza, T. Kivisild, M. Melhaoui, M. Metspalu, S. Myers, LM. Pereira, FX. Ricaut, F. Brisighelli, I. Cardinali, V. Grugni, H. Lancioni, V.L. Pascali, A. Torroni, O. Semino, G. Matullo, A. Achilli, A. Olivieri, C. Capelli
European populations display low genetic diversity as the result of long-term blending of a small number of ancient founding ancestries. However, it is still unclear how the combination of ancient ancestries related to early European foragers, Neolithic farmers and Bronze Age nomadic pastoralists can fully explain genetic variation across Europe. Populations in natural crossroads like the Italian peninsula are expected to recapitulate the overall continental diversity, but to date have been systematically understudied. Here we characterised the ancestry profiles of modern-day Italian populations using a genome-wide dataset representative of modern and ancient samples from across Italy, Europe and the rest of the world. Italian genomes captured several ancient signatures, including a non-steppe related substantial ancestry contribution ultimately from the Caucasus. Differences in ancestry composition as the result of migration and admixture generated in Italy the largest degree of population structure detected so far in the continent and shaped the amount of Neanderthal DNA present in modern-day populations.
4,657 downloads bioinformatics
The recent rapid spread of single cell RNA sequencing (scRNA-seq) methods has created a large variety of experimental and computational pipelines for which best practices have not yet been established. Here, we use simulations based on five scRNA-seq library protocols in combination with nine realistic differential expression (DE) setups to systematically evaluate three mapping, four imputation, seven normalisation and four differential expression testing approaches resulting in ∼3,000 pipelines, allowing us to also assess interactions among pipeline steps. We find that choices of normalisation and library preparation protocols have the biggest impact on scRNA-seq analyses. Specifically, we find that library preparation determines the ability to detect symmetric expression differences, while normalisation dominates pipeline performance in asymmetric DE-setups. Finally, we illustrate the importance of informed choices by showing that a good scRNA-seq pipeline can have the same impact on detecting a biological signal as quadrupling the sample size.
4,652 downloads bioinformatics
Hyun Min Kang, Meena Subramaniam, Sasha Targ, Michelle Nguyen, Lenka Maliskova, Eunice Wan, Simon Wong, Lauren Byrnes, Cristina Lanata, Rachel Gate, Sara Mostafavi, Alexander Marson, Noah Zaitlen, Lindsey A. Criswell, Jimmie Ye
Droplet-based single-cell RNA-sequencing (dscRNA-seq) has enabled rapid, massively parallel profiling of transcriptomes from tens of thousands of cells. Multiplexing samples for single cell capture and library preparation in dscRNA-seq would enable cost-effective designs of differential expression and genetic studies while avoiding technical batch effects, but its implementation remains challenging. Here, we introduce an in-silico algorithm demuxlet that harnesses natural genetic variation to discover the sample identity of each cell and identify droplets containing two cells. These capabilities enable multiplexed dscRNA-seq experiments where cells from unrelated individuals are pooled and captured at higher throughput than standard workflows. To demonstrate the performance of demuxlet, we sequenced 3 pools of peripheral blood mononuclear cells (PBMCs) from 8 lupus patients. Given genotyping data for each individual, demuxlet correctly recovered the sample identity of > 99% of singlets, and identified doublets at rates consistent with previous estimates. In PBMCs, we demonstrate the utility of multiplexed dscRNA-seq in two applications: characterizing cell type specificity and inter-individual variability of cytokine response from 8 lupus patients and mapping genetic variants associated with cell type specific gene expression from 23 donors. Demuxlet is fast, accurate, scalable and could be extended to other single cell datasets that incorporate natural or synthetic DNA barcodes.
4,648 downloads genomics
In criminal and civil investigations, postmortem interval is used as evidence to help sort out circumstances at the time of human death. Many biological, chemical, and physical indicators can be used to determine the postmortem interval, but most are not accurate. Here, we sought to validate an experimental design to accurately predict the time of death by analyzing the expression of hundreds of upregulated genes in two model organisms, the zebrafish and mouse. In a previous study, the death of healthy adults was conducted under strictly controlled conditions to minimize the effects of confounding factors such as lifestyle and temperature. A total of 74,179 microarray probes were calibrated using the Gene Meter approach and the transcriptional profiles of 1,063 significantly upregulated genes were assembled into a time series spanning from life to 48 or 96 h postmortem. In this study, the experimental design involved splitting the gene profiles into training and testing datasets, randomly selecting groups of profiles, determining the modeling parameters of the genes to postmortem time using over- and/or perfectly-defined linear regression analyses, and calculating the fit (R²) and slope of predicted versus actual postmortem times. This design was repeated several thousand to several million times to find the top predictive groups of gene transcription profiles. A group of eleven zebrafish genes yielded an R² of 1 and a slope of 0.99, while a group of seven mouse liver genes yielded an R² of 0.98 and a slope of 0.97, and seven mouse brain genes yielded an R² of 0.93 and a slope of 0.85. In all cases, groups of gene transcripts yielded better postmortem time predictions than individual gene transcripts. The significance of this study is two-fold: selected groups of upregulated genes provide accurate prediction of postmortem time, and the successfully validated experimental design can now be used to accurately predict postmortem time in cadavers.
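The evaluation step described in this abstract (fit on a training set, predict on a held-out set, then report the fit and slope of predicted versus actual postmortem times) can be illustrated with a minimal sketch. This is not the authors' code: it assumes a single hypothetical gene as the predictor rather than a group of genes, uses made-up and perfectly linear toy data, and the function names `fit_line`, `r_squared` and `evaluate` are invented for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys):
    """Squared Pearson correlation between xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


def evaluate(train, test):
    """train/test: lists of (expression, hours_postmortem) pairs.

    Fits hours ~ expression on the training pairs, predicts the test
    hours, and returns (R^2, slope) of predicted versus actual hours.
    """
    slope, intercept = fit_line([e for e, _ in train], [h for _, h in train])
    actual = [h for _, h in test]
    predicted = [slope * e + intercept for e, _ in test]
    pred_slope, _ = fit_line(actual, predicted)
    return r_squared(actual, predicted), pred_slope


# Hypothetical toy data in which expression = 2 * hours + 1 exactly.
train = [(1.0, 0.0), (13.0, 6.0), (25.0, 12.0), (49.0, 24.0)]
test = [(7.0, 3.0), (37.0, 18.0), (97.0, 48.0)]
r2, slope = evaluate(train, test)
```

Because the toy data are noise-free, both the R² and the slope come out at 1; with real expression profiles, values below 1 would indicate how far the predictions drift from the actual times.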
4,648 downloads neuroscience
Michael N Economo, Sarada Viswanathan, Bosiljka Tasic, Erhan Bas, Johan Winnubst, Vilas Menon, Lucas T Graybuck, Thuc Nghi Nguyen, Lihua Wang, Charles R. Gerfen, Jayaram Chandrashekar, Hongkui Zeng, Loren L. Looger, Karel Svoboda
Activity in motor cortex predicts specific movements, seconds before they are initiated. This preparatory activity has been observed in L5 descending "pyramidal tract" (PT) neurons. A key question is how preparatory activity can be maintained without causing movement, and how preparatory activity is eventually converted to a motor command to trigger appropriate movements. We used single cell transcriptional profiling and axonal reconstructions to identify two types of PT neuron. Both types share projections to multiple targets in the basal ganglia and brainstem. One type projects to thalamic regions that connect back to motor cortex. In a delayed-response task, these neurons produced early preparatory activity that persisted until the movement. The second type projects to motor centers in the medulla and produced late preparatory activity and motor commands. These results indicate that two motor cortex output neurons are specialized for distinct roles in motor control.
4,640 downloads neuroscience
When a neuron is driven beyond its threshold it spikes, and the fact that it does not communicate its continuous membrane potential is usually seen as a computational liability. Here we show that this spiking mechanism allows neurons to produce an unbiased estimate of their causal influence, and a way of approximating gradient descent learning. Importantly, neither activity of upstream neurons, which act as confounders, nor downstream non-linearities bias the results. By introducing a local discontinuity with respect to their input drive, we show how spiking enables neurons to solve causal estimation and learning problems.
4,638 downloads bioinformatics
The CRISPR/Cas9 system provides unprecedented genome editing capabilities; however, several facets of this system are under investigation for further characterization and optimization, including the choice of guide RNA that directs Cas9 to target DNA. In particular, given that one would like to target the protein-coding region of a gene, hundreds of guides satisfy the basic constraints of the CRISPR/Cas9 Protospacer Adjacent Motif sequence (PAM); however, not all of these guides actually generate gene knockouts with equal efficiency. Leveraging a broad set of experimental measurements of guide knockout efficiency, we introduce a state-of-the-art in silico modeling approach to identify guides that will lead to more effective gene knockout. We first investigated which guide and gene features are critical for prediction (e.g., single- and di-nucleotide identity of the gene target), which are helpful (e.g., thermodynamics), and which are predictive but redundant (e.g., microhomology). We also investigated evaluation measures for comparing predictive models in the present context, suggesting that Area Under the Receiver Operating Curve is not ideal. Finally, we explored a variety of different model classes and found that use of gradient-boosted regression trees produced the best predictive performance. Pointers to our open-source software, code, and prediction server will be available at http://research.microsoft.com/en-us/projects/azimuth.
4,637 downloads cancer biology
Cancer is not solely a disease of the genome, but is a systemic disease that affects the host on many functional levels, including, and perhaps most notably, the function of the immune response, resulting in both tumor-promoting inflammation and tumor-inhibiting cytotoxic action. The dichotomous actions of the immune response induce significant variations in tumor growth dynamics that mathematical modeling can help to understand. Here we present a general method using ordinary differential equations (ODEs) to model and analyze cancer-immune interactions, and in particular, immune-induced tumor dormancy.
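To make the idea of an ODE model of cancer-immune interaction concrete, here is a minimal sketch. The abstract does not give the equations, so everything below is an assumption chosen for illustration: logistic tumor growth, a mass-action killing term, a constant immune source with tumor-stimulated recruitment and linear decay, and arbitrary parameter values. The names `derivatives`, `simulate` and the parameter keys are hypothetical, and the integrator is a plain fourth-order Runge-Kutta step.

```python
def derivatives(tumor, immune, p):
    """Right-hand side of an assumed two-variable tumor-immune model."""
    growth = p["r"] * tumor * (1.0 - tumor / p["K"])        # logistic growth
    kill = p["k"] * tumor * immune                          # immune killing
    recruit = p["rho"] * tumor * immune / (p["g"] + tumor)  # recruitment
    d_tumor = growth - kill
    d_immune = p["s"] + recruit - p["d"] * immune
    return d_tumor, d_immune


def simulate(tumor, immune, p, t_end=50.0, dt=0.01):
    """Integrate the model with fixed-step fourth-order Runge-Kutta."""
    for _ in range(int(round(t_end / dt))):
        k1 = derivatives(tumor, immune, p)
        k2 = derivatives(tumor + 0.5 * dt * k1[0], immune + 0.5 * dt * k1[1], p)
        k3 = derivatives(tumor + 0.5 * dt * k2[0], immune + 0.5 * dt * k2[1], p)
        k4 = derivatives(tumor + dt * k3[0], immune + dt * k3[1], p)
        tumor += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        immune += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return tumor, immune


# Arbitrary illustrative parameters, not taken from the paper.
params = dict(r=1.0, K=100.0, k=0.5, rho=0.05, g=10.0, s=0.1, d=0.1)
no_kill = dict(params, k=0.0)  # same model with immune killing switched off

free_tumor, _ = simulate(1.0, 1.0, no_kill)
held_tumor, _ = simulate(1.0, 1.0, params)
```

In this toy setting the tumor without immune killing grows to its carrying capacity `K`, while the tumor under immune pressure stays well below it, which is the qualitative behavior that "immune-induced dormancy" refers to. Whether a real model shows dormancy, escape or elimination depends on the equations and parameters actually used.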
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!