Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 57,349 bioRxiv papers from 264,093 authors.
Most downloaded bioRxiv papers, since beginning of last month
55,922 results found.
379 downloads cancer biology
Russell C. Rockne, Sergio Branciamore, Jing Qi, David Frankhouser, Denis O'Meally, Wei-Kai Hua, Guerry J. Cook, Lianjun Zhang, Emily Carnahan, Ayelet Marom, Herman Wu, Davide Maestrini, Xiwei Wu, Yate-Ching Yuan, Zheng Liu, Leo D. Wang, Stephen Forman, Nadia Carlesso, Ya-Huei Kuo, Guido Marcucci
Temporal dynamics of gene expression are informative of changes associated with disease development and evolution. Given the complexity of high-dimensional temporal datasets, an analytical framework guided by a robust theory is needed to interpret time-sequential changes and to predict system dynamics. Herein, we use acute myeloid leukemia as a proof-of-principle to model gene expression dynamics in a transcriptome state-space constructed based on time-sequential RNA-sequencing data. We describe the construction of a state-transition model that identifies state-transition critical points and accurately predicts leukemia development. We show that an analytical approach based on state-transition critical points identified step-wise transcriptomic perturbations driving leukemia progression. Furthermore, the gene trajectories and the geometry of the transcriptome state-space provide biologically relevant gene expression signals that are not synchronized in time, and allow quantification of gene contributions to leukemia development. Therefore, our state-transition model can synthesize information, identify critical points to guide interpretation of transcriptome trajectories, and predict disease development.
379 downloads neuroscience
Rebecca D Hodge, Trygve E Bakken, Jeremy A Miller, Kimberly A Smith, Eliza R Barkan, Lucas T Graybuck, Jennie L Close, Brian Long, Osnat Penn, Zizhen Yao, Jeroen Eggermont, Thomas Hollt, Boaz P Levi, Soraya I Shehata, Brian Aevermann, Allison Beller, Darren Bertagnolli, Krissy Brouner, Tamara Casper, Charles Cobbs, Rachel Dalley, Nick Dee, Song-Lin Ding, Richard G Ellenbogen, Olivia Fong, Emma Garren, Jeff Goldy, Ryder P Gwinn, Daniel Hirschstein, C Dirk Keene, Mohamed Keshk, Andrew L Ko, Kanan Lathia, Ahmed Mahfouz, Zoe Maltzer, Medea McGraw, Thuc Nghi Nguyen, Julie Nyhus, Jeffrey G Ojemann, Aaron Oldre, Sheana Parry, Shannon Reynolds, Christine Rimorin, Nadiya V Shapovalova, Saroja Somasundaram, Aaron Szafer, Elliot R Thomsen, Michael Tieu, Richard H Scheuermann, Rafael Yuste, Susan M Sunkin, Boudewijn Lelieveldt, David Feng, Lydia Ng, Amy Bernard, Michael Hawrylycz, John Phillips, Bosiljka Tasic, Hongkui Zeng, Allan R Jones, Christof Koch, Ed S Lein
Elucidating the cellular architecture of the human neocortex is central to understanding our cognitive abilities and susceptibility to disease. Here we applied single nucleus RNA-sequencing to perform a comprehensive analysis of cell types in the middle temporal gyrus of human cerebral cortex. We identify a highly diverse set of excitatory and inhibitory neuronal types that are mostly sparse, with excitatory types being less layer-restricted than expected. Comparison to a similar mouse cortex single cell RNA-sequencing dataset revealed a surprisingly well-conserved cellular architecture that enables matching of homologous types and predictions of human cell type properties. Despite this general conservation, we also find extensive differences between homologous human and mouse cell types, including dramatic alterations in proportions, laminar distributions, gene expression, and morphology. These species-specific features emphasize the importance of directly studying human brain.
379 downloads genomics
Large-scale genetic screens play a key role in the systematic discovery of genes underlying cellular phenotypes. Pooling of genetic perturbations greatly increases screening throughput, but has so far been limited to screens of enrichments defined by cell fitness and flow cytometry, or to comparatively low-throughput single cell gene expression profiles. Although microscopy is a rich source of spatial and temporal information about mammalian cells, high-content imaging screens have been restricted to much less efficient arrayed formats. Here, we introduce an optical method to link perturbations and their phenotypic outcomes at the single-cell level in a pooled setting. Barcoded perturbations are read out by targeted in situ sequencing following image-based phenotyping. We apply this technology to screen a focused set of 952 genes across >3 million cells for involvement in NF-κB activation by imaging the translocation of RelA (p65) to the nucleus, recovering 20 known pathway components and 3 novel candidate positive regulators of IL-1β and TNFα-stimulated immune responses.
378 downloads microbiology
The probiotic candidate L. reuteri was selected for in vivo experiments based on its relatively higher gastrointestinal tolerance and moderate adhesiveness. In the in vivo experiments, a significantly higher level of IL-12 was found in the low-dose group in both females and males. Higher levels of T-lymphocytes were also observed in females compared to the control group; males, however, displayed a reduction, except for CD8-positive cells in the ileum. In comparison to the control group, the relative abundance of phylotypes in the phyla Bacteroidetes (genera Bacteroides, Prevotella) and Firmicutes (genus Clostridium IV) exhibited a reverse shift between sexes after L. reuteri intervention. Meanwhile, the relative abundance of several taxa (Acetobacteroides, Lactobacillus, Bacillus) also differed markedly between sexes in the low-dose group, together with microbiota diversity, as indicated by the Shannon index.
378 downloads cell biology
Genome stability relies on proper coordination of mitosis and cytokinesis, where dynamic microtubules capture and faithfully segregate chromosomes into daughter cells. The role of long noncoding RNAs (lncRNAs) in controlling these processes however remains largely unexplored. To identify lncRNAs with mitotic functions, we performed a high-content RNAi imaging screen targeting more than 2,000 human lncRNAs. By investigating major hallmarks of cell division such as chromosome segregation, mitotic duration and cytokinesis, we discovered numerous lncRNAs with functions in each of these processes. The chromatin-associated lncRNA, linc00899, was selected for in-depth studies due to the robust mitotic delay observed upon its depletion. Transcriptome analysis of linc00899-depleted cells together with gain-of-function and rescue experiments across multiple cell types identified the neuronal microtubule-binding protein, TPPP/p25, as a target of linc00899. Linc00899 binds the genomic locus of TPPP/p25 and suppresses its transcription through a cis-acting mechanism. In cells depleted of linc00899, the consequent upregulation of TPPP/p25 alters microtubule dynamics and is necessary and sufficient to delay mitosis. Overall, our comprehensive screen identified several lncRNAs with roles in genome stability and revealed a new lncRNA that controls microtubule behaviour with functional implications beyond cell division.
378 downloads genetics
F. Kyle Satterstrom, Jack A. Kosmicki, Jiebiao Wang, Michael S. Breen, Silvia De Rubeis, Joon-Yong An, Minshi Peng, Ryan Lewis Collins, Jakob Grove, Lambertus Klei, Christine Stevens, Jennifer Reichert, Maureen Mulhern, Mykyta Artomov, Sherif Gerges, Brooke Sheppard, Xinyi Xu, Aparna Bhaduri, Utku Norman, Harrison Brand, Grace Schwartz, Rachel Nguyen, Elizabeth Guerrero, Caroline Dias, Branko Aleksic, Richard Anney, Mafalda Barbosa, Somer Bishop, Alfredo Brusco, Jonas Bybjerg-Grauholm, Angel Carracedo, Marcus C. Y. Chan, Andreas Chiocchetti, Brian Chung, Hilary Coon, Michael Cuccaro, Aurora Curró, Bernardo Dalla Bernardina, Ryan Doan, Enrico Domenici, Shan Dong, Chiara Fallerini, Montserrat Fernández-Prieto, Giovanni Battista Ferrero, Christine M. Freitag, Menachem Fromer, J. Jay Gargus, Daniel Geschwind, Elisa Giorgio, Javier González-Peñas, Stephen Guter, Danielle Halpern, Emily Hassen-Kiss, Xin He, Gail Herman, Irva Hertz-Picciotto, David M Hougaard, Christina M Hultman, Iuliana Ionita-Laza, Suma Jacob, Jesslyn Jamison, Astanand Jugessur, Miia Kaartinen, Gun Peggy Knudsen, Alexander Kolevzon, Itaru Kushima, So Lun Lee, Terho Lehtimäki, Elaine T Lim, Carla Lintas, W. Ian Lipkin, Diego Lopergolo, Fátima Lopes, Yunin Ludena, Patricia Maciel, Per Magnus, Behrang Mahjani, Nell Maltman, Dara S Manoach, Gal Meiri, Idan Menashe, Judith Miller, Nancy Minshew, Eduarda Montenegro M. de Souza, Danielle Moreira, Eric M Morrow, Ole Mors, Preben Bo Mortensen, Matthew Mosconi, Pierandrea Muglia, Benjamin Neale, Merete Nordentoft, Norio Ozaki, Aarno Palotie, Mara Parellada, Maria Rita Passos-Bueno, Margaret Pericak-Vance, Antonio Persico, Isaac Pessah, Kaija Puura, Abraham Reichenberg, Alessandra Renieri, Evelise Riberi, Elise Robinson, Kaitlin E. Samocha, Sven Sandin, Susan L Santangelo, Gerry Schellenberg, Stephen Scherer, Sabine Schlitt, Rebecca Schmidt, Lauren Schmitt, Isabela Maya W. Silva, Tarjinder Singh, Paige Siper, Moyra Smith, Gabriela Soares, Camilla Stoltenberg, Pål Suren, Ezra Susser, John Sweeney, Peter Szatmari, Lara Tang, Flora Tassone, Karoline Teufel, Elisabetta Trabetti, Maria del Pilar Trelles, Christopher Walsh, Lauren Weiss, Thomas Werge, Donna Werling, Emilie M. Wigdor, Emma Wilkinson, Jeremy A Willsey, Timothy Yu, Mullin H.C. Yu, Ryan Yuen, Elaine Zachi, Catalina Betancur, Edwin H. Cook, Louise Gallagher, Michael Gill, Thomas Lehner, Geetha Senthil, James S Sutcliffe, Audrey Thurm, Michael E. Zwick, Anders D. Børglum, Matthew W State, A. Ercument Cicek, Michael E. Talkowski, David J. Cutler, Bernie Devlin, Stephan Sanders, Kathryn Roeder, Joseph D. Buxbaum, Mark J. Daly
We present the largest exome sequencing study of autism spectrum disorder (ASD) to date (n=35,584 total samples, 11,986 with ASD). Using an enhanced Bayesian framework to integrate de novo and case-control rare variation, we identify 102 risk genes at a false discovery rate ≤ 0.1. Of these genes, 49 show higher frequencies of disruptive de novo variants in individuals ascertained for severe neurodevelopmental delay, while 53 show higher frequencies in individuals ascertained for ASD; comparing ASD cases with mutations in these groups reveals phenotypic differences. Expressed early in brain development, most of the risk genes have roles in regulation of gene expression or neuronal communication (i.e., mutations effect neurodevelopmental and neurophysiological changes), and 13 fall within loci recurrently hit by copy number variants. In human cortex single-cell gene expression data, expression of risk genes is enriched in both excitatory and inhibitory neuronal lineages, consistent with multiple paths to an excitatory/inhibitory imbalance underlying ASD.
378 downloads bioinformatics
Targeted optimization of existing DNA sequences for useful properties has the potential to enable several synthetic biology applications, from modifying DNA to treat genetic disorders to designing regulatory elements that fine-tune context-specific gene expression. Current approaches for targeted genome editing are largely based on prior biological knowledge or ad-hoc rules. Few if any machine learning approaches exist for targeted optimization of regulatory DNA sequences. Here, we propose a novel generative neural network architecture for targeted DNA sequence editing, the EDA architecture, consisting of an encoder, decoder, and analyzer. We showcase the use of EDA to optimize regulatory DNA sequences to bind to the transcription factor SPI1. Compared to other state-of-the-art approaches such as a textual variational autoencoder and rule-based editing, EDA significantly improves predicted binding of SPI1 to genomic sequences with a minimal set of edits. We also use EDA to design regulatory elements with optimized grammars of CREB1 binding sites that can tune reporter expression levels as measured by massively parallel reporter assays (MPRA). We analyze the properties of the binding sites in the edited sequences and find patterns that are consistent with previously reported grammatical rules which tie gene expression to CRE binding site density, spacing and affinity.
378 downloads cancer biology
Loss-of-function (LoF) screenings have the potential to reveal novel cancer-specific vulnerabilities, prioritize drug treatments, and inform precision medicine therapeutics. These screenings were traditionally done using shRNAs, but with the recent emergence of CRISPR technology there has been a shift in methodology. However, recent analyses have found large inconsistencies between CRISPR and shRNA essentiality results. Here, we examined the DepMap project, the largest cancer LoF effort undertaken to date, and found a lack of correlation between CRISPR and shRNA LoF results; we further characterized differences between genes found to be essential by either platform. We then introduce ECLIPSE, a machine learning approach that combines genomic, cell line, and experimental design features to predict essential genes and platform-specific essential genes in specific cancer cell lines. We applied ECLIPSE to known drug targets and found that our approach strongly differentiated drugs approved for cancer from those that have not been, and can thus be leveraged to identify potential cancer repurposing opportunities. Overall, ECLIPSE allows for a more comprehensive analysis of gene essentiality and drug development, which neither platform can achieve alone.
377 downloads neuroscience
In mammalian visual cortex, neural tuning to stimulus orientation is organized in either columnar or salt-and-pepper patterns across species. This is often considered to reflect disparate mechanisms of cortical development across mammalian taxa. However, it is unknown whether different cortical architectures are generated by species-specific mechanisms, or simply originate from the variation of biological parameters within a universal principle of development. We analysed neural parameters in eight mammalian species and found that cortical organization is predictable by a single factor: the retino-cortical mapping ratio. We show that a Nyquist sampling model explains parametric division of the patterns with high accuracy and that simulations of controlled mapping conditions reproduce both types of organization. Our results explain the origin of distinct cortical circuits under a universal development process.
377 downloads biochemistry
The accumulation of protein deposits in neurodegenerative diseases involves the presence of a metastable subproteome vulnerable to aggregation. To investigate this subproteome and the mechanisms that regulate it, we measured the proteome solubility of the Neuro2a cell line under protein homeostasis stresses induced by Huntington Disease proteotoxicity; Hsp70, Hsp90, proteasome and ER-mediated folding inhibition; and oxidative stress. We found that one-quarter of the proteome extensively changed solubility. Remarkably, almost all the increases in insolubility were counteracted by increases in solubility of other proteins. Each stress directed a highly specific pattern of change, which reflected the remodelling of protein complexes involved in adaptation to perturbation, most notably stress granule proteins, which responded differently to different stresses. These results indicate that the robustness of protein homeostasis relies on the absence of proteins highly vulnerable to aggregation and on large changes in aggregation state of regulatory mechanisms that restore protein solubility upon specific perturbations.
377 downloads neuroscience
Translational control of memory processes is tightly regulated: the coordinated interaction and modulation of translation factors provides a permissive environment for protein synthesis during memory formation. Existing methods used to block translation lack the spatiotemporal precision to investigate cell-specific contributions to consolidation of long-term memories. Here, we have developed a novel chemogenetic mouse resource for cell type-specific and drug-inducible protein synthesis inhibition (ciPSI) that utilizes an engineered version of the catalytic kinase domain of dsRNA-activated protein kinase (PKR). ciPSI allows rapid and reversible phosphorylation of eIF2α, causing a block on general translation by 50% in vivo. Using this resource, we discovered that temporally structured pan-neuronal protein synthesis is required for consolidation of long-term auditory threat memory. Targeted protein synthesis inhibition in CamK2α-expressing glutamatergic neurons in lateral amygdala (LA) impaired long-term memory, which was recovered with artificial chemogenetic reactivation at the cost of stimulus generalization. Conversely, genetically reducing phosphorylation of eIF2α in CamK2α-positive neurons in LA enhanced memory strength, but was accompanied by reduced memory fidelity and behavioral inflexibility. Our findings provide evidence for a finely tuned translation program during consolidation of long-term threat memories.
376 downloads genetics
Mendelian randomization (MR) is a valuable tool for detecting evidence of causal relationships between pairs of traits. Opportunities to apply MR are growing rapidly as the number of genome-wide association studies (GWAS) with publicly available summary statistics grows. Unfortunately, existing MR methods are prone to false positives caused by pleiotropic variants. Correlated pleiotropy, which arises when genetic variants affect both traits through a heritable shared factor, is a particularly challenging problem and is not addressed by most existing methods. Additionally, most MR methods only use genome-wide significant loci, which can limit power and introduce bias. We propose a new method (Causal Analysis Using Summary Effect Estimates; CAUSE) that uses genome-wide summary statistics to identify patterns that are consistent with causal effects, while accounting for pleiotropic effects, including correlated pleiotropy. We demonstrate in simulations that CAUSE is much better at controlling false positive rate in the presence of pleiotropic effects than other methods. We apply CAUSE to study relationships between pairs of complex traits and between blood cell composition and autoimmune disorders. We find that CAUSE detects causal relationships with strong literature support, including an effect of blood pressure on heart disease risk that is not found using other methods. Our results suggest that many pairs of traits identified as causal using alternative methods may be false positives driven by pleiotropic effects.
375 downloads neuroscience
Jing Ren, Alina Isakova, Drew Friedmann, Jiawei Zeng, Sophie Grutzner, Albert Pun, Grace Q Zhao, Sai Saroja Kolluru, Ruiyu Wang, Rui Lin, Pengcheng Li, Anan Li, Jennifer L Raymond, Qingming Luo, Minmin Luo, Stephen R. Quake, Liqun Luo
Serotonin neurons of the dorsal and medial raphe nuclei (DR and MR) collectively innervate the entire forebrain and midbrain, modulating diverse physiology and behavior. To gain a fundamental understanding of their molecular heterogeneity, we used plate-based single-cell RNA-sequencing to generate a comprehensive dataset comprising eleven transcriptomically distinct serotonin neuron clusters. Systematic in situ hybridization mapped specific clusters to the principal DR, caudal DR, or MR. These transcriptomic clusters differentially express a rich repertoire of neuropeptides, receptors, ion channels, and transcription factors. We generated novel intersectional viral-genetic tools to access specific subpopulations. Whole-brain axonal projection mapping revealed that DR serotonin neurons co-expressing vesicular glutamate transporter-3 preferentially innervate the cortex, whereas those co-expressing thyrotropin-releasing hormone innervate subcortical regions, in particular the hypothalamus. Reconstruction of 50 individual DR serotonin neurons revealed segregated axonal projection patterns at the single-cell level. Together, these results provide a molecular foundation for the heterogeneous serotonin neuronal phenotypes.
375 downloads synthetic biology
Synthetic genetic circuits allow us to modify the behavior of living cells. However, changes in environmental conditions and unforeseen interactions between a circuit and the host cell can cause deviations from a desired function, resulting in the need for time-consuming physical re-assembly to fix these issues. Here, we use a regulatory motif controlling transcription and translation to create genetic devices whose response functions can be dynamically tuned. This approach allows us, after assembly, to shift the on and off states of a sensor by 4.5- and 28-fold, respectively, and modify a genetic NOT gate to allow its transition from an on to off state to be varied over a 7-fold range. In both cases, "tuning" leads to trade-offs in the fold-change and separation between the distributions of cells in on and off states. By using mathematical modelling, we derive design principles that are used to further optimize these devices. This work lays the foundation for adaptive genetic circuits that can be tuned after their physical assembly to maintain functionality across diverse environments and design contexts.
375 downloads cell biology
A major component of cell migration is F-actin polymerization driven membrane protrusion in the front. However, F-actin proximal to the plasma membrane also has a scaffolding role to support and attach the membrane. Here we developed a fluorescent reporter to monitor changes in the density of membrane proximal F-actin during membrane protrusion and cell migration. Strikingly, unlike total F-actin concentration, which is high in the front of migrating cells, the density of membrane proximal F-actin is low in the front and high in the back. Furthermore, local membrane protrusions only form following local decreases in membrane proximal F-actin density. Our study suggests that low density of membrane proximal F-actin is a fundamental structural parameter that locally directs membrane protrusions and globally stabilizes cell polarization during cell migration.
374 downloads genetics
Cassandra N. Spracklen, Momoko Horikoshi, Young Jin Kim, Kuang Lin, Fiona Bragg, Sanghoon Moon, Ken Suzuki, Claudia HT Tam, Yasuharu Tabara, Soo-Heon Kwak, Fumihiko Takeuchi, Jirong Long, Victor JY Lim, Jin-Fang Chai, Chien-Hsiun Chen, Masahiro Nakatochi, Jie Yao, Hyeok Sun Choi, Apoorva K Iyengar, Hannah J Perrin, Sarah M Brotman, Martijn van de Bunt, Anna L Gloyn, Jennifer E Below, Michael Boehnke, Donald W. Bowden, John C Chambers, Anubha Mahajan, Mark I McCarthy, Maggie C.Y. Ng, Lauren E Petty, Weihua Zhang, Andrew P Morris, Linda S Adair, Zheng Bian, Juliana CN Chan, Li-Ching Chang, Miao-Li Chee, Yii-Der Ida Chen, Yuan-Tsong Chen, Zhengming Chen, Lee-Ming Chuang, Shufa Du, Penny Gordon-Larsen, Myron Gross, Xiuqing Guo, Yu Guo, Sohee Han, Annie-Green Howard, Wei Huang, Yi-Jen Hung, Mi Yeong Hwang, Chii-Min Hwu, Sahoko Ichihara, Masato Isono, Hye-Mi Jang, Guozhi Jiang, Jost B Jonas, Yoichiro Kamatani, Tomohiro Katsuya, Takahisa Kawaguchi, Chiea-Chuen Khor, Katsuhiko Kohara, Myung-Shik Lee, Nannette R Lee, Liming Li, Jianjun Liu, Andrea O Luk, Jun Lv, Yukinori Okada, Mark A Pereira, Charumathi Sabanayagam, Jinxiu Shi, Dong Mun Shin, Wing Yee So, Atsushi Takahashi, Brian Tomlinson, Fuu-Jen Tsai, Rob M van Dam, Yong-Bing Xiang, Ken Yamamoto, Toshimasa Yamauchi, Kyungheon Yoon, Canqing Yu, Jian-Min Yuan, Liang Zhang, Wei Zheng, Michiya Igase, Yoon Shin Cho, Jerome I Rotter, Ya-Xing Wang, Wayne HH Sheu, Mitsuhiro Yokota, Jer-Yuarn Wu, Ching-Yu Cheng, Tien-Yin Wong, Xiao-Ou Shu, Norihiro Kato, Kyong-Soo Park, E-Shyong Tai, Fumihiko Matsuda, Woon-Puay Koh, Ronald CW Ma, Shiro Maeda, Iona Y. Millwood, Juyoung Lee, Takashi Kadowaki, Robin G Walters, Bong-Jo Kim, Karen L Mohlke, Xueling Sim
Meta-analyses of genome-wide association studies (GWAS) have identified >240 loci associated with type 2 diabetes (T2D); however, most loci have been identified in analyses of European-ancestry individuals. To examine T2D risk in East Asian individuals, we meta-analyzed GWAS data in 77,418 cases and 356,122 controls. In the main analysis, we identified 298 distinct association signals at 178 loci, and across T2D association models with and without consideration of body mass index and sex, we identified 56 loci newly implicated in T2D predisposition. Common variants associated with T2D in both East Asian and European populations exhibited strongly correlated effect sizes. New associations include signals in/near GDAP1, PTF1A, SIX3, ALDH2, a microRNA cluster, and genes that affect muscle and adipose differentiation. At another locus, eQTLs at two overlapping T2D signals act through two genes, NKX6-3 and ANK1, in different tissues. Association studies in diverse populations identify additional loci and elucidate disease genes, biology, and pathways. Type 2 diabetes (T2D) is a common metabolic disease primarily caused by insufficient insulin production and/or secretion by the pancreatic β cells and insulin resistance in peripheral tissues. Most genetic loci associated with T2D have been identified in populations of European (EUR) ancestry, including a recent meta-analysis of genome-wide association studies (GWAS) of nearly 900,000 individuals of European ancestry that identified >240 loci influencing the risk of T2D. Differences in allele frequency between ancestries affect the power to detect associations within a population, particularly among variants rare or monomorphic in one population but more frequent in another. Although smaller than studies in European populations, a recent T2D meta-analysis in almost 200,000 Japanese individuals identified 28 additional loci.
The relative contributions of different pathways to the pathophysiology of T2D may also differ between ancestry groups. For example, in East Asian (EAS) populations, T2D prevalence is greater than in European populations among people of similar body mass index (BMI) or waist circumference. We performed the largest meta-analysis of East Asian individuals to identify new genetic associations and provide insight into T2D pathogenesis.
374 downloads bioinformatics
As single-cell transcriptomics becomes a mainstream technology, the natural next step is to integrate the accumulating data in order to achieve a common ontology of cell types and states. However, owing to various nuisance factors of variation, it is not straightforward how to compare gene expression levels across data sets and how to automatically assign cell type labels in a new data set based on existing annotations. In this manuscript, we demonstrate that our previously developed method, scVI, provides an effective and fully probabilistic approach for joint representation and analysis of cohorts of single-cell RNA-seq data sets, while accounting for uncertainty caused by biological and measurement noise. We also introduce single-cell ANnotation using Variational Inference (scANVI), a semi-supervised variant of scVI designed to leverage any available cell state annotations --- for instance when only one data set in a cohort is annotated, or when only a few cells in a single data set can be labeled using marker genes. We demonstrate that scVI and scANVI compare favorably to the existing methods for data integration and cell state annotation in terms of accuracy, scalability, and adaptability to challenging settings such as a hierarchical structure of cell state labels. We further show that different from existing methods, scVI and scANVI represent the integrated datasets with a single generative model that can be directly used for any probabilistic decision making task, using differential expression as our case study. scVI and scANVI are available as open source software and can be readily used to facilitate cell state annotation and help ensure consistency and reproducibility across studies.
374 downloads cancer biology
Peter Priestley, Jonathan Baber, Martijn Lolkema, Neeltje Steeghs, Ewart de Bruijn, Korneel Duyvesteyn, Susan Haidari, Arne van Hoeck, Wendy Onstenk, Paul Roepman, Charles Shale, Mircea Voda, Haiko Bloemendal, Vivianne Tjan-Heijnen, Carla van Herpen, Mariette Labots, Petronella Witteveen, Egbert Smit, Stefan Sleijfer, Emile Voest, Edwin Cuppen
Metastatic cancer is one of the major causes of death and is associated with poor treatment efficiency. A better understanding of the characteristics of late stage cancer is required to help tailor personalised treatment, reduce overtreatment and improve outcomes. Here we describe the largest pan-cancer study of metastatic solid tumor genomes, including 2,520 whole genome-sequenced tumor-normal pairs, analyzed at a median depth of 106x and 38x respectively, and surveying over 70 million somatic variants. Metastatic lesions were found to be very diverse, with mutation characteristics reflecting those of the primary tumor types, although with high rates of whole genome duplication events (56%). Metastatic lesions are relatively homogeneous with the vast majority (96%) of driver mutations being clonal and up to 80% of tumor suppressor genes bi-allelically inactivated through different mutational mechanisms. For 62% of all patients, genetic variants that may be associated with outcome of approved or experimental therapies were detected. These actionable events were distributed across various mutation types underlining the importance of comprehensive genomic tumor profiling for cancer precision medicine.
373 downloads bioinformatics
Allen W Zhang, Ciara O'Flanagan, Elizabeth Chavez, Jamie LP Lim, Andrew McPherson, Matt Wiens, Pascale Walters, Tim Chan, Brittany Hewitson, Daniel Lai, Anja Mottok, Clementine Sarkozy, Lauren Chong, Tomohiro Aoki, Xuehai Wang, Andrew P Weng, Jessica N. McAlpine, Samuel Aparicio, Christian Steidl, Kieran R Campbell, Sohrab P Shah
Single-cell RNA sequencing (scRNA-seq) has transformed biomedical research, enabling decomposition of complex tissues into disaggregated, functionally distinct cell types. For many applications, investigators wish to identify cell types with known marker genes. Typically, such cell type assignments are performed through unsupervised clustering followed by manual annotation based on these marker genes, or via "mapping" procedures to existing data. However, the manual interpretation required in the former case scales poorly to large datasets, which are also often prone to batch effects, while existing data for purified cell types must be available for the latter. Furthermore, unsupervised clustering can be error-prone, leading to under- and over- clustering of the cell types of interest. To overcome these issues we present CellAssign, a probabilistic model that leverages prior knowledge of cell type marker genes to annotate scRNA-seq data into pre-defined and de novo cell types. CellAssign automates the process of assigning cells in a highly scalable manner across large datasets while simultaneously controlling for batch and patient effects. We demonstrate the analytical advantages of CellAssign through extensive simulations and exemplify real-world utility to profile the spatial dynamics of high-grade serous ovarian cancer and the temporal dynamics of follicular lymphoma. Our analysis reveals subclonal malignant phenotypes and points towards an evolutionary interplay between immune and cancer cell populations with cancer cells escaping immune recognition.
373 downloads genomics
Julien Bryois, Nathan G. Skene, Thomas Folkmann Hansen, Lisette J.A. Kogelman, Hunna J Watson, Eating Disorders Working Group of the Psychiatric, International Headache Genetics Consortium, The 23andMe Research Team, Leo Brueggeman, Gerome Breen, Cynthia M. Bulik, Ernest Arenas, Jens Hjerling-Leffler, Patrick F Sullivan
Genome-wide association studies (GWAS) have discovered hundreds of loci associated with complex brain disorders, and provide the best current insights into the etiology of these idiopathic traits. However, it remains unclear in which cell types these variants are active, which is essential for understanding etiology and subsequent experimental modeling. Here we integrate GWAS results with single-cell transcriptomic data from the entire mouse nervous system to systematically identify cell types underlying psychiatric disorders, neurological diseases, and brain complex traits. We show that psychiatric disorders are predominantly associated with cortical and hippocampal excitatory neurons, and medium spiny neurons from the striatum. Cognitive traits were generally associated with similar cell types but their associations were driven by different genes. Neurological diseases were associated with different cell types, which is consistent with other lines of evidence. Notably, we found that Parkinson's disease is not only genetically associated with dopaminergic neurons but also with serotonergic neurons and cells of the oligodendrocyte lineage. Using post-mortem brain transcriptomic data, we confirmed alterations in these cells, even at the earliest stages of disease progression. Our study provides an important framework for understanding the cellular basis of complex brain maladies, and reveals an unexpected role of oligodendrocytes in Parkinson's disease.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!