
Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 52,258 bioRxiv papers from 242,323 authors.

Most downloaded bioRxiv papers since the beginning of last month

50,416 results found.

1: Benchmarking Single-Cell RNA Sequencing Protocols for Cell Atlas Projects

Posted to bioRxiv 13 May 2019

4,288 downloads genomics

Elisabetta Mereu, Atefeh Lafzi, Catia Moutinho, Christoph Ziegenhain, Davis J. MacCarthy, Adrian Alvarez, Eduard Batlle, Sagar, Dominic Grün, Julia K. Lau, Stéphane Boutet, Chad Sanada, Aik Ooi, Robert C. Jones, Kelly Kaihara, Chris Brampton, Yasha Talaga, Yohei Sasagawa, Kaori Tanaka, Tetsutaro Hayashi, Itoshi Nikaido, Cornelius Fischer, Sascha Sauer, Timo Trefzer, Christian Conrad, Xian Adiconis, Lan T. Nguyen, Aviv Regev, Joshua Z Levin, Aleksandar Janjic, Lucas E. Wange, Johannes W. Bagnoli, Swati Parekh, Wolfgang Enard, Marta Gut, Rickard Sandberg, Ivo G Gut, Oliver Stegle, Holger Heyn

Single-cell RNA sequencing (scRNA-seq) is the leading technique for charting the molecular properties of individual cells. The latest methods are scalable to thousands of cells, enabling in-depth characterization of sample composition without prior knowledge. However, there are important differences between scRNA-seq techniques, and it remains unclear which are the most suitable protocols for drawing cell atlases of tissues, organs and organisms. We have generated benchmark datasets to systematically evaluate techniques in terms of their power to comprehensively describe cell types and states. We performed a multi-center study comparing 13 commonly used single-cell and single-nucleus RNA-seq protocols using a highly heterogeneous reference sample resource. Comparative and integrative analysis at cell type and state level revealed marked differences in protocol performance, highlighting a series of key features for cell atlas projects. These should be considered when defining guidelines and standards for international consortia, such as the Human Cell Atlas project.

2: Comprehensive integration of single cell data

Posted to bioRxiv 02 Nov 2018

4,217 downloads genomics

Tim Stuart, Andrew Butler, Paul Hoffman, Christoph Hafemeister, Efthymia Papalexi, William M. Mauck, Marlon Stoeckius, Peter Smibert, Rahul Satija

Single cell transcriptomics (scRNA-seq) has transformed our ability to discover and annotate cell types and states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, including high-dimensional immunophenotypes, chromatin accessibility, and spatial positioning, a key analytical challenge is to integrate these datasets into a harmonized atlas that can be used to better understand cellular identity and function. Here, we develop a computational strategy to "anchor" diverse datasets together, enabling us to integrate and compare single cell measurements not only across scRNA-seq technologies, but different modalities as well. After demonstrating substantial improvement over existing methods for data integration, we anchor scRNA-seq experiments with scATAC-seq datasets to explore chromatin differences in closely related interneuron subsets, and project single cell protein measurements onto a human bone marrow atlas to annotate and characterize lymphocyte populations. Lastly, we demonstrate how anchoring can harmonize in-situ gene expression and scRNA-seq datasets, allowing for the transcriptome-wide imputation of spatial gene expression patterns, and the identification of spatial relationships between mapped cell types in the visual cortex. Our work presents a strategy for comprehensive integration of single cell data, including the assembly of harmonized references, and the transfer of information across datasets. Availability: Installation instructions, documentation, and tutorials are available at: https://www.satijalab.org/seurat

3: Report of Partial findings from the National Toxicology Program Carcinogenesis Studies of Cell Phone Radiofrequency Radiation in Hsd: Sprague Dawley® SD rats (Whole Body Exposure)

Posted to bioRxiv 26 May 2016

4,126 downloads cancer biology

Michael Wyde, Mark Cesta, Chad Blystone, Susan Elmore, Paul Foster, Michelle Hooth, Grace Kissling, David Malarkey, Robert Sills, Matthew Stout, Nigel Walker, Kristine Witt, Mary Wolfe, John Bucher

The U.S. National Toxicology Program (NTP) has carried out extensive rodent toxicology and carcinogenesis studies of radiofrequency radiation (RFR) at frequencies and modulations used in the U.S. telecommunications industry. This report presents partial findings from these studies. The occurrences of two tumor types in male Harlan Sprague Dawley rats exposed to RFR, malignant gliomas in the brain and schwannomas of the heart, were considered of particular interest and are the subject of this report. The findings in this report were reviewed by expert peer reviewers selected by the NTP and National Institutes of Health (NIH). These reviews and responses to comments are included as appendices to this report, and revisions to the current document have incorporated and addressed these comments. When the studies are completed, they will undergo additional peer review before publication in full as part of the NTP's Toxicology and Carcinogenesis Technical Reports Series. No portion of this work has been submitted for publication in a scientific journal. Supplemental information in the form of four additional manuscripts has or will soon be submitted for publication. These manuscripts describe in detail the designs and performance of the RFR exposure system, the dosimetry of RFR exposures in rats and mice, the results to a series of pilot studies establishing the ability of the animals to thermoregulate during RFR exposures, and studies of DNA damage. (1) Capstick M, Kuster N, Kuhn S, Berdinas-Torres V, Wilson P, Ladbury J, Koepke G, McCormick D, Gauger J, and Melnick R. A radio frequency radiation reverberation chamber exposure system for rodents; (2) Yijian G, Capstick M, McCormick D, Gauger J, Horn T, Wilson P, Melnick RL, and Kuster N. Life time dosimetric assessment for mice and rats exposed to cell phone radiation; (3) Wyde ME, Horn TL, Capstick M, Ladbury J, Koepke G, Wilson P, Stout MD, Kuster N, Melnick R, Bucher JR, and McCormick D. Pilot studies of the National Toxicology Program's cell phone radiofrequency radiation reverberation chamber exposure system; (4) Smith-Roe SL, Wyde ME, Stout MD, Winters J, Hobbs CA, Shepard KG, Green A, Kissling GE, Tice RR, Bucher JR, and Witt KL. Evaluation of the genotoxicity of cell phone radiofrequency radiation in male and female rats and mice following subchronic exposure.

4: Biological Structure and Function Emerge from Scaling Unsupervised Learning to 250 Million Protein Sequences

Posted to bioRxiv 29 Apr 2019

3,799 downloads synthetic biology

Alexander Rives, Siddharth Goyal, Joshua Meier, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, Rob Fergus

In the field of artificial intelligence, a combination of scale in data and model capacity enabled by unsupervised learning has led to major advances in representation learning and statistical generation. In biology, the anticipated growth of sequencing promises unprecedented data on natural sequence diversity. Learning the natural distribution of evolutionary protein sequence variation is a logical step toward predictive and generative modeling for biology. To this end we use unsupervised learning to train a deep contextual language model on 86 billion amino acids across 250 million sequences spanning evolutionary diversity. The resulting model maps raw sequences to representations of biological properties without labels or prior domain knowledge. The learned representation space organizes sequences at multiple levels of biological granularity from the biochemical to proteomic levels. Learning recovers information about protein structure: secondary structure and residue-residue contacts can be extracted by linear projections from learned representations. With small amounts of labeled data, the ability to identify tertiary contacts is further improved. Learning on full sequence diversity rather than individual protein families increases recoverable information about secondary structure. We show the networks generalize by adapting them to variant activity prediction from sequences only, with results that are comparable to a state-of-the-art variant predictor that uses evolutionary and structurally derived features.

5: Toxicity of JUUL Fluids and Aerosols Correlates Strongly with Nicotine and Some Flavor Chemical Concentrations

Posted to bioRxiv 09 Dec 2018

3,400 downloads pharmacology and toxicology

Esther Omaiye, Kevin J McWhirter, Wentai Luo, James F Pankow, Prue Talbot

While JUUL electronic cigarettes (ECs) have captured the majority of the EC market with a large fraction of their sales going to adolescents, little is known about their cytotoxicity and potential effects on health. The purpose of this study was to determine flavor chemical and nicotine concentrations in the eight currently marketed pre-filled JUUL EC cartridges (pods) and to evaluate the cytotoxicity of the different variants (e.g., Cool Mint and Creme Brulee) using in vitro assays. Nicotine and flavor chemicals were analyzed using gas chromatography/mass spectrometry in pod fluid before and after vaping and in the corresponding aerosols. 59 flavor chemicals were identified in JUUL pod fluids, and three were >1 mg/mL. Duplicate pods were similar in flavor chemical composition and concentration. Nicotine concentrations (average 60.9 mg/mL) were significantly higher than any EC products we have analyzed previously. Transfer efficiency of individual flavor chemicals that were >1 mg/mL and nicotine from the pod fluid into aerosols was generally 35-80%. All pod fluids were cytotoxic at a 1:10 dilution (10%) in the MTT and neutral red uptake assays when tested with BEAS-2B lung epithelial cells. Most aerosols were cytotoxic in these assays at concentrations >1%. The cytotoxicity of aerosols was highly correlated with nicotine and ethyl maltol concentrations and moderately to weakly correlated with total flavor chemical concentration and menthol concentration. Our study demonstrates that: (1) some JUUL flavor pods have high concentrations of flavor chemicals that may make them attractive to youth, and (2) the concentrations of nicotine and some flavor chemicals (e.g., ethyl maltol) are high enough to be cytotoxic in acute in vitro assays, emphasizing the need to determine if JUUL products will lead to adverse health effects with chronic use.

6: Using Deep Learning to Annotate the Protein Universe

Posted to bioRxiv 03 May 2019

3,317 downloads bioinformatics

Maxwell L Bileschi, David Belanger, Drew H Bryant, Theo Sanderson, Brandon Carter, D. Sculley, Mark L DePristo, Lucy J Colwell

Understanding the relationship between amino acid sequence and protein function is a long-standing problem in molecular biology with far-reaching scientific implications. Despite six decades of progress, state-of-the-art techniques cannot annotate 1/3 of microbial protein sequences, hampering our ability to exploit sequences collected from diverse organisms. To address this, we report a deep learning model that learns the relationship between unaligned amino acid sequences and their functional classification across all 17929 families of the Pfam database. Using the Pfam seed sequences we establish a rigorous benchmark assessment and find a dilated convolutional model that reduces the error of both BLASTp and pHMMs by a factor of nine. Using 80% of the full Pfam database we train a protein family predictor that is more accurate and over 200 times faster than BLASTp, while learning sequence features it was not trained on such as structural disorder and transmembrane helices. Our model co-locates sequences from unseen families in embedding space, allowing sequences from novel families to be accurately annotated. These results suggest deep learning models will be a core component of future protein function prediction tools.

7: Systematic comparative analysis of single cell RNA-sequencing methods

Posted to bioRxiv 09 May 2019

3,231 downloads genomics

Jiarui Ding, Xian Adiconis, Sean K Simmons, Monika S. Kowalczyk, Cynthia C. Hession, Nemanja D. Marjanovic, Travis K Hughes, Marc H Wadsworth, Tyler Burks, Lan T. Nguyen, John Y. H. Kwon, Boaz Barak, William Ge, Amanda J. Kedaigle, Shaina Carroll, Shuqiang Li, Nir Hacohen, Orit Rozenblatt-Rosen, Alex K Shalek, Alexandra-Chloé Villani, Aviv Regev, Joshua Z Levin

A multitude of single-cell RNA sequencing methods have been developed in recent years, with dramatic advances in scale and power, enabling major discoveries and large-scale cell mapping efforts. However, these methods have not been systematically and comprehensively benchmarked. Here, we directly compare seven methods for single cell and/or single nucleus profiling from three types of samples -- cell lines, peripheral blood mononuclear cells and brain tissue -- generating 36 libraries in six separate experiments in a single center. To analyze these datasets, we developed and applied scumi, a flexible computational pipeline that can be used for any scRNA-seq method. We evaluated the methods for both basic performance and for their ability to recover known biological information in the samples. Our study will help guide experiments with the methods in this study as well as serve as a benchmark for future studies and for computational algorithm development.

8: A guide to performing Polygenic Risk Score analyses

Posted to bioRxiv 14 Sep 2018

2,926 downloads genomics

Shing Wan Choi, Timothy Mak, Paul F O'Reilly

The application of polygenic risk scores (PRS) has become routine in genetic epidemiological studies. Among a range of applications, PRS are commonly used to assess shared aetiology among different phenotypes and to evaluate the predictive power of genetic data, while they are also now being exploited as part of study design, in which experiments are performed on individuals, or their biological samples (e.g., tissues, cells), at the tails of the PRS distribution and contrasted. As GWAS sample sizes increase and PRS become more powerful, they are also set to play a key role in personalised medicine. Despite their growing application and importance, there are limited guidelines for performing PRS analyses, which can lead to inconsistency between studies and misinterpretation of results. Here we provide detailed guidelines for performing polygenic risk score analyses relevant to different methods for their calculation, outlining standard quality control steps and offering recommendations for best practice. We also discuss different methods for the calculation of PRS, common misconceptions regarding the interpretation of results and future challenges.

9: The Genomic Formation of South and Central Asia

Posted to bioRxiv 31 Mar 2018

2,738 downloads genomics

Vagheesh M Narasimhan, Nick J Patterson, Priya Moorjani, Iosif Lazaridis, Lipson Mark, Swapan Mallick, Nadin Rohland, Rebecca Bernardos, Alexander M Kim, Nathan Nakatsuka, Inigo Olalde, Alfredo Coppa, James Mallory, Vyacheslav Moiseyev, Janet Monge, Luca M Olivieri, Nicole Adamski, Nasreen Broomandkhoshbacht, Francesca Candilio, Olivia Cheronet, Brendan J Culleton, Matthew Ferry, Daniel Fernandes, Beatriz Gamarra, Daniel Gaudio, Mateja Hajdinjak, Eadaoin Harney, Thomas K Harper, Denise Keating, Ann-Marie Lawson, Megan Michel, Mario Novak, Jonas Oppenheimer, Niraj Rai, Kendra Sirak, Viviane Slon, Kristin Stewardson, Zhao Zhang, Gaziz Akhatov, Anatoly N Bagashev, Baurzhan Baitanayev, Gian Luca Bonora, Tatiana Chikisheva, Anatoly Derevianko, Enshin Dmitry, Katerina Douka, Nadezhda Dubova, Andrey Epimakhov, Suzanne Freilich, Dorian Fuller, Alexander Goryachev, Andrey Gromov, Bryan Hanks, Margaret Judd, Erlan Kazizov, Aleksander Khokhlov, Egor Kitov, Elena Kupriyanova, Pavel Kuznetsov, Donata Luiselli, Farhad Maksudov, Chris Meiklejohn, Deborah C Merrett, Roberto Micheli, Oleg Mochalov, Zahir Muhammed, Samridin Mustafakulov, Ayushi Nayak, Rykun M Petrovna, Davide Pettner, Richard Potts, Dmitry Razhev, Stefania Sarno, Kulyan Sikhymbaevae, Sergey M Slepchenko, Nadezhda Stepanova, Svetlana Svyatko, Sergey Vasilyev, Massimo Vidale, Dima Voyakin, Antonina Yermolayeva, Alisa Zubova, Vasant S Shinde, Carles Lalueza-Fox, Matthias Meyer, David Anthony, Nicole Boivin, Kumarasmy Thangaraj, Douglas Kennett, Michael Frachetti, Ron Pinhasi, David Reich

The genetic formation of Central and South Asian populations has been unclear because of an absence of ancient DNA. To address this gap, we generated genome-wide data from 362 ancient individuals, including the first from eastern Iran, Turan (Uzbekistan, Turkmenistan, and Tajikistan), Bronze Age Kazakhstan, and South Asia. Our data reveal a complex set of genetic sources that ultimately combined to form the ancestry of South Asians today. We document a southward spread of genetic ancestry from the Eurasian Steppe, correlating with the archaeologically known expansion of pastoralist sites from the Steppe to Turan in the Middle Bronze Age (2300-1500 BCE). These Steppe communities mixed genetically with peoples of the Bactria Margiana Archaeological Complex (BMAC) whom they encountered in Turan (primarily descendants of earlier agriculturalists of Iran), but there is no evidence that the main BMAC population contributed genetically to later South Asians. Instead, Steppe communities integrated farther south throughout the 2nd millennium BCE, and we show that they mixed with a more southern population that we document at multiple sites as outlier individuals exhibiting a distinctive mixture of ancestry related to Iranian agriculturalists and South Asian hunter-gatherers. We call this group Indus Periphery because they were found at sites in cultural contact with the Indus Valley Civilization (IVC) and along its northern fringe, and also because they were genetically similar to post-IVC groups in the Swat Valley of Pakistan. By co-analyzing ancient DNA and genomic data from diverse present-day South Asians, we show that Indus Periphery-related people are the single most important source of ancestry in South Asia (consistent with the idea that the Indus Periphery individuals are providing us with the first direct look at the ancestry of peoples of the IVC), and we develop a model for the formation of present-day South Asians in terms of the temporally and geographically proximate sources of Indus Periphery-related, Steppe, and local South Asian hunter-gatherer-related ancestry. Our results show how ancestry from the Steppe genetically linked Europe and South Asia in the Bronze Age, and identify the populations that almost certainly were responsible for spreading Indo-European languages across much of Eurasia.

10: Cellular and Molecular Probing of Intact Transparent Human Organs

Posted to bioRxiv 21 May 2019

2,277 downloads cell biology

Shan Zhau, Mihail Ivilinov Todorov, Ruiyao Cai, Hanno Steinke, Elisabeth Kemter, Eckhard Wolf, Jan Lipfert, Ingo Bechmann, Ali Erturk

Optical tissue transparency permits cellular and molecular investigation of complex tissues in 3D, a fundamental need in biomedical sciences. Adult human organs are particularly challenging for this approach, owing to the accumulation of dense and sturdy molecules in decades-aged human tissues. Here, we introduce the SHANEL method, which uses a new tissue permeabilization approach to clear and label stiff human organs. We used SHANEL to generate the first intact transparent adult human brain and kidney, and to perform 3D histology using antibodies and dyes at centimeter-scale depth. Thereby, we revealed structural details of the sclera, iris and suspensory ligament in the human eye, and the vessels and glomeruli in the human kidney. We also applied SHANEL to transgenic pig organs to map complex structures of EGFP-expressing beta cells in a pancreas >10 cm in size. Overall, SHANEL is a robust and unbiased technology to chart the cellular and molecular architecture of intact large mammalian organs.

11: Variable prediction accuracy of polygenic scores within an ancestry group

Posted to bioRxiv 07 May 2019

2,169 downloads genetics

Hakhamanesh Mostafavi, Arbel Harpak, Dalton Conley, Jonathan K Pritchard, Molly Przeworski

Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group, the prediction accuracy of polygenic scores depends on characteristics such as the age or sex composition of the individuals in which the GWAS and the prediction were conducted, and on the GWAS study design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.

12: Resolving the 3D landscape of transcription-linked mammalian chromatin folding

Posted to bioRxiv 17 May 2019

2,096 downloads genomics

Tsung-Han S Hsieh, Elena Slobodyanyuk, Anders Sejr Hansen, Claudia Cattoglio, Oliver Rando, Robert Tjian, Xavier Darzacq

Chromatin folding below the scale of topologically associating domains (TADs) remains largely unexplored in mammals. Here, we used a high-resolution 3C-based method, Micro-C, to probe links between 3D-genome organization and transcriptional regulation in mouse stem cells. Combinatorial binding of transcription factors, cofactors, and chromatin modifiers spatially segregates TAD regions into "microTADs" with distinct regulatory features. Enhancer-promoter and promoter-promoter interactions extending from the edge of these domains predominantly link co-regulated loci, often independently of CTCF/Cohesin. Acute inhibition of transcription disrupts the gene-related folding features without altering higher-order chromatin structures. Intriguingly, we detect "two-start" zig-zag 30-nanometer chromatin fibers. Our work uncovers the finer-scale genome organization that establishes novel functional links between chromatin folding and gene regulation.

13: A tutorial on Gaussian process regression: Modelling, exploring, and exploiting functions

Posted to bioRxiv 19 Dec 2016

1,950 downloads animal behavior and cognition

Eric Schulz, Maarten Speekenbrink, Andreas Krause

This tutorial introduces the reader to Gaussian process regression as an expressive tool to model, actively explore and exploit unknown functions. Gaussian process regression is a powerful, non-parametric Bayesian approach towards regression problems that can be utilized in exploration and exploitation scenarios. This tutorial aims to provide an accessible introduction to these techniques. We will introduce Gaussian processes which generate distributions over functions used for Bayesian non-parametric regression, and demonstrate their use in applications and didactic examples including simple regression problems, a demonstration of kernel-encoded prior assumptions and compositions, a pure exploration scenario within an optimal design framework, and a bandit-like exploration-exploitation scenario where the goal is to recommend movies. Beyond that, we describe a situation modelling risk-averse exploration in which an additional constraint (not to sample below a certain threshold) needs to be accounted for. Lastly, we summarize recent psychological experiments utilizing Gaussian processes. Software and literature pointers are also provided.

14: Clonal replacement of tumor-specific T cells following PD-1 blockade

Posted to bioRxiv 24 May 2019

1,936 downloads immunology

Kathryn E Yost, Ansuman T. Satpathy, Daniel K. Wells, Yanyan Qi, Chunlin Wang, Robin Kageyama, Katherine McNamara, Jeffrey M. Granja, Kavita Y. Sarin, Ryanne A. Brown, Rohit K. Gupta, Christina Curtis, Samantha L. Bucktrout, Mark M. Davis, Anne Lynn S. Chang, Howard Y. Chang

Immunotherapies that block inhibitory checkpoint receptors on T cells have transformed the clinical care of cancer patients. However, which tumor-specific T cells are mobilized following checkpoint blockade remains unclear. Here, we performed paired single-cell RNA- and T cell receptor (TCR)- sequencing on 79,046 cells from site-matched tumors from patients with basal cell carcinoma (BCC) or squamous cell carcinoma (SCC) pre- and post-anti-PD-1 therapy. Tracking TCR clones and transcriptional phenotypes revealed a coupling of tumor-recognition, clonal expansion, and T cell dysfunction: the T cell response to treatment was accompanied by clonal expansions of CD8+CD39+ T cells, which co-expressed markers of chronic T cell activation and exhaustion. However, this expansion did not derive from pre-existing tumor infiltrating T cell clones; rather, it comprised novel clonotypes, which were not previously observed in the same tumor. Clonal replacement of T cells was preferentially observed in exhausted CD8+ T cells, compared to other distinct T cell phenotypes, and was evident in BCC and SCC patients. These results, enabled by single-cell multi-omic profiling of clinical samples, demonstrate that pre-existing tumor-specific T cells may be limited in their capacity for re-invigoration, and that the T cell response to checkpoint blockade relies on the expansion of a distinct repertoire of T cell clones that may have just recently entered the tumor.

15: Genome-wide Variants of Eurasian Facial Shape Differentiation and a prospective model of DNA based Face Prediction

Posted to bioRxiv 10 Jul 2016

1,885 downloads genetics

Lu Qiao, Yajun Yang, Pengcheng Fu, Sile Hu, Hang Zhou, Jingze Tan, Yan Lu, Haiyi Lou, Dongsheng Lu, Sijie Wu, Jing Guo, Shouneng Peng, Li Jin, Yaqun Guan, Sijia Wang, Shuhua Xu, Kun Tang

It is a long-standing question as to which genes define the characteristic facial features among different ethnic groups. In this study, we use Uyghurs, an ancient admixed population, to query the genetic basis of why Europeans and Han Chinese look different. Facial traits were analyzed based on high-density 3D facial images; numerous biometric spaces were examined for divergent facial features between European and Han Chinese, ranging from inter-landmark distances to dense shape geometrics. Genome-wide association analyses were conducted on a discovery panel of Uyghurs. Six significant loci were identified, four of which, rs1868752, rs118078182, rs60159418 at or near UBASH3B, COL23A1, PCDH7 and rs17868256, were replicated in independent cohorts of Uyghurs or Southern Han Chinese. A prospective model was also developed to predict 3D faces based on top GWAS signals, and tested in hypothetical forensic scenarios.

16: Neural Population Control via Deep Image Synthesis

Posted to bioRxiv 04 Nov 2018

1,824 downloads neuroscience

Pouya Bashivan, Kohitij Kar, James J DiCarlo

Particular deep artificial neural networks (ANNs) are today's most accurate models of the primate brain's ventral visual stream. Here we report that, using a targeted ANN-driven image synthesis method, new luminous power patterns (i.e. images) can be applied to the primate retinae to predictably push the spiking activity of targeted V4 neural sites beyond naturally occurring levels. More importantly, this method, while not yet perfect, already achieves unprecedented independent control of the activity state of entire populations of V4 neural sites, even those with overlapping receptive fields. These results show how the knowledge embedded in today's ANN models might be used to non-invasively set desired internal brain states at neuron-level resolution, and suggest that more accurate ANN models would produce even more accurate control.

17: Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression

Posted to bioRxiv 14 Mar 2019

1,802 downloads genomics

Christoph Hafemeister, Rahul Satija

Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from 'regularized negative binomial regression', where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation, and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform (https://github.com/ChristophH/sctransform), with a direct interface to our single-cell toolkit Seurat.

18: Spatiotemporal limits of optogenetic manipulations in cortical circuits

Posted to bioRxiv 20 May 2019

1,768 downloads neuroscience

Nuo Li, Susu Chen, Zengcai V. Guo, Han Chen, Yan Huo, Hidehiko Inagaki, Courtney Davis, David Hansel, Caiying Guo, Karel Svoboda

Neuronal inactivation is commonly used to assess the involvement of groups of neurons in specific brain functions. Optogenetic tools allow manipulations of genetically and spatially defined neuronal populations with excellent temporal resolution. However, the targeted neurons are coupled with other neural populations over multiple length scales. As a result, the effects of localized optogenetic manipulations are not limited to the targeted neurons, but produce spatially extended excitation and inhibition with rich dynamics. Here we benchmarked several optogenetic silencers in transgenic mice and with viral gene transduction, with the goal to inactivate excitatory neurons in small regions of neocortex. We analyzed the effects of the perturbations in vivo using electrophysiology. Channelrhodopsin activation of GABAergic neurons produced more effective photoinhibition of pyramidal neurons than direct photoinhibition using light-gated ion pumps. We made transgenic mice expressing the light-dependent chloride channel GtACR under the control of Cre-recombinase. Activation of GtACR produced the most potent photoinhibition. For all methods, localized photostimuli produced photoinhibition that extended substantially beyond the spread of light in tissue, although different methods had slightly different resolution limits (radius of inactivation, 0.5 mm to 1 mm). The spatial profile of photoinhibition was likely shaped by strong coupling between cortical neurons. Over some range of photostimulation, circuits produced the "paradoxical effect", where excitation of inhibitory neurons reduced activity in these neurons, together with pyramidal neurons, a signature of inhibition-stabilized neural networks. The offset of optogenetic inactivation was followed by rebound excitation in a light dose-dependent manner, which can be mitigated by slowly varying photostimuli, but at the expense of time resolution. Our data offer guidance for the design of in vivo optogenetics experiments and suggest how these experiments can reveal operating principles of neural circuits.

19: Performance of neural network basecalling tools for Oxford Nanopore sequencing

Posted to bioRxiv 07 Feb 2019

1,752 downloads bioinformatics

Ryan R Wick, Louise M Judd, Kathryn Holt

Basecalling, the computational process of translating raw electrical signal to nucleotide sequence, is of critical importance to the sequencing platforms produced by Oxford Nanopore Technologies (ONT). Here we examine the performance of different basecalling tools, looking at accuracy at the level of bases within individual reads and at majority-rules consensus basecalls in an assembly. We also investigate some additional aspects of basecalling: training using a taxon-specific dataset, using a larger neural network model and improving consensus basecalls in an assembly by additional signal-level analysis with Nanopolish. Training basecallers on taxon-specific data results in a significant boost in consensus accuracy, mostly due to the reduction of errors in methylation motifs. A larger neural network is able to improve both read and consensus accuracy, but at a cost to speed. Improving consensus sequences ('polishing') with Nanopolish somewhat negates the accuracy differences between basecallers, but pre-polish accuracy does have an effect on post-polish accuracy. Basecalling accuracy has seen significant improvements over the last two years. The current version of ONT's Guppy basecaller performs well overall, with good accuracy and fast performance. If higher accuracy is required, users should consider producing a custom model using a larger neural network and/or training data from the same species.

20: Evolving super stimuli for real neurons using deep generative networks

Posted to bioRxiv 17 Jan 2019

1,731 downloads neuroscience

Carlos R Ponce, Will Xiao, Peter Schade, Till S Hartmann, Gabriel Kreiman, Margaret S Livingstone

Finding the best stimulus for a neuron is challenging because it is impossible to test all possible stimuli. Here we used a vast, unbiased, and diverse hypothesis space encoded by a generative deep neural network model to investigate neuronal selectivity in inferotemporal cortex without making any assumptions about natural features or categories. A genetic algorithm, guided by neuronal responses, searched this space for optimal stimuli. Evolved synthetic images evoked higher firing rates than even the best natural images and revealed diagnostic features, independently of category or feature selection. This approach provides a way to investigate neural selectivity in any modality that can be represented by a neural network and challenges our understanding of neural coding in visual cortex.
