Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 42,904 bioRxiv papers from 193,370 authors.

Most tweeted bioRxiv papers, last 24 hours

258 results found.

61: Generating High Density, Low Cost Genotype Data in Soybean [Glycine max (L.) Merr.]

Posted to bioRxiv 13 Feb 2019

4 tweets genomics

Mary Happ, Haichuan Wang, George Graef, David Hyten

Obtaining genome-wide genotype information for millions of SNPs in soybean [Glycine max (L.) Merr.] often involves completely resequencing a line at 5X or greater coverage. Currently, hundreds of soybean lines have been resequenced at high depth levels with their data deposited in the NCBI short read archive. This publicly available dataset may be leveraged as an imputation reference panel in combination with skim (low coverage) sequencing of new soybean genotypes to economically obtain high-density SNP information. Ninety-nine soybean lines resequenced at an average of 17.1X were used to generate a reference panel, with over 10 million SNPs called using GATK's Haplotype Caller tool. Whole genome resequencing at approximately 1X depth was performed on 114 previously ungenotyped experimental soybean lines. Coverages down to 0.1X were analyzed by randomly subsetting raw reads from the original 1X sequence data. SNPs discovered in the reference panel were genotyped in the experimental lines after aligning to the soybean reference genome, and missing markers imputed using Beagle 4.1. Sequencing depth of the experimental lines could be reduced to 0.3X while still retaining an accuracy of 97.8%. Accuracy was inversely related to minor allele frequency, and highly correlated with marker linkage disequilibrium. The high accuracy of skim sequencing combined with imputation provides a low cost method for obtaining dense genotypic information that can be used for various genomics applications in soybean.
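The random read subsetting mentioned in this abstract can be illustrated with a minimal sketch. It assumes the reads fit in memory as a list and that keeping each read with probability target/original approximates the target coverage. The function name `subsample_reads`, the seed, and the toy read names are hypothetical and are not taken from the paper.

```python
import random

def subsample_reads(reads, original_cov, target_cov, seed=0):
    """Keep each read independently with probability target_cov / original_cov.

    Hypothetical helper: the paper's actual subsetting tool is not stated here.
    """
    if not 0 < target_cov <= original_cov:
        raise ValueError("target coverage must be in (0, original_cov]")
    keep_prob = target_cov / original_cov
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return [read for read in reads if rng.random() < keep_prob]

# Toy data standing in for reads sequenced at about 1X coverage.
reads = [f"read_{i}" for i in range(10_000)]
subset = subsample_reads(reads, original_cov=1.0, target_cov=0.3)
print(len(subset))  # close to 3,000 reads, i.e. about 0.3X
```

Sampling each read independently gives a coverage that is only approximately the target; drawing a fixed number of reads with `random.sample` would be an equally reasonable choice.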

62: Transposable elements drive reorganisation of 3D chromatin during early embryogenesis

Posted to bioRxiv 17 Jan 2019

4 tweets genomics

Kai Kruse, Noelia Díaz Blanco, Rocio Enriquez-Gasca, Xavier Gaume, Maria-Elena Torres-Padilla, Juan Manuel Vaquerizas

Transposable elements are abundant genetic components of eukaryotic genomes with important regulatory features affecting transcription, splicing, and recombination, among others. Here we demonstrate that the Murine Endogenous Retroviral Element (MuERV-L/MERVL) family of transposable elements drives the 3D reorganisation of the genome in the early mouse embryo. By generating Hi-C data in 2-cell-like cells, we show that MERVL elements promote the formation of insulating domain boundaries throughout the genome in vivo and in vitro. The formation of these boundaries is coupled to the upregulation of directional transcription from MERVL, which results in the activation of a subset of the gene expression programme of the 2-cell stage embryo. Domain boundaries in the 2-cell stage embryo are transient and can be remodelled without undergoing cell division. Remarkably, we find extensive inter-strain MERVL variation, suggesting multiple non-overlapping rounds of recent genome invasion and a high regulatory plasticity of genome organisation. Our results demonstrate that MERVL drive chromatin organisation during early embryonic development, shedding light on how nuclear organisation emerges during zygotic genome activation in mammals.

63: Genome Graphs

Posted to bioRxiv 18 Jan 2017

4 tweets bioinformatics

Adam M Novak, Glenn Hickey, Erik Garrison, Sean Blum, Abram Connelly, Alexander Dilthey, Jordan Eizenga, M. A. Saleh Elmohamed, Sally Guthrie, André Kahles, Stephen Keenan, Jerome Kelleher, Deniz Kural, Heng Li, Michael F Lin, Karen Miga, Nancy Ouyang, Goran Rakocevic, Maciek Smuga-Otto, Alexander Wait Zaranek, Richard Durbin, Gil McVean, David Haussler, Benedict Paten

There is increasing recognition that a single, monoploid reference genome is a poor universal reference structure for human genetics, because it represents only a tiny fraction of human variation. Adding this missing variation results in a structure that can be described as a mathematical graph: a genome graph. We demonstrate that, in comparison to the existing reference genome (GRCh38), genome graphs can substantially improve the fractions of reads that map uniquely and perfectly. Furthermore, we show that this fundamental simplification of read mapping transforms the variant calling problem from one in which many non-reference variants must be discovered de-novo to one in which the vast majority of variants are simply re-identified within the graph. Using standard benchmarks as well as a novel reference-free evaluation, we show that a simplistic variant calling procedure on a genome graph can already call variants at least as well as, and in many cases better than, a state-of-the-art method on the linear human reference genome. We anticipate that graph-based references will supplant linear references in humans and in other applications where cohorts of sequenced individuals are available.

64: Sox17 expression in endocardium precursor cells regulates heart development in mice

Posted to bioRxiv 14 Feb 2019

4 tweets developmental biology

Rie Saba, Keiko Kitajima, Lucille Rainbow, Sylvia Engert, Mami Uemura, Hidekazu Ishida, Ioannis Kokkinopoulos, Yasunori Shintani, Shigeru Miyagawa, Yoshiakira Kanai, Masami Azuma-Kanai, Peter Koopman, Chikara Meno, John Kenny, Heiko Lickert, Yumiko Saga, Ken Suzuki, Yoshiki Sawa, Kenta Yashiro

The endocardium is the endothelial component of the vertebrate heart and plays a key role in heart development. Cardiac progenitor cells (CPCs) that express the homeobox gene Nkx2-5 give rise to the endocardium. Where, when, and how the endocardium segregates during embryogenesis have remained largely unknown, however. We now show that Nkx2-5+ CPCs that express the Sry-type HMG box gene Sox17 specifically differentiate into the endocardium in mouse embryos. Approximately 20% to 30% of Nkx2-5+ CPCs transiently express Sox17 from embryonic day (E) 7.5 to E8.5. Although Sox17 is not essential or sufficient for endocardium fate, it can bias the fate of CPCs toward the endocardium. On the other hand, Sox17 expression in the endocardium is required for heart development. Deletion of Sox17 specifically in the mesoderm markedly impaired endocardium development with regard to cell proliferation and behavior. The proliferation of cardiomyocytes, ventricular trabeculation, and myocardium thickening were also impaired in a non-cell-autonomous manner in the Sox17 mutant, resulting in anomalous morphology of the heart, likely as a consequence of down-regulation of NOTCH signaling. Changes in gene expression profile in both the endocardium and myocardium preceded the reduction in NOTCH-related gene expression in the mutant embryos, suggesting that Sox17 expression in the endocardium regulates an unknown signal required for nurturing of the myocardium. Our results thus provide insight into differentiation of the endocardium and its role in heart development.

65: Aversive learning strengthens episodic memory in both adolescents and adults

Posted to bioRxiv 12 Feb 2019

3 tweets animal behavior and cognition

Alexandra O. Cohen, Nicholas G. Matese, Anastasia Filimontseva, Xinxu Shen, Tracey C Shi, Ethan Livne, Catherine A Hartley

Adolescence is often filled with positive and negative emotional experiences that may change how individuals remember and respond to stimuli in their environment. In adults, aversive events can both enhance memory for associated stimuli and generalize to enhance memory for unreinforced but conceptually related stimuli. The present study tested whether learned aversive associations similarly lead to better memory and generalization across a category of stimuli in adolescents. Participants completed an olfactory Pavlovian category conditioning task in which trial-unique exemplars from one of two categories were partially reinforced with an aversive odor. Participants then returned 24 hours later to complete a surprise recognition memory test. We found better corrected recognition memory for the reinforced versus the unreinforced category of stimuli in both adults and adolescents. Further analysis revealed that enhanced recognition memory was driven specifically by better memory for the reinforced exemplars. Autonomic arousal during learning was also related to subsequent memory. These findings build on previous work in adolescent and adult humans and rodents showing comparable acquisition of aversive Pavlovian conditioned responses across age groups and demonstrate that memory for stimuli with an acquired aversive association is enhanced in both adults and adolescents.

66: Scalable multiple whole-genome alignment and locally collinear block construction with SibeliaZ

Posted to bioRxiv 13 Feb 2019

3 tweets bioinformatics

Ilia Minkin, Paul Medvedev

Multiple whole-genome alignment is a fundamental and challenging problem in bioinformatics. Despite many ongoing successes, today's methods are not able to keep up with the growing number, length, and complexity of assembled genomes. Approaches based on using compacted de Bruijn graphs to identify and extend anchors into locally collinear blocks hold the potential for scalability, but current algorithms still do not scale to mammalian genomes. We present a novel algorithm SibeliaZ-LCB for identifying collinear blocks in closely related genomes based on the analysis of the de Bruijn graph. We further incorporate it into a multiple whole-genome alignment pipeline called SibeliaZ. SibeliaZ shows drastic run-time improvements over other methods on both simulated and real data, with only a limited decrease in accuracy. On sixteen recently assembled strains of mice, SibeliaZ runs in under 12 hours, while other tools could not run to completion for even eight mice, given a week. SibeliaZ makes a significant step towards improving scalability of multiple whole-genome alignment and collinear block reconstruction algorithms and will enable many comparative genomics studies in the near future.

67: Machine Learning Classification of Attention-Deficit/Hyperactivity Disorder Using Structural MRI Data

Posted to bioRxiv 11 Feb 2019

3 tweets bioinformatics

Yanli Zhang-James, Emily Helminen, Jinru Liu, the ENIGMA-ADHD working group, Barbara Franke, Martine Hoogman, Stephen Faraone

Background: Clinical symptoms-based ADHD diagnosis is considered "subjective". Machine learning (ML) classifiers have been explored to develop objective diagnosis of ADHD using magnetic resonance imaging (MRI) biomarkers. Methods: We reviewed previous literature and developed ensemble classifiers using the ENIGMA-ADHD dataset, with the implementation of data balancing to control for age, sex, diagnostic groups, and sample sites and a held-out test set for independent evaluation. Results: Our review showed that classification accuracies reported previously using cross-validation (CV) samples were inflated and did not generalize well to independent test samples. Our results showed a significant discrimination between ADHD and control samples for both adults and children, but the accuracies were modest (the area under the receiver operating characteristic curve (AUC): 66% and 67% respectively). We found that child samples were informative for predicting adult ADHD, and vice versa. The most important brain MRI structures for prediction were intracranial volume (ICV), followed by surface area and some subcortical volumes. The cortical thickness measurements were the least useful. Conclusions: Although previous ML classification studies reported overly optimistic accuracies and suffered methodological limitations, our results suggest that clinically useful classification of ADHD may be possible with larger samples. In contrast to prior reports of ENIGMA-ADHD studies, our work finds ADHD-related sMRI differences in adults and shows that the brain differences between cases and controls seen in youth can be useful in discriminating adults with and without ADHD. This provides additional evidence for the continuity of ADHD's pathophysiology from childhood to adulthood.

68: Real-time computation of the TMS-induced electric field in a realistic head model

Posted to bioRxiv 12 Feb 2019

3 tweets neuroscience

Matti Stenroos, Lari M Koponen

Background: Transcranial magnetic stimulation (TMS) is often targeted using a model of TMS-induced electric field (E). In such navigated TMS, the E-field models have been based on spherical approximation of the head. Such models omit the effects of cerebrospinal fluid (CSF) on the E-field, leading to potentially large errors in the computed field. So far, realistic models have been too slow for interactive TMS navigation. Objective: We present computational methods that enable real-time solving of the E-field in a realistic head model that contains the CSF. Methods: Using reciprocity and the Geselowitz integral equation, we separate the computations into coil-dependent and coil-independent parts. For the coil-dependent part of the Geselowitz integrals, we present a fast numerical quadrature. Further, we present a moment-matching approach for optimizing dipole-based coil models. We verify the new methods using simulations in a realistic head model that contains the brain, CSF, skull, and scalp. Results: The new quadrature introduces a relative error of 1.1%. The total error of the quadrature and coil model was 1.43% and 1.15% for coils with 38 and 76 dipoles, respectively. The difference between our head model and a simpler realistic model that omits the CSF was 29%. Using a standard PC and a 38-dipole coil, our solver computed the E-field in 84 coil positions per second in 20000 points on the cortex. Conclusion: The presented methods enable real-time solving of the TMS-induced E-field in a realistic head model that contains the CSF. The new methodology allows more accurate targeting and precise adjustment of intensity during experimental or clinical TMS mapping.

69: Indexing De Bruijn graphs with minimizers

Posted to bioRxiv 11 Feb 2019

3 tweets bioinformatics

Camille Marchet, Mael Kerbiriou, Antoine Limasset

Background: The need to associate information with words is shared among a plethora of applications and methods in high throughput sequence analysis, and could be marked as fundamental. A scalability problem is promptly met when indexing billions of k-mers, as exact associative indexes can be memory expensive. To address this challenge, recent works take advantage of the properties of k-mer sets. They exploit the overlaps shared among k-mers by using a De Bruijn graph as a compact k-mer set. Contribution: We propose a scalable and exact index structure able to associate unique identifiers with indexed k-mers and to reject alien k-mers. The proposed structure combines an extremely compact representation along with a high throughput. Moreover, it can be efficiently built from the De Bruijn graph sequences. Using the efficient implementation of the index we provide, the k-mers from the human genome can be indexed with 8 GB within 30 minutes. We manage to index the huge axolotl genome with 63 GB within 10 hours. Furthermore, while being memory efficient, the index allows above a million queries per second on a single CPU in our experiments. This throughput can be raised using multiple cores. Finally, we also present the index's ability to practically represent metagenomic and transcriptomic sequencing data. Availability: The index is implemented as a header-only library in C++, is open source, and is available at https://github.com/Malfoy/Blight. It was designed as a user-friendly library and comes along with sample code usage.
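As a generic illustration of the minimizers named in the title, the sketch below takes the minimizer of a k-mer to be its lexicographically smallest substring of length m and groups overlapping k-mers by that value. This is a textbook definition used for illustration only; it is not the Blight implementation, and the function names `minimizer` and `bucket_kmers` and the toy sequence are hypothetical. Real tools often order m-mers by a hash and work on canonical k-mers, which this sketch does not.

```python
from collections import defaultdict

def minimizer(kmer, m):
    """Return the lexicographically smallest m-mer contained in kmer."""
    if not 0 < m <= len(kmer):
        raise ValueError("m must be between 1 and len(kmer)")
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))

def bucket_kmers(sequence, k, m):
    """Group the k-mers of a sequence by their minimizer (illustrative only)."""
    buckets = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        buckets[minimizer(kmer, m)].append(kmer)
    return dict(buckets)

buckets = bucket_kmers("ACGTACGGTA", k=5, m=3)
for mini, kmers in buckets.items():
    print(mini, kmers)
# ACG ['ACGTA', 'GTACG', 'TACGG', 'ACGGT']
# CGT ['CGTAC']
# CGG ['CGGTA']
```

Overlapping k-mers frequently share a minimizer, so most of them land in the same bucket. That is the property that lets a minimizer-based index partition a large k-mer set into smaller groups.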

70: Selene: a PyTorch-based deep learning library for biological sequence-level data

Posted to bioRxiv 10 Oct 2018

3 tweets bioinformatics

Kathleen M Chen, Evan M. Cofer, Jian Zhou, Olga G Troyanskaya

To enable the application of deep learning in biology, we present Selene (https://selene.flatironinstitute.org/), a PyTorch-based deep learning library for fast and easy development, training, and application of deep learning model architectures for any biological sequences. We demonstrate how Selene allows researchers to easily train a published architecture on new data, develop and evaluate a new architecture, and use a trained model to answer biological questions of interest.

71: MuStARD: Deep Learning for intra- and inter-species scanning of functional genomic patterns

Posted to bioRxiv 13 Feb 2019

3 tweets bioinformatics

Georgios K Georgakilas, Andrea Grioni, Konstantinos G Liakos, Eliska Malanikova, Fotis C Plessas, Panagiotis Alexiou

Regions of the genome that produce different classes of functional elements also exhibit different patterns in their sequence, secondary structure, and evolutionary conservation. Deep Learning is a family of Machine Learning algorithms recently applied to a variety of pattern recognition problems. Here we present MuStARD (gitlab.com/RBP_Bioinformatics/mustard), a Deep Learning framework that can learn and combine sequence, structure, and conservation patterns in sets of functional regions, and accurately identify additional members of the given set over wide genomic areas. MuStARD is designed with general use in mind, and has sophisticated iterative fully-automated background selection capability. We demonstrate that MuStARD can be trained without changes on different classes of human small RNA loci (pre-microRNAs and snoRNAs) and accurately build prediction models for both, outperforming state-of-the-art methods specifically designed for each class. Furthermore, we demonstrate the ability of MuStARD for inter-species identification of functional elements by predicting mouse small RNAs using human-trained models. MuStARD is easy to deploy and extend to a variety of genomic classification questions.

72: Abnormalities in Proinsulin Processing in Islets from Individuals with Longstanding T1D

Posted to bioRxiv 08 Feb 2019

3 tweets pathology

Emily K. Sims, Julius Nyalwidhe, Farooq Syed, Henry T. Bahnson, Leena Haataja, Cate Speake, Margaret Morris, Raghavendra G Mirmira, Jerry Nadler, Teresa L. Mastracci, Peter Arvan, Carla J. Greenbaum, Carmella Evans-Molina

Work by our group and others has suggested that elevations in circulating proinsulin relative to C-peptide are associated with development of Type 1 diabetes (T1D). We recently described the persistence of detectable serum proinsulin in a large majority (95.9%) of individuals with longstanding T1D, including individuals with undetectable serum C-peptide. Here we describe analyses performed on human pancreatic sections from the nPOD collection (n=30) and isolated human islets (n=10) to further explore mechanistic etiologies of persistent proinsulin secretion in T1D. Compared to nondiabetic controls, immunostaining among a subset (4/9) of insulin positive T1D donor islets revealed increased numbers of cells with proinsulin-enriched, insulin-poor staining. Laser capture microdissection followed by mass spectrometry revealed reductions in the proinsulin processing enzymes prohormone convertase 1/3 (PC1/3) and carboxypeptidase E (CPE) in T1D donors. Twenty-four hour treatment of human islets with an inflammatory cytokine cocktail reduced mRNA expression of the processing enzymes PC1/3, PC2, and CPE. Taken together, these data provide new mechanistic insight into altered proinsulin processing in long-duration T1D and suggest that reduced β cell prohormone processing is associated with proinflammatory cytokine-induced reductions in proinsulin processing enzyme expression.

73: A genetically-encoded toolkit of functionalized nanobodies against fluorescent proteins for visualizing and manipulating intracellular signalling

Posted to bioRxiv 08 Feb 2019

3 tweets cell biology

David L. Prole, Colin W Taylor

Background: Intrabodies enable targeting of proteins in live cells, but it remains a huge task to generate specific intrabodies against the thousands of proteins in a proteome. We leverage the widespread availability of fluorescently labelled proteins to visualize and manipulate intracellular signalling pathways in live cells by using nanobodies targeting fluorescent protein tags. Results: We generated a toolkit of plasmids encoding nanobodies against red and green fluorescent proteins (RFP and GFP variants), fused to functional modules. These include fluorescent sensors for visualization of Ca2+, H+ and ATP/ADP dynamics; oligomerizing or heterodimerizing modules that allow recruitment or sequestration of proteins and identification of membrane contact sites between organelles; SNAP tags that allow labelling with fluorescent dyes and targeted chromophore-assisted light inactivation; and nanobodies targeted to lumenal sub-compartments of the secretory pathway. We also developed two methods for crosslinking tagged proteins: a dimeric nanobody, and RFP-targeting and GFP-targeting nanobodies fused to complementary hetero-dimerizing domains. We show various applications of the toolkit and demonstrate, for example, that IP3 receptors deliver Ca2+ to the outer membrane of only a subset of mitochondria, and that only one or two sites on a mitochondrion form membrane contacts with the plasma membrane. Conclusions: This toolkit greatly expands the utility of intrabodies for studying cell signalling in live cells.

74: Linear Input as a Fundamental Motif of Brain Connectivity Organization

Posted to bioRxiv 14 Feb 2019

3 tweets neuroscience

Jonathan F O'Rawe, Hoi-Chung Leung

Describing the pattern of region-to-region functional connectivity is an important step towards understanding information transfer and transformation between brain regions. Although fMRI data are limited in spatial resolution, recent advances in technology afford more precise mapping. Here, we extended previous methods, connective field mapping, to 3 dimensions to provide a more concise estimate of the organization and potential information transformation from one region to another. We first replicated previous work with the 3 dimensional model by showing that the topology of functional connectivity between early visual regions is maintained along their eccentricity axis or the anterior-posterior dimension. We then examined higher order visual regions (e.g., fusiform face area) and showed that their pattern of connectivity, the convergence and biased sampling, seem to contribute to some of their core receptive field properties. We further demonstrated that linearity of input is a fundamental aspect of functional connectivity of the whole brain, with higher linearity between regions within a network than across networks; that is, high connective linearity was evident between early visual areas, and between prefrontal areas, but less evident between them. By decomposing the whole brain linearity matrix with manifold learning techniques, we found that the principal mode of the linearity maps onto decompositions in both functional connectivity and genetic expression reported in previous studies. The current work provides evidence supporting that linearity of input is likely a fundamental motif of functional connectivity between regions for information processing across the brain, with high linearity preserving the integrity of information from one region to another within a network.

75: Efficacy of PCV vaccine is primarily mediated by controlling pneumococcal colonisation density

Posted to bioRxiv 14 Feb 2019

3 tweets microbiology

Esther L German, Carla Solorzano, Syba Sunny, Felicity Dunne, Jenna F Gritzfeld, Elena Mitsi, Elissavet Nikolaou, Angela D Hyder-Wright, Andrea M Collins, Stephen B Gordon, Daniela Ferreira

Widespread use of Pneumococcal Conjugate Vaccines (PCV) has resulted in a reduction in nasopharyngeal colonisation and invasive pneumococcal disease caused by vaccine-types. In a double-blind, randomised controlled trial using the Experimental Human Pneumococcal Challenge (EHPC) model, PCV-13 (Prevenar-13) conferred 78% protection against colonisation acquisition and a reduction in bacterial intensity (AUC) in experimentally colonised volunteers as measured by classical culture. In this study, we used a multiplex quantitative PCR assay targeting lytA and pneumococcal serotype 6A/B cpsA genes to re-assess the experimental colonisation status of the same trial volunteers. Increase in detection of low-density colonised volunteers by this molecular method led to a decrease of PCV efficacy against colonisation acquisition (29%), as compared to classical culture (83%). For subjects who were colonised following pneumococcal challenge, PCV had a pronounced effect on decreasing colonisation density. These results have implications for vaccine efficacy and surveillance studies as they indicate that the success of PCV vaccination could primarily be mediated by the control of vaccine-type colonisation density which results in decreased transmission and the reported herd effect of PCVs. Studies assessing the impact of PCV should account for density measurements in their design.

76: Analysis of genetically driven alternative splicing identifies FBXO38 as a novel COPD susceptibility gene

Posted to bioRxiv 14 Feb 2019

3 tweets genetics

Aabida Saferali, Jeong H. Yun, Margaret M Parker, Phuwanat Sakornsakolpat, Robert P Chase, Andrew Lamb, Brian D. Hobbs, Marike H Boezen, Xiangpeng Dai, Kim de Jong, Terri H Beaty, Wenyi Wei, Xiaobo Zhou, Edwin K Silverman, Michael H Cho, Peter J Castaldi, Craig P Hersh, COPDGene Investigators, and the International COPD Genetics Consortium Investigators

While many disease-associated single nucleotide polymorphisms (SNPs) are associated with gene expression (expression quantitative trait loci, eQTLs), a large proportion of complex disease genome-wide association study (GWAS) variants are of unknown function. Some of these SNPs may contribute to disease by regulating gene splicing. Here, we investigate whether SNPs that are associated with alternative splicing (splice QTL or sQTL) can identify novel functions for existing GWAS variants or suggest new associated variants in chronic obstructive pulmonary disease (COPD). RNA sequencing was performed on whole blood from 376 subjects from the COPDGene Study. Using linear models, we identified 561,060 unique sQTL SNPs associated with 30,333 splice sites corresponding to 6,419 unique genes. Similarly, 708,928 unique eQTL SNPs involving 15,913 genes were detected at 10% FDR. While there is overlap between sQTLs and eQTLs, 60% of sQTLs are not eQTLs. Co-localization analysis revealed that 7 out of 21 loci associated with COPD (p < 1 × 10^-6) in a published GWAS have at least one shared causal variant between the GWAS and sQTL studies. Among the genes identified to have splice sites associated with top GWAS SNPs was FBXO38, in which a novel exon was discovered to be protective against COPD. Importantly, the sQTL in this locus was validated by qPCR in both blood and lung tissue, demonstrating that splice variants relevant to lung tissue can be identified in blood. Other identified genes included CDK11A and SULT1A2. Overall, these data indicate that analysis of alternative splicing can provide novel insights into disease mechanisms. In particular, we demonstrated that SNPs in a known COPD GWAS locus on chromosome 5q32 influence alternative splicing in the gene FBXO38.

77: Patterns of African and Asian admixture in the Afrikaner population of South Africa

Posted to bioRxiv 07 Feb 2019

3 tweets evolutionary biology

Nina Hollfelder, Johannes Christoffel Erasmus, Rickard Hammaren, Mario Vicente, Mattias Jakobsson, Jaco M Greeff, Carina M Schlebusch

The Afrikaner population of South Africa are the descendants of European colonists who started to colonize the Cape of Good Hope in the 1600s. In the early days of the colony, mixed unions between European males and non-European females gave rise to admixed children who later became incorporated into either the Afrikaner or the "Coloured" populations of South Africa. Ancestry, social class, culture, sex ratio and geographic structure affected admixture patterns and caused different ancestry and admixture patterns in Afrikaner and Coloured populations. The Afrikaner population has a predominant European composition, whereas the Coloured population has more diverse ancestries. Genealogical records estimated the non-European contributions into the Afrikaners at 5.5%-7.2%. To investigate the genetic ancestry of the Afrikaner population today (11-13 generations after initial colonization) we genotyped ~5 million genome-wide markers in 77 Afrikaner individuals and compared their genotypes to populations across the world to determine parental source populations and admixture proportions. We found that the majority of Afrikaner ancestry (average 95.3%) came from European populations (specifically northwestern European populations), but that almost all Afrikaners had admixture from non-Europeans. The non-European admixture originated mostly from people who were brought to South Africa as slaves and, to a lesser extent, from local Khoe-San groups. Furthermore, despite a potentially small founding population, there is no sign of a recent bottleneck in the Afrikaner compared to other European populations. Admixture among diverse groups during early colonial times might have counterbalanced the effects of a founding population with a small census size.

78: Virtual ChIP-seq: Predicting transcription factor binding by learning from the transcriptome

Posted to bioRxiv 28 Feb 2018

3 tweets bioinformatics

Mehran Karimzadeh, Michael M. Hoffman

MOTIVATION: Identifying transcription factor binding sites is the first step in pinpointing non-coding mutations that disrupt the regulatory function of transcription factors and promote disease. ChIP-seq is the most common method for identifying binding sites, but performing it on patient samples is hampered by the amount of available biological material and the cost of the experiment. Existing methods for computational prediction of regulatory elements primarily predict binding in genomic regions with sequence similarity to known transcription factor sequence preferences. This has limited efficacy since most binding sites do not resemble known transcription factor sequence motifs, and many transcription factors are not even sequence-specific. RESULTS: We developed Virtual ChIP-seq, which predicts binding of individual transcription factors in new cell types using an artificial neural network that integrates ChIP-seq results from other cell types and chromatin accessibility data in the new cell type. Virtual ChIP-seq also uses learned associations between gene expression and transcription factor binding at specific genomic regions. This approach outperforms methods that predict TF binding solely based on sequence preference, predicting binding for 36 transcription factors (Matthews correlation coefficient > 0.3). AVAILABILITY: The datasets we used for training and validation are available at https://virchip.hoffmanlab.org. We have deposited in Zenodo the current version of our software (http://doi.org/10.5281/zenodo.1066928), datasets (http://doi.org/10.5281/zenodo.823297), predictions for 36 transcription factors on Roadmap Epigenomics cell types (http://doi.org/10.5281/zenodo.1455759), and predictions in Cistrome as well as ENCODE-DREAM in vivo TF Binding Site Prediction Challenge (http://doi.org/10.5281/zenodo.1209308).
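The Matthews correlation coefficient used as the performance threshold in this abstract can be computed from confusion-matrix counts with the standard formula. The sketch below is a plain illustration of that formula, not the authors' evaluation code; the function name and the example counts are made up.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC from confusion-matrix counts; returns 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

# Made-up counts for a rare positive class, as is typical for binding sites.
mcc = matthews_corrcoef(tp=40, tn=900, fp=30, fn=30)
print(round(mcc, 3))  # 0.539
```

MCC ranges from -1 to 1 and stays informative when positives are rare, a setting where plain accuracy can look high for a predictor that labels everything negative.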

79: Species abundance information improves sequence taxonomy classification accuracy
Posted to bioRxiv 03 Sep 2018

Species abundance information improves sequence taxonomy classification accuracy
3 tweets bioinformatics

Benjamin D Kaehler, Nicholas Bokulich, Daniel McDonald, Rob Knight, J Gregory Caporaso, Gavin Austin Huttley

Popular naive Bayes taxonomic classifiers for amplicon sequences assume that all species in the reference database are equally likely to be observed. We demonstrate that classification accuracy degrades linearly with the degree to which that assumption is violated, and in practice it is always violated. By incorporating environment-specific taxonomic abundance information, we demonstrate that species-level resolution is attainable.
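
The effect of the uniform-prior assumption can be illustrated with a toy naive Bayes classifier. This is a sketch under stated assumptions, not the authors' classifier: the taxa, k-mer likelihoods, query, and abundance weights are all invented for illustration.

```python
import math


def classify(kmers, likelihoods, priors):
    """Return the taxon with the highest log prior plus summed log k-mer likelihoods."""
    scores = {}
    for taxon, probs in likelihoods.items():
        log_likelihood = sum(math.log(probs[k]) for k in kmers)
        scores[taxon] = math.log(priors[taxon]) + log_likelihood
    return max(scores, key=scores.get)


# Hypothetical per-taxon k-mer probabilities (each row sums to 1).
likelihoods = {
    "species_A": {"ACG": 0.5, "CGT": 0.3, "GTA": 0.2},
    "species_B": {"ACG": 0.4, "CGT": 0.3, "GTA": 0.3},
}
query = ["ACG", "ACG", "CGT"]

# Uniform prior: every taxon in the reference is assumed equally likely.
uniform = {taxon: 1.0 / len(likelihoods) for taxon in likelihoods}
# Hypothetical environment-specific abundances used as class weights.
abundance = {"species_A": 0.1, "species_B": 0.9}

print(classify(query, likelihoods, uniform))
print(classify(query, likelihoods, abundance))
```

With the uniform prior the slightly better sequence match wins; with the abundance-weighted prior the far more common taxon wins, which is the kind of shift the abstract attributes to environment-specific abundance information.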

80: Optogenetic control reveals differential promoter interpretation of transcription factor nuclear translocation dynamics
Posted to bioRxiv 13 Feb 2019

Optogenetic control reveals differential promoter interpretation of transcription factor nuclear translocation dynamics
3 tweets systems biology

Susan Y Chen, Lindsey C Osimiri, Michael W Chevalier, Lukasz J Bugaj, Andrew H Ng, Jacob Stewart-Ornstein, Lauren T Neves, Hana El-Samad

The dynamic translocation of transcription factors (TFs) in and out of the nucleus is thought to encode information, such as the identity of a stimulus. A corollary is the idea that gene promoters can decode different dynamic TF translocation patterns. Testing this TF encoding/promoter decoding hypothesis requires tools that allow direct control of TF dynamics without the pleiotropic effects associated with general perturbations. In this work, we present CLASP (Controllable Light Activated Shuttling and Plasma membrane sequestration), a tool that enables precise, modular, and reversible control of TF localization using a combination of two optimized LOV2 optogenetic constructs. The first sequesters the cargo in the dark at the plasma membrane and releases it upon exposure to blue light, while light exposure of the second reveals a nuclear localization sequence that shuttles the released cargo to the nucleus. CLASP achieves minute-level resolution, reversible translocation of many TF cargos, large dynamic range, and tunable target gene expression. Using CLASP, we investigate the relationship between Crz1, a naturally pulsatile TF, and its cognate promoters. We establish that some Crz1 target genes respond more efficiently to pulsatile TF inputs than to continuous inputs, while others exhibit the opposite behavior. We show using computational modeling that efficient gene expression in response to short pulsing requires fast promoter activation and slow inactivation and that the opposite phenotype can ensue from a multi-stage promoter activation, where a transition in the first stage is thresholded. These data directly demonstrate differential interpretation of TF pulsing dynamics by different genes, and provide plausible models that can achieve these phenotypes.
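
The claim that efficient response to short pulses requires fast promoter activation and slow inactivation can be sketched with a two-state promoter model. This is an illustrative toy, not the authors' model: the equation dP/dt = k_on·u·(1 − P) − k_off·P, the rate constants, and the pulse schedule are all assumptions chosen for the example.

```python
def promoter_output(light, k_on, k_off, dt=0.01):
    """Forward-Euler integration of dP/dt = k_on*u*(1-P) - k_off*P.

    Returns the time integral of promoter activity P (a proxy for expression).
    """
    p = 0.0
    total = 0.0
    for u in light:
        p += dt * (k_on * u * (1.0 - p) - k_off * p)
        total += p * dt
    return total


def light_schedule(on_steps, off_steps, n_pulses, total_steps):
    """Build a 0/1 input of fixed length made of n_pulses on/off cycles."""
    schedule = ([1.0] * on_steps + [0.0] * off_steps) * n_pulses
    return schedule + [0.0] * (total_steps - len(schedule))


# Both inputs have the same cumulative on-time (4000 steps = 40 time units).
continuous = light_schedule(4000, 0, 1, 25000)
pulsed = light_schedule(200, 800, 20, 25000)

# Hypothetical promoter with fast activation and slow inactivation.
fast_continuous = promoter_output(continuous, k_on=5.0, k_off=0.1)
fast_pulsed = promoter_output(pulsed, k_on=5.0, k_off=0.1)

# Hypothetical promoter with slow activation and fast inactivation.
slow_continuous = promoter_output(continuous, k_on=0.1, k_off=5.0)
slow_pulsed = promoter_output(pulsed, k_on=0.1, k_off=5.0)
slow_ratio = slow_pulsed / slow_continuous

print(f"fast-on/slow-off: pulsed/continuous = {fast_pulsed / fast_continuous:.2f}")
print(f"slow-on/fast-off: pulsed/continuous = {slow_ratio:.2f}")
```

In this sketch the fast-on/slow-off promoter stays active between pulses, so pulsing yields more output for the same cumulative input, while the fast-off promoter gains almost nothing. The simple two-state model does not reproduce the opposite phenotype (a preference for continuous input), which the abstract attributes to multi-stage, thresholded activation.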
