Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 85,245 bioRxiv papers from 366,748 authors.
Most downloaded bioRxiv papers, all time
in category molecular biology
2,797 results found.
33,646 downloads molecular biology
The emergence of a novel, highly pathogenic coronavirus, 2019-nCoV, in China, and its rapid national and international spread pose a global health emergency. Coronaviruses use their spike proteins to select and enter target cells and insights into nCoV-2019 spike (S)-driven entry might facilitate assessment of pandemic potential and reveal therapeutic targets. Here, we demonstrate that 2019-nCoV-S uses the SARS-coronavirus receptor, ACE2, for entry and the cellular protease TMPRSS2 for 2019-nCoV-S priming. A TMPRSS2 inhibitor blocked entry and might constitute a treatment option. Finally, we show that the serum from a convalescent SARS patient neutralized 2019-nCoV-S-driven entry. Our results reveal important commonalities between 2019-nCoV and SARS-coronavirus infection, which might translate into similar transmissibility and disease pathogenesis. Moreover, they identify a target for antiviral intervention.
25,424 downloads molecular biology
Forward genetic screens are powerful tools for the unbiased discovery and functional characterization of specific genetic elements associated with a phenotype of interest. Recently, the RNA-guided endonuclease Cas9 from the microbial immune system CRISPR (clustered regularly interspaced short palindromic repeats) has been adapted for genome-scale screening by combining Cas9 with guide RNA libraries. Here we describe a protocol for genome-scale knockout and transcriptional activation screening using the CRISPR-Cas9 system. Custom- or ready-made guide RNA libraries are constructed and packaged into lentivirus for delivery into cells for screening. As each screen is unique, we provide guidelines for determining screening parameters and maintaining sufficient coverage. To validate candidate genes identified from the screen, we further describe strategies for confirming the screening phenotype as well as genetic perturbation through analysis of indel rate and transcriptional activation. Beginning with library design, a genome-scale screen can be completed in 6-10 weeks followed by 3-4 weeks of validation.
22,677 downloads molecular biology
Rahul Sinha, Geoff Stanley, Gunsagar S. Gulati, Camille Ezran, Kyle J. Travaglini, Eric Wei, Charles K.F. Chan, Ahmad N. Nabhan, Tianying Su, Rachel M. Morganti, Stephanie D Conley, Hassan Chaib, Kristy Red-Horse, Michael T Longaker, Michael P. Snyder, Mark A Krasnow, Irving L. Weissman
Illumina-based next generation sequencing (NGS) has accelerated biomedical discovery through its ability to generate thousands of gigabases of sequencing output per run at a fraction of the time and cost of conventional technologies. The process typically involves four basic steps: library preparation, cluster generation, sequencing, and data analysis. In 2015, a new chemistry of cluster generation was introduced in the newer Illumina machines (HiSeq 3000/4000/X Ten) called exclusion amplification (ExAmp), which was a fundamental shift from the earlier method of random cluster generation by bridge amplification on a non-patterned flow cell. The ExAmp chemistry, in conjunction with patterned flow cells containing nanowells at fixed locations, increases cluster density on the flow cell, thereby reducing the cost per run. It also increases sequence read quality, especially for longer read lengths (up to 150 base pairs). This advance has been widely adopted for genome sequencing because greater sequencing depth can be achieved for lower cost without compromising the quality of longer reads. We show that this promising chemistry is problematic, however, when multiplexing samples. We discovered that up to 5-10% of sequencing reads (or signals) are incorrectly assigned from a given sample to other samples in a multiplexed pool. We provide evidence that this “spreading-of-signals” arises from low levels of free index primers present in the pool. These index primers can prime pooled library fragments at random via complementary 3′ ends, and get extended by DNA polymerase, creating a new library molecule with a new index before binding to the patterned flow cell to generate a cluster for sequencing. This causes the resulting read from that cluster to be assigned to a different sample, causing the spread of signals within multiplexed samples. 
We show that low levels of free index primers persist after the most common library purification procedure recommended by Illumina, and that the amount of signal spreading among samples is proportional to the level of free index primer present in the library pool. This artifact causes homogenization and misclassification of cells in single cell RNA-seq experiments. Therefore, all data generated in this way must now be carefully re-examined to ensure that “spreading-of-signals” has not compromised data analysis and conclusions. Re-sequencing samples using an older technology that uses conventional bridge amplification for cluster generation, or improved library cleanup strategies to remove free index primers, can minimize or eliminate this signal spreading artifact.
13,508 downloads molecular biology
Yuancheng Lu, Anitha Krishnan, Benedikt Brommer, Xiao Tian, Margarita Meer, Daniel L. Vera, Chen Wang, Qiurui Zeng, Doudou Yu, Michael S. Bonkowski, Jae-Hyun Yang, Emma M. Hoffmann, Songlin Zhou, Ekaterina Korobkina, Noah Davidsohn, Michael B. Schultz, Karolina Chwalek, Luis A. Rajman, George M. Church, Konrad Hochedlinger, Vadim N. Gladyshev, Steve Horvath, Meredith S. Gregory-Ksander, Bruce R. Ksander, Zhigang He, David A. Sinclair
Ageing is a degenerative process leading to tissue dysfunction and death. A proposed cause of ageing is the accumulation of epigenetic noise, which disrupts youthful gene expression patterns that are required for cells to function optimally and recover from damage. Changes to DNA methylation patterns over time form the basis of an 'ageing clock', but whether old individuals retain information to reset the clock and, if so, whether this would improve tissue function is not known. Of all the tissues in the body, the central nervous system (CNS) is one of the first to lose regenerative capacity. Using the eye as a model tissue, we show that expression of Oct4, Sox2, and Klf4 genes (OSK) in mice resets youthful gene expression patterns and the DNA methylation age of retinal ganglion cells, promotes axon regeneration after optic nerve crush injury, and restores vision in a mouse model of glaucoma and in normal old mice. This process, which we call recovery of information via epigenetic reprogramming or REVIVER, requires the DNA demethylases Tet1 and Tet2, indicating that DNA methylation patterns don't just indicate age, they participate in ageing. Thus, old tissues retain a faithful record of youthful epigenetic information that can be accessed for functional age reversal.
11,558 downloads molecular biology
Glycans modify lipids and proteins to mediate inter- and intramolecular interactions across all domains of life. RNA, another multifaceted biopolymer, is not thought to be a major target of glycosylation. Here, we challenge this view with evidence that mammalian cells use RNA as a third scaffold for glycosylation in the secretory pathway. Using a battery of chemical and biochemical approaches, we find that a select group of small noncoding RNAs including Y RNAs are modified with complex, sialylated N-glycans (glycoRNAs). These glycoRNA are present in multiple cell types and mammalian species, both in cultured cells and in vivo. Finally, we find that RNA glycosylation depends on the canonical N-glycan biosynthetic machinery within the ER/Golgi luminal spaces. Collectively, these findings suggest the existence of a ubiquitous interface of RNA biology and glycobiology suggesting an expanded role for glycosylation beyond canonical lipid and protein scaffolds.
11,453 downloads molecular biology
The ongoing COVID-19 pandemic has already caused devastating losses. Early evidence shows that the exponential spread of COVID-19 can be slowed by restrictive isolation measures, but these place a tremendous burden on society. Moreover, once these restrictions are lifted, the exponential spread is likely to re-emerge. It has been suggested that population-scale testing can help break the cycle of isolation and spread, but current detection methods are not capable of such large-scale processing. Here we propose LAMP-Seq, a barcoded Reverse-Transcription Loop-mediated Isothermal Amplification (RT-LAMP) protocol that could dramatically reduce the cost and complexity of population-scale testing. In this approach, individual samples are processed in a single heat step, producing barcoded amplicons that can be shipped to a sequencing center, pooled, and analyzed en masse. Using unique barcode combinations per sample from a compressed barcode space enables extensive pooling, significantly reducing cost and organizational efforts. Given the low cost and scalability of next-generation sequencing, we believe that this method can be affordably scaled to analyze millions of samples per day using existing sequencing infrastructure.
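The pooling idea in the abstract above, where each sample receives a unique combination of barcodes drawn from a small barcode pool, can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function name `assign_barcode_sets`, the pool size, and the number of barcodes per sample are hypothetical and are not taken from the LAMP-Seq protocol itself.

```python
from itertools import combinations
from math import comb


def assign_barcode_sets(sample_ids, pool_size, per_sample):
    """Give each sample a distinct combination of barcode indices.

    Hypothetical illustration: barcodes are represented by the integers
    0..pool_size-1, and every sample gets `per_sample` of them.
    """
    # Number of distinct combinations the pool can provide.
    capacity = comb(pool_size, per_sample)
    if len(sample_ids) > capacity:
        raise ValueError(
            f"{len(sample_ids)} samples exceed the {capacity} available combinations"
        )
    # combinations() yields each unordered set exactly once, so no two
    # samples can share the same barcode combination.
    combos = combinations(range(pool_size), per_sample)
    return dict(zip(sample_ids, combos))


if __name__ == "__main__":
    assignment = assign_barcode_sets(["s1", "s2", "s3"], pool_size=4, per_sample=2)
    for sample, barcodes in assignment.items():
        print(sample, barcodes)
```

The point of the sketch is the counting argument: a pool of `m` barcodes used `k` at a time distinguishes up to `comb(m, k)` samples, which grows much faster than `m` itself.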
10,581 downloads molecular biology
CRISPR/Cas technologies have transformed our ability to manipulate genomes for research and gene-based therapy. In particular, homology-directed repair after genomic cleavage allows for precise modification of genes using exogenous donor sequences as templates. While both single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) forms of donors have been used as repair templates, a systematic comparison of the performance and specificity of repair using ssDNA versus dsDNA donors is still lacking. Here, we describe an optimized method for the synthesis of long ssDNA templates and demonstrate that ssDNA donors can drive efficient integration of gene-sized reporters in human cell lines. We next define a set of rules to maximize the efficiency of ssDNA-mediated knock-in by optimizing donor design. Finally, by comparing ssDNA donors with equivalent dsDNA sequences (PCR products or plasmids), we demonstrate that ssDNA templates have a unique advantage in terms of repair specificity while dsDNA donors can lead to a high rate of off-target integration. Our results provide a framework for designing high-fidelity CRISPR-based knock-in experiments, in both research and therapeutic settings.
Update: November 12th, 2019
Dear bioRxiv community,
The conclusions of this pre-print (originally posted in August 2017) are outdated. While the experiments we present here are accurate, a recent and more systematic analysis revealed that the integration outcomes driven by different forms of HDR donors are more complex than our methods could originally identify. We initially analyzed donor integration only in FACS-selected cells, which under-estimates alleles where the mis-integration of payload leads to non-functional selection markers, and we quantified integration by ddPCR, which is an indirect read-out of sequence properties. These approaches could not capture the full details of donor integration events in our experiments.
To address this, we have now developed a new framework based on long-read amplicon sequencing and an integrated computational pipeline to precisely analyze knock-in repair outcomes across a wide range of experimental parameters. Our new data uncover a complex repair landscape in which both single-stranded and double-stranded donors can lead to high rates of imprecise integration in some cell types. Please read our new bioRxiv pre-print entitled “Deep profiling reveals substantial heterogeneity of integration outcomes in CRISPR knock-in experiments” for further information. I hope that this example highlights one of the powers of pre-prints: the ability to update scientific discussions (and set records straight) as new results are obtained, often fueled by the availability of new technologies. Please do not hesitate to contact me directly for any questions or comments.
10,528 downloads molecular biology
Daniel J Butler, Christopher Mozsary, Cem Meydan, David Danko, Jonathan Foox, Joel Rosiene, Alon Shaiber, Ebrahim Afshinnekoo, Matthew MacKay, Fritz J. Sedlazeck, Nikolay A Ivanov, Maria Sierra, Diana Pohle, Michael Zietz, Undina Gisladottir, Vijendra Ramlall, Craig D Westover, Krista Ryon, Benjamin Young, Chandrima Bhattacharya, Phyllis Ruggiero, Bradley W. Langhorst, Nathan Tanner, Justyna Gawrys, Dmitry Meleshko, Dong Xu, Peter A D Steel, Amos J Shemesh, Jenny Xiang, Jean Thierry-Mieg, Danielle Thierry-Mieg, Robert E. Schwartz, Angelika Iftner, Daniela Bezdan, John Sipley, Lin Cong, Arryn Craney, Priya Velu, Ari M. Melnick, Iman Hajirasouliha, Stacy M. Horner, Thomas Iftner, Mirella Salvatore, Massimo Loda, Lars F Westblade, Melissa Cushing, Shawn Levy, Shixiu Wu, Nicholas P. Tatonetti, Marcin Imielinski, Hanna Rennert, Christopher E. Mason
The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has caused thousands of deaths worldwide, including >18,000 in New York City (NYC) alone. The sudden emergence of this pandemic has highlighted a pressing clinical need for rapid, scalable diagnostics that can detect infection, interrogate strain evolution, and identify novel patient biomarkers. To address these challenges, we designed a fast (30-minute) colorimetric test (LAMP) for SARS-CoV-2 infection from naso/oropharyngeal swabs, plus a large-scale shotgun metatranscriptomics platform (total-RNA-seq) for host, bacterial, and viral profiling. We applied both technologies across 857 SARS-CoV-2 clinical specimens and 86 NYC subway samples, providing a broad molecular portrait of the COVID-19 NYC outbreak. Our results define new features of SARS-CoV-2 evolution, nominate a novel, NYC-enriched viral subclade, reveal specific host responses in ACE, interferon, hematological, and olfaction pathways, and examine risks associated with use of ACE inhibitors and angiotensin receptor blockers. Together, these findings have immediate applications to SARS-CoV-2 diagnostics, public health monitoring, and new therapeutic targets. Competing Interest Statement: N.T. and B.L. are employees at New England Biolabs.
10,062 downloads molecular biology
We previously described a novel alternative to Chromatin Immunoprecipitation, Cleavage Under Targets & Release Using Nuclease (CUT&RUN), in which unfixed permeabilized cells are incubated with antibody, followed by binding of a Protein A-Micrococcal Nuclease (pA/MNase) fusion protein. Upon activation of tethered MNase, the bound complex is excised and released into the supernatant for DNA extraction and sequencing. Here we introduce four enhancements to CUT&RUN: 1) a hybrid Protein A-Protein G-MNase construct that expands antibody compatibility and simplifies purification; 2) a modified digestion protocol that inhibits premature release of the nuclease-bound complex; 3) a calibration strategy based on carry-over of E. coli DNA introduced with the fusion protein; and 4) a novel peak-calling strategy customized for the low-background profiles obtained using CUT&RUN. These new features, coupled with the previously described low-cost, high efficiency, high reproducibility and high-throughput capability of CUT&RUN, make it the method of choice for routine epigenomic profiling.
7,911 downloads molecular biology
RNAs have important and diverse functions. Visualizing an isolated RNA in living cells provides essential information about its roles. To date, two kinds of live RNA imaging systems have been developed: the MS2 system and the Cas13a system. In this study, we show that, when fused with split-Fp, CasE can be engineered into a live RNA tracking tool.
7,538 downloads molecular biology
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has received global attention due to the recent outbreak in China. In this work, we report a CRISPR-Cas12 based diagnostic tool to detect synthetic SARS-CoV-2 RNA sequences in a proof-of-principle evaluation. The test proved to be sensitive, rapid, and potentially portable. These key traits of the CRISPR method are critical for virus detection in regions that lack resources to use the currently available methods.
7,486 downloads molecular biology
High-throughput amplicon sequencing of large genomic regions remains challenging for short-read technologies. Here, we report a high-throughput amplicon sequencing approach combining unique molecular identifiers (UMIs) with Oxford Nanopore Technologies or Pacific Biosciences CCS sequencing, yielding high accuracy single-molecule consensus sequences of large genomic regions. Our approach generates amplicon and genomic sequences of >10,000 bp in length with a mean error-rate of 0.0049-0.0006% and chimera rate <0.022%.
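The general principle behind UMI-based consensus sequences, grouping reads that carry the same molecular identifier and letting them out-vote individual sequencing errors, can be sketched as below. This is a deliberately simplified illustration and not the pipeline from the preprint: it assumes that reads sharing a UMI are already aligned and of equal length, and the function name `umi_consensus` and the `min_reads` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_reads=3):
    """Build one consensus sequence per UMI by per-position majority vote.

    `reads` is an iterable of (umi, sequence) pairs. Sequences that share a
    UMI are assumed (for this sketch) to be aligned and of equal length.
    UMIs supported by fewer than `min_reads` reads are skipped.
    """
    # Collect all reads that carry the same UMI.
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too little evidence to out-vote errors
        # zip(*seqs) walks the reads column by column; the most common
        # base in each column becomes the consensus base.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


if __name__ == "__main__":
    example = [
        ("AAT", "ACGT"),
        ("AAT", "ACGT"),
        ("AAT", "ACCT"),  # one read with an error at the third base
        ("GGC", "TTTT"),  # singleton UMI, dropped by min_reads
    ]
    print(umi_consensus(example))
```

In the example, the single erroneous read in the `AAT` group is out-voted by the two agreeing reads, while the `GGC` group is dropped because it has too few reads to form a consensus.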
6,946 downloads molecular biology
Christopher D Go, James D.R. Knight, Archita Rajasekharan, Bhavisha Rathod, Geoffrey G Hesketh, Kento T Abe, Ji-Young Youn, Payman Samavarchi-Tehrani, Hui Zhang, Lucie Y Zhu, Evelyn Popiel, Jean-Philippe Lambert, Étienne Coyaud, Sally W.T. Cheung, Dushyandi Rajendran, Cassandra J Wong, Hana Antonicka, Laurence Pelletier, Brian Raught, Alexander F Palazzo, Eric A Shoubridge, Anne-Claude Gingras
Compartmentalization is an essential characteristic of eukaryotic cells, ensuring that cellular processes are partitioned to defined subcellular locations. High throughput microscopy and biochemical fractionation coupled with mass spectrometry have helped to define the proteomes of multiple organelles and macromolecular structures. However, many compartments have remained refractory to such methods, partly due to lysis and purification artefacts and poor subcompartment resolution. Recently developed proximity-dependent biotinylation approaches such as BioID and APEX provide an alternative avenue for defining the composition of cellular compartments in living cells. Here we report an extensive BioID-based proximity map of a human cell, comprising 192 markers from 32 different compartments that identifies 35,902 unique high confidence proximity interactions and localizes 4,145 proteins expressed in HEK293 cells. The recall of our localization predictions is on par with or better than previous large-scale mass spectrometry and microscopy approaches, but with higher localization specificity. In addition to assigning compartment and subcompartment localization for many previously unlocalized proteins, our data contain fine-grained localization information that, for example, allowed us to identify proteins with novel roles in mitochondrial dynamics. As a community resource, we have created humancellmap.org, a website that allows exploration of our data in detail, and aids with the analysis of BioID experiments.
6,721 downloads molecular biology
Nucleic acid stains are necessary for Agarose Gel Electrophoresis (AGE). The commonly used but mutagenic Ethidium Bromide is being usurped by a range of safer but more expensive alternatives. These safe stains vary in cost, sensitivity and the impedance of DNA as it migrates through the gel. Modified protocols developed to reduce cost increase this variability. In this study, five Gel stains (GelRed™, GelGreen™, SYBR™ safe, SafeView and EZ-Vision® In-Gel Solution), two premixed loading dyes (SafeWhite, EZ-Vision® One) and four methods (pre-loading at 100x, pre-loading at 10x, precasting and post-staining) are evaluated for sensitivity and effect on DNA migration. GelRed™ was found to be the most sensitive while the EZ-Vision® dyes and SafeWhite had no discernible effect on DNA migration. Homemade loading dyes were as effective as readymade ones at less than 4% of the price. This method used less than 1% of the dye needed for the manufacturer recommended protocols. Thus, with careful consideration of stain and method, Gel stain expenditure can be reduced by over 99%.
6,485 downloads molecular biology
Wanchao Yin, Chunyou Mao, Xiaodong Luan, Dan-Dan Shen, Qingya Shen, Haixia Su, Xiaoxi Wang, Fulai Zhou, Wenfeng Zhao, Minqi Gao, Shenghai Chang, Yuan-Chao Xie, Guanghui Tian, He-Wei Jiang, Sheng-Ce Tao, Jingshan Shen, Yi Jiang, Hualiang Jiang, H. Eric Xu, Shuyang Zhang, Yan Zhang, H. Eric Xu
The pandemic of Corona Virus Disease 2019 (COVID-19) caused by SARS-CoV-2 has become a global crisis. The replication of SARS-CoV-2 requires the viral RNA-dependent RNA polymerase (RdRp), a direct target of the antiviral drug, Remdesivir. Here we report the structure of the SARS-CoV-2 RdRp either in the apo form or in complex with a 50-base template-primer RNA and Remdesivir at a resolution range of 2.5-2.8 Å. The complex structure reveals that the partial double-stranded RNA template is inserted into the central channel of the RdRp where Remdesivir is incorporated into the first replicated base pair and terminates the chain elongation. Our structures provide critical insights into the working mechanism of viral RNA replication and a rational template for drug design to combat the viral infection. Competing Interest Statement: The authors have declared no competing interest.
5,487 downloads molecular biology
The COVID-19 disease has plagued over 110 countries and has resulted in over 4,000 deaths within 10 weeks. We compare the interaction between the human ACE2 receptor and the SARS-CoV-2 spike protein with that of other pathogenic coronaviruses using molecular dynamics simulations. SARS-CoV, SARS-CoV-2, and HCoV-NL63 recognize ACE2 as the natural receptor but present a distinct binding interface to ACE2 and a different network of residue-residue contacts. SARS-CoV and SARS-CoV-2 have comparable binding affinities achieved by balancing energetics and dynamics. The SARS-CoV-2 - ACE2 complex contains a higher number of contacts, a larger interface area, and decreased interface residue fluctuations relative to SARS-CoV. These findings expose an exceptional evolutionary exploration exerted by coronaviruses toward host recognition. We postulate that the versatility of cell receptor binding strategies has immediate implications on therapeutic strategies.
5,439 downloads molecular biology
The outbreak of coronavirus disease (COVID-19) in China, caused by the SARS-CoV-2 virus, continues to lead to human infections and deaths worldwide. There are currently no therapeutics that specifically target viral proteins. The viral nucleocapsid protein is a potential antiviral drug target, serving multiple critical functions during the viral life cycle. However, structural information on the SARS-CoV-2 nucleocapsid protein is still lacking. Herein, we have determined the 2.7 Å crystal structure of the N-terminal RNA binding domain of the SARS-CoV-2 nucleocapsid protein. Although the overall structure is similar to that of other reported coronavirus nucleocapsid protein N-terminal domains, their surface electrostatic potential characteristics are distinct. Further comparison with the equivalent domain of the mild virus type HCoV-OC43 demonstrates a unique potential RNA binding pocket alongside the β-sheet core. Complemented by in vitro binding studies, our data provide several atomic resolution features of the SARS-CoV-2 nucleocapsid protein N-terminal domain, guiding the design of novel antiviral agents specifically targeting SARS-CoV-2.
5,357 downloads molecular biology
Pinar Akcakaya, Maggie L. Bobbin, Jimmy A. Guo, Jose M Lopez, M. Kendell Clement, Sara P. Garcia, Mick D. Fellows, Michelle J. Porritt, Mike A. Firth, Alba Carreras, Tania Baccega, Frank Seeliger, Mikael Bjursell, Shengdar Q. Tsai, Nhu T. Nguyen, Roberto Nitsch, Lorenz M Mayr, Luca Pinello, Mohammad Bohlooly-Y, Martin J Aryee, Marcello Maresca, J. Keith Joung
CRISPR-Cas genome-editing nucleases hold substantial promise for human therapeutics but identifying unwanted off-target mutations remains an important requirement for clinical translation. For ex vivo therapeutic applications, previously published cell-based genome-wide methods provide potentially useful strategies to identify and quantify these off-target mutation sites. However, a well-validated method that can reliably identify off-targets in vivo has not been described to date, leaving the question of whether and how frequently these types of mutations occur. Here we describe Verification of In Vivo Off-targets (VIVO), a highly sensitive, unbiased, and generalizable strategy that we show can robustly identify genome-wide CRISPR-Cas nuclease off-target effects in vivo. To our knowledge, these studies provide the first demonstration that CRISPR-Cas nucleases can induce substantial off-target mutations in vivo, a result we obtained using a deliberately promiscuous guide RNA (gRNA). More importantly, we used VIVO to show that appropriately designed gRNAs can direct efficient in vivo editing without inducing detectable off-target mutations. Our findings provide strong support for and should encourage further development of in vivo genome editing therapeutic strategies.
5,224 downloads molecular biology
Severe Acute Respiratory Syndrome Coronavirus 2 is rapidly spreading around the world. There is no existing vaccine or proven drug to prevent infections and stop virus proliferation. Although this virus is similar to human and animal SARS- and MERS-CoVs, detailed information about the structures and functions of SARS-CoV-2 proteins is urgently needed to rapidly develop effective vaccines, antibodies and antivirals. We applied the high-throughput protein production and structure determination pipeline at the Center for Structural Genomics of Infectious Diseases to produce SARS-CoV-2 proteins and structures. Here we report the high-resolution crystal structure of endoribonuclease Nsp15/NendoU from SARS-CoV-2, the virus causing the current worldwide epidemic. We compare this structure with previously reported models of Nsp15 from SARS and MERS coronaviruses.
5,214 downloads molecular biology
The study of microbial communities has been revolutionised in recent years by the widespread adoption of culture independent analytical techniques such as 16S rRNA gene sequencing and metagenomics. One potential confounder of these sequence-based approaches is the presence of contamination in DNA extraction kits and other laboratory reagents. In this study we demonstrate that contaminating DNA is ubiquitous in commonly used DNA extraction kits, varies greatly in composition between different kits and kit batches, and that this contamination critically impacts results obtained from samples containing a low microbial biomass. Contamination impacts both PCR-based 16S rRNA gene surveys and shotgun metagenomics. These results suggest that caution should be advised when applying sequence-based techniques to the study of microbiota present in low biomass environments. We provide an extensive list of potential contaminating genera, and guidelines on how to mitigate the effects of contamination. Concurrent sequencing of negative control samples is strongly advised.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!