1: Uncanny similarity of unique inserts in the 2019-nCoV spike protein to HIV-1 gp120 and Gag
Posted 31 Jan 2020

1,574,130 downloads bioRxiv evolutionary biology

Prashant Pradhan, Ashutosh Kumar Pandey, Akhilesh Mishra, Parul Gupta, Praveen Kumar Tripathi, Manoj Balakrishnan Menon, James Gomes, Perumal Vivekanandan, Bishwajit Kundu

This paper has been withdrawn by its authors. They intend to revise it in response to comments received from the research community on their technical approach and their interpretation of the results. If you have any questions, please contact the corresponding author.

2: Spike mutation pipeline reveals the emergence of a more transmissible form of SARS-CoV-2
Posted 30 Apr 2020

241,629 downloads bioRxiv evolutionary biology

B Korber, WM Fischer, S Gnanakaran, H Yoon, J Theiler, W Abfalterer, B Foley, EE Giorgi, T Bhattacharya, MD Parker, DG Partridge, CM Evans, TM Freeman, Thushan I de Silva, on behalf of the Sheffield COVID-19 Genomics Group, CC LaBranche, DC Montefiori

We have developed an analysis pipeline to facilitate real-time mutation tracking in SARS-CoV-2, focusing initially on the Spike (S) protein because it mediates infection of human cells and is the target of most vaccine strategies and antibody-based therapeutics. To date we have identified fourteen mutations in Spike that are accumulating. Mutations are considered in a broader phylogenetic context, geographically, and over time, to provide an early warning system to reveal mutations that may confer selective advantages in transmission or resistance to interventions. Each one is evaluated for evidence of positive selection, and the implications of the mutation are explored through structural modeling. The mutation Spike D614G is of urgent concern; after beginning to spread in Europe in early February, when introduced to new regions it repeatedly and rapidly becomes the dominant form. Also, we present evidence of recombination between locally circulating strains, indicative of multiple strain infections. These findings have important implications for SARS-CoV-2 transmission, pathogenesis and immune interventions.

Competing Interest Statement: The authors have declared no competing interest.

3: Analysis of the mutation dynamics of SARS-CoV-2 reveals the spread history and emergence of RBD mutant with lower ACE2 binding affinity
Posted 11 Apr 2020

36,076 downloads bioRxiv evolutionary biology

YONG JIA, Gangxu Shen, Stephanie Nguyen, Yujuan Zhang, Keng-Shiang Huang, Hsing-Ying Ho, Wei-Shio Hor, Chih-Hui Yang, John B Bruning, Chengdao Li, Wei-Lung Wang

Monitoring the mutation dynamics of SARS-CoV-2 is critical for the development of effective approaches to contain the pathogen. By analyzing 106 SARS-CoV-2 and 39 SARS genome sequences, we provided direct genetic evidence that SARS-CoV-2 has a much lower mutation rate than SARS. Minimum Evolution phylogeny analysis revealed the putative original status of SARS-CoV-2 and the early-stage spread history. The discrepant phylogenies for the spike protein and its receptor binding domain proved a previously reported structural rearrangement prior to the emergence of SARS-CoV-2. Although we found that the spike glycoprotein of SARS-CoV-2 is particularly conserved, we identified a mutation that leads to weaker receptor binding capability, which concerns a SARS-CoV-2 sample collected on 27th January 2020 from India. This represents the first report of a significant SARS-CoV-2 mutant, and requires attention from researchers working on vaccine development around the world.

4: SARS-CoV-2 is well adapted for humans. What does this mean for re-emergence?
Posted 02 May 2020

26,483 downloads bioRxiv evolutionary biology

Shing Hei Zhan, Benjamin E. Deverman, Yujia Alina Chan

In a side-by-side comparison of evolutionary dynamics between the 2019/2020 SARS-CoV-2 and the 2003 SARS-CoV, we were surprised to find that SARS-CoV-2 resembles SARS-CoV in the late phase of the 2003 epidemic after SARS-CoV had developed several advantageous adaptations for human transmission. Our observations suggest that by the time SARS-CoV-2 was first detected in late 2019, it was already pre-adapted to human transmission to an extent similar to late epidemic SARS-CoV. However, no precursors or parallel branches of evolution stemming from a less human-adapted SARS-CoV-2-like virus have been detected. The sudden appearance of a highly infectious SARS-CoV-2 presents a major cause for concern that should motivate stronger international efforts to identify the source and prevent near future re-emergence. Any existing pools of SARS-CoV-2 progenitors would be particularly dangerous if similarly well adapted for human transmission. To look for clues regarding intermediate hosts, we analyze recent key findings relating to how SARS-CoV-2 could have evolved and adapted for human transmission, and examine the environmental samples from the Wuhan Huanan seafood market. Importantly, the market samples are genetically identical to human SARS-CoV-2 isolates and were therefore most likely from human sources. We conclude by describing and advocating for measured and effective approaches implemented in the 2002-2004 SARS outbreaks to identify lingering population(s) of progenitor virus.

Competing Interest Statement: Shing Hei Zhan is a Co-founder and lead bioinformatics scientist at Fusion Genomics Corporation, which develops molecular diagnostic assays for infectious diseases.

5: Quantitative translation of dog-to-human aging by conserved remodeling of epigenetic networks
Posted 04 Nov 2019

23,797 downloads bioRxiv evolutionary biology

Tina Wang, Jianzhu Ma, Andrew N. Hogan, Samson Fong, Katherine Licon, Brian Y Tsui, Jason F. Kreisberg, Peter D. Adams, Anne-Ruxandra Carvunis, Danika L. Bannasch, Elaine A. Ostrander, Trey Ideker

Mammals progress through similar physiological stages during life, from early development to puberty, aging, and death. Yet, the extent to which this conserved physiology reflects conserved molecular events is unclear. Here, we map common epigenetic changes experienced by mammalian genomes as they age, focusing on evolutionary comparisons of humans to dogs, an emerging model of aging. Using targeted sequencing, we characterize the methylomes of 104 Labrador retrievers spanning a 16 year age range, achieving >150X coverage within mammalian syntenic blocks. Comparison with human methylomes reveals a nonlinear relationship which translates dog to human years, aligns the timing of major physiological milestones between the two species, and extends to mice. Conserved changes center on specific developmental gene networks which are sufficient to capture the effects of anti-aging interventions in multiple mammals. These results establish methylation not only as a diagnostic age readout but as a cross-species translator of physiological aging milestones.

6: Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic
Posted 31 Mar 2020

16,262 downloads bioRxiv evolutionary biology

Maciej F. Boni, Philippe Lemey, Xiaowei Jiang, Tommy Tsan-Yuk Lam, Blair Perry, Todd Castoe, Andrew Rambaut, David L Robertson

There are outstanding evolutionary questions on the recent emergence of coronavirus SARS-CoV-2/hCoV-19 in Hubei province that caused the COVID-19 pandemic, including (1) the relationship of the new virus to the SARS-related coronaviruses, (2) the role of bats as a reservoir species, (3) the potential role of other mammals in the emergence event, and (4) the role of recombination in viral emergence. Here, we address these questions and find that the sarbecoviruses -- the viral subgenus responsible for the emergence of SARS-CoV and SARS-CoV-2 -- exhibit frequent recombination, but the SARS-CoV-2 lineage itself is not a recombinant of any viruses detected to date. In order to employ phylogenetic methods to date the divergence events between SARS-CoV-2 and the bat sarbecovirus reservoir, recombinant regions of a 68-genome sarbecovirus alignment were removed with three independent methods. Bayesian evolutionary rate and divergence date estimates were consistent for all three recombination-free alignments and robust to two different prior specifications based on HCoV-OC43 and MERS-CoV evolutionary rates. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% HPD: 1879-1999), 1969 (95% HPD: 1930-2000), and 1982 (95% HPD: 1948-2009). Despite intensified characterization of sarbecoviruses since SARS, the lineage giving rise to SARS-CoV-2 has been circulating unnoticed for decades in bats and been transmitted to other hosts such as pangolins. The occurrence of a third significant coronavirus emergence in 17 years together with the high prevalence and virus diversity in bats implies that these viruses are likely to cross species boundaries again.
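The divergence dates above are reported with 95% HPD (highest posterior density) intervals. As a rough illustration of what such an interval is, the sketch below finds the narrowest interval containing a given fraction of a set of posterior samples. The function name `hpd_interval` and the toy values are assumptions made for illustration only; they are not the authors' code or data, and Bayesian phylogenetics software may compute the interval differently in detail.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples.

    Hypothetical helper for illustration: sort the samples, slide a
    window that covers ceil(mass * n) of them, and keep the window
    with the smallest width.
    """
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(mass * n)  # number of samples the interval must contain
    if k < 1 or k > n:
        raise ValueError("mass must be in (0, 1] and samples non-empty")
    # Index of the left edge of the narrowest window of k samples.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


# Toy values (made up), not posterior samples from the paper.
toy = [0, 10, 11, 12, 20, 30]
print(hpd_interval(toy, mass=0.5))
```

Unlike an equal-tailed interval, the HPD interval is not forced to cut the same probability from each tail, so for skewed posteriors it can sit asymmetrically around the point estimate.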

7: An ancient viral epidemic involving host coronavirus interacting genes more than 20,000 years ago in East Asia
Posted 16 Nov 2020

14,237 downloads bioRxiv evolutionary biology

Yassine Souilmi, M. Elise Lauterbur, Ray Tobler, Christian D Huber, Angad S. Johar, David Enard

The current SARS-CoV-2 pandemic has emphasized the vulnerability of human populations to novel viral pressures, despite the vast array of epidemiological and biomedical tools now available. Notably, modern human genomes contain evolutionary information tracing back tens of thousands of years, which may help identify the viruses that have impacted our ancestors -- pointing to which viruses have future pandemic potential. Here, we apply evolutionary analyses to human genomic datasets to recover selection events involving tens of human genes that interact with coronaviruses, including SARS-CoV-2, that likely started more than 20,000 years ago. These adaptive events were limited to the population ancestral to East Asian populations. Multiple lines of functional evidence support an ancient viral selective pressure, and East Asia is the geographical origin of several modern coronavirus epidemics. An arms race with an ancient coronavirus, or with a different virus that happened to use similar interactions as coronaviruses with human hosts, may thus have taken place in ancestral East Asian populations. By learning more about our ancient viral foes, our study highlights the promise of evolutionary information to better predict the pandemics of the future. Importantly, adaptation to ancient viral epidemics in specific human populations does not necessarily imply any difference in genetic susceptibility between different human populations, and the current evidence points toward an overwhelming impact of socioeconomic factors in the case of COVID-19.

8: Origin and cross-species transmission of bat coronaviruses in China
Posted 31 May 2020

12,480 downloads bioRxiv evolutionary biology

Alice Latinne, Ben Hu, Kevin J. Olival, Guangjian Zhu, Libiao Zhang, Hongying Li, Aleksei A Chmura, Hume E Field, Carlos Zambrana-Torrelio, Jonathan H Epstein, Bei Li, Wei Zhang, Lin-Fa Wang, Zheng-Li Shi, Peter Daszak

Bats are presumed reservoirs of diverse coronaviruses (CoVs) including progenitors of Severe Acute Respiratory Syndrome (SARS)-CoV and SARS-CoV-2, the causative agent of COVID-19. However, the evolution and diversification of these coronaviruses remain poorly understood. We used a Bayesian statistical framework and sequence data from all known bat-CoVs (including 630 novel CoV sequences) to study their macroevolution, cross-species transmission, and dispersal in China. We find that host-switching was more frequent and across more distantly related host taxa in alpha- than beta-CoVs, and more highly constrained by phylogenetic distance for beta-CoVs. We show that inter-family and -genus switching is most common in Rhinolophidae and the genus Rhinolophus. Our analyses identify the host taxa and geographic regions that define hotspots of CoV evolutionary diversity in China that could help target bat-CoV discovery for proactive zoonotic disease surveillance. Finally, we present a phylogenetic analysis suggesting a likely origin for SARS-CoV-2 in Rhinolophus spp. bats.

Competing Interest Statement: The authors have declared no competing interest.

9: Population Replacement in Early Neolithic Britain
Posted 18 Feb 2018

11,345 downloads bioRxiv evolutionary biology

Selina Brace, Yoan Diekmann, Thomas J. Booth, Zuzana Faltyskova, Nadin Rohland, Swapan Mallick, Matthew Ferry, Megan Michel, Jonas Oppenheimer, Nasreen Broomandkhoshbacht, Kristin Stewardson, Susan Walsh, Manfred Kayser, Rick Schulting, Oliver E Craig, Alison Sheridan, Mike Parker Pearson, Chris Stringer, David Reich, Mark G. Thomas, Ian Barnes

The roles of migration, admixture and acculturation in the European transition to farming have been debated for over 100 years. Genome-wide ancient DNA studies indicate predominantly Anatolian ancestry for continental Neolithic farmers, but also variable admixture with local Mesolithic hunter-gatherers. Neolithic cultures first appear in Britain c. 6000 years ago (kBP), a millennium after they appear in adjacent areas of northwestern continental Europe. However, the pattern and process of the British Neolithic transition remain unclear. We assembled genome-wide data from six Mesolithic and 67 Neolithic individuals found in Britain, dating from 10.5-4.5 kBP, a dataset that includes 22 newly reported individuals and the first genomic data from British Mesolithic hunter-gatherers. Our analyses reveal persistent genetic affinities between Mesolithic British and Western European hunter-gatherers over a period spanning Britain's separation from continental Europe. We find overwhelming support for agriculture being introduced by incoming continental farmers, with small and geographically structured levels of additional hunter-gatherer introgression. We find genetic affinity between British and Iberian Neolithic populations indicating that British Neolithic people derived much of their ancestry from Anatolian farmers who originally followed the Mediterranean route of dispersal and likely entered Britain from northwestern mainland Europe.

10: Modern human origins: multiregional evolution of autosomes and East Asia origin of Y and mtDNA
Posted 18 Jan 2017

9,877 downloads bioRxiv evolutionary biology

Dejian Yuan, Xiaoyun Lei, Yuanyuan Gui, Mingrui Wang, Ye Zhang, Zuobin Zhu, Dapeng Wang, Jun Yu, Shi Huang

The neutral theory has been used as a null model for interpreting nature and produced the Recent Out of Africa model of anatomically modern humans. Recent studies, however, have established that genetic diversities are mostly at maximum saturation levels maintained by selection, therefore challenging the explanatory power of the neutral theory and rendering the present molecular model of human origins untenable. Using improved methods and public data, we have revisited human evolution and found sharing of genetic variations among racial groups to be largely a result of parallel mutations rather than recent common ancestry and admixture as commonly assumed. We derived an age of 1.86-1.92 million years for the first split in modern human populations based on autosomal diversity data. We found evidence of modern Y and mtDNA originating in East Asia and dispersing via hybridization with archaic humans. Analyses of autosomes, Y and mtDNA all suggest that Denisovan and Neanderthal were archaic Africans with Eurasian admixtures and ancestors of South Asia Negritos and Aboriginal Australians. Verifying our model, we found more ancestry of Southern Chinese from Hunan in Africans relative to other East Asian groups examined. These results suggest multiregional evolution of autosomes and replacements of archaic Y and mtDNA by modern ones originating in East Asia, thereby leading to a coherent account of modern human origins. 
Abbreviations:

* AMH: anatomically modern humans
* MGD: maximum genetic diversity
* SNP: single nucleotide polymorphisms
* AUA: Aboriginal Australian
* PGD: pairwise genetic distance
* PCA: principal component analysis
* Myr: million years
* AFR: African
* ASN: East Asian
* EUR: European
* SAS: South Asian
* ESN: Esen in Nigeria
* GBR: British in England and Scotland
* CHS: Southern Han Chinese
* CHB: Han Chinese in Beijing
* JPT: Japanese in Tokyo
* BEB: Bengali from Bangladesh
* YRI: Yoruba in Ibadan, Nigeria
* CEU: Utah Residents with Northern and Western European Ancestry
* LWK: Luhya in Webuye, Kenya

11: rehh 2.0: a reimplementation of the R package rehh to detect positive selection from haplotype structure.
Posted 03 Aug 2016

9,613 downloads bioRxiv evolutionary biology

Mathieu Gautier, Alexander Klassmann, Renaud Vitalis

Identifying genomic regions with unusually high local haplotype homozygosity represents a powerful strategy to characterize candidate genes responding to natural or artificial positive selection. To that end, statistics measuring the extent of haplotype homozygosity within (e.g., EHH, iHS) and between (Rsb or XP-EHH) populations have been proposed in the literature. The rehh package for R was previously developed to facilitate genome-wide scans of selection, based on the analysis of long-range haplotypes. However, its performance was not sufficient to cope with the growing size of available data sets. Here we propose a major upgrade of the rehh package, which includes an improved processing of the input files, a faster algorithm to enumerate haplotypes, as well as multi-threading. As illustrated with the analysis of large human haplotype data sets, these improvements decrease the computation time by more than an order of magnitude. This new version of rehh will thus allow performing iHS-, Rsb- or XP-EHH-based scans on large data sets. The package rehh 2.0 is available from the CRAN repository (http://cran.r-project.org/web/packages/rehh/index.html) together with help files and a detailed manual.
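To make the EHH statistic mentioned above more concrete, here is a minimal sketch of its usual textbook definition: among haplotypes carrying a chosen allele at a core marker, EHH is the probability that two randomly drawn haplotypes are identical across every marker from the core out to a target marker. This is a plain Python illustration, not the rehh implementation or its API; the function name `ehh`, the string encoding of haplotypes, and the toy data are all assumptions.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, target, allele="1"):
    """Extended haplotype homozygosity between `core` and `target`.

    haplotypes: equal-length strings of '0'/'1' (hypothetical encoding),
                one string per phased chromosome.
    core:       index of the core marker.
    target:     index of the marker to extend to (either side of core).
    allele:     core allele whose carriers are analysed.
    """
    lo, hi = sorted((core, target))
    # Keep only carriers of the core allele, truncated to the interval.
    carriers = [h[lo:hi + 1] for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return 0.0  # homozygosity is undefined with fewer than two carriers
    counts = Counter(carriers)
    # Fraction of carrier pairs that share the same extended haplotype.
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)


# Toy data (made up): four carriers of allele '1' at marker 0.
haps = ["1100", "1100", "1101", "1010", "0110"]
for t in range(4):
    print(t, ehh(haps, core=0, target=t))
```

EHH starts at 1 at the core marker and decays with distance as recombination and mutation break up the shared haplotype. Statistics such as iHS integrate this decay curve and compare it between the two core alleles, which is why fast haplotype enumeration matters for genome-wide scans.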

12: Ancient Genomics Reveals Four Prehistoric Migration Waves into Southeast Asia
Posted 08 Mar 2018

8,801 downloads bioRxiv evolutionary biology

Hugh McColl, Fernando Racimo, Lasse Vinner, Fabrice Demeter, J. Víctor Moreno Mayar, Uffe Gram Wilken, Andaine Seguin-Orlando, Constanza de la Fuente Castro, Sally Wasef, Ana Prohaska, Ashot Margarayan, Peter de Barros Damgaard, Rasmi Shoocongdej, Viengkeo Souksavatdy, Thongsa Sayavongkhamdy, Mohd Mokhtar Saidin, Supannee Kaewsutthi, Patcharee Lertrit, Huong Mai Nguyen, Hsiao-chun Hung, Thi Minh Tran, Huu Nghia Truong, Shaiful Shahidan, Ketut Wiradnyana, Anne-Marie Bacon, Philippe Duringer, Jean-Luc Ponche, Laura Shackelford, Elise Patole-Edoumba, Anh Tuan Nguyen, Bérénice Bellina-Pryce, Jean-Christophe Galipaud, Rebecca Kinaston, Hallie Buckley, Christophe Pottier, Simon Rasmussen, Tom Higham, Robert A. Foley, Marta Mirazón Lahr, Ludovic Orlando, Martin Sikora, Charles Higham, David M. Lambert, Eske Willerslev

Two distinct population models have been put forward to explain present-day human diversity in Southeast Asia. The first model proposes long-term continuity (Regional Continuity model) while the other suggests two waves of dispersal (Two Layer model). Here, we use whole-genome capture in combination with shotgun sequencing to generate 25 ancient human genome sequences from mainland and island Southeast Asia, and directly test the two competing hypotheses. We find that early genomes from Hoabinhian hunter-gatherer contexts in Laos and Malaysia have genetic affinities with the Onge hunter-gatherers from the Andaman Islands, while Southeast Asian Neolithic farmers have a distinct East Asian genomic ancestry related to present-day Austroasiatic-speaking populations. We also identify two further migratory events, consistent with the expansion of speakers of Austronesian languages into Island Southeast Asia ca. 4 kya, and the expansion by East Asians into northern Vietnam ca. 2 kya. These findings support the Two Layer model for the early peopling of Southeast Asia and highlight the complexities of dispersal patterns from East Asia.

13: Towards a new history and geography of human genes informed by ancient DNA
Posted 21 Mar 2014

7,576 downloads bioRxiv evolutionary biology

Joseph K. Pickrell, David Reich

Genetic information contains a record of the history of our species, and technological advances have transformed our ability to access this record. Many studies have used genome-wide data from populations today to learn about the peopling of the globe and subsequent adaptation to local conditions. Implicit in this research is the assumption that the geographic locations of people today are informative about the geographic locations of their ancestors in the distant past. However, it is now clear that long-range migration, admixture and population replacement have been the rule rather than the exception in human history. In light of this, we argue that it is time to critically re-evaluate current views of the peopling of the globe and the importance of natural selection in determining the geographic distribution of phenotypes. We specifically highlight the transformative potential of ancient DNA. By accessing the genetic make-up of populations living at archaeologically-known times and places, ancient DNA makes it possible to directly track migrations and responses to natural selection.

14: Going down the rabbit hole: a review on methods characterizing selection and demography in natural populations
Posted 12 May 2016

7,359 downloads bioRxiv evolutionary biology

Yann X.C. Bourgeois, Khaled M Hazzouri, Ben H. Warren

1. Characterizing species history and identifying loci underlying local adaptation is crucial in functional ecology, evolutionary biology, conservation and agronomy. The ongoing and constant improvement of next-generation sequencing (NGS) techniques has facilitated the production of an ever-increasing number of genetic markers across genomes of non-model species.
2. The study of variation in these markers across natural populations has deepened the understanding of how population history and selection act on genomes. Population genomics now provides tools to better integrate selection into a historical framework, and take into account selection when reconstructing demographic history. However, this improvement has come with a burst of analytical tools that can confuse users.
3. Such confusion can limit the amount of information effectively retrieved from complex genomic datasets. In addition, the lack of a unified analytical pipeline impairs the diffusion of the most recent analytical tools into fields like conservation biology.
4. To address this need, we describe possible analytical protocols and link these with more than 70 methods dealing with genome-scale datasets. We summarise the strategies they use to infer demographic history and selection, and discuss some of their limitations. A website listing these methods is available at www.methodspopgen.com.

15: Ancient genomes from southern Africa pushes modern human divergence beyond 260,000 years ago
Posted 05 Jun 2017

7,345 downloads bioRxiv evolutionary biology

Carina M Schlebusch, Helena Malmström, Torsten Gunther, Per Sjödin, Alexandra Coutinho, Hanna Edlund, Arielle R Munters, Maryna Steyn, Himla Soodyall, Marlize Lombard, Mattias Jakobsson

Southern Africa is consistently placed as one of the potential regions for the evolution of Homo sapiens. To examine the region's human prehistory prior to the arrival of migrants from East and West Africa or Eurasia in the last 1,700 years, we generated and analyzed genome sequence data from seven ancient individuals from KwaZulu-Natal, South Africa. Three Stone Age hunter-gatherers date to ~2,000 years ago, and we show that they were related to current-day southern San groups such as the Karretjie People. Four Iron Age farmers (300-500 years old) have genetic signatures similar to present day Bantu-speakers. The genome sequence (13x coverage) of a juvenile boy from Ballito Bay, who lived ~2,000 years ago, demonstrates that southern African Stone Age hunter-gatherers were not impacted by recent admixture; however, we estimate that all modern-day Khoekhoe and San groups have been influenced by 9-22% genetic admixture from East African/Eurasian pastoralist groups arriving >1,000 years ago, including the Ju|'hoansi San, previously thought to have very low levels of admixture. Using traditional and new approaches, we estimate the population divergence time between the Ballito Bay boy and other groups to beyond 260,000 years ago. These estimates dramatically increase the deepest divergence amongst modern humans, coincide with the onset of the Middle Stone Age in sub-Saharan Africa, and coincide with anatomical developments of archaic humans into modern humans as represented in the local fossil record. Cumulatively, cross-disciplinary records increasingly point to southern Africa as a potential (not necessarily exclusive) 'hot spot' for the evolution of our species.

16: A Chronological Atlas of Natural Selection in the Human Genome during the Past Half-million Years
Posted 05 May 2015

7,187 downloads bioRxiv evolutionary biology

Hang Zhou, Sile Hu, Rostislav Matveev, Qianhui Yu, Jing Li, Philipp Khaitovich, Li Jin, Michael Lachmann, Mark Stoneking, Qiaomei Fu, Kun Tang

The spatiotemporal distribution of recent human adaptation is a long-standing question. We developed a new coalescent-based method that collectively assigned human genome regions to modes of neutrality or to positive, negative, or balancing selection. Most importantly, the selection times were estimated for all positive selection signals, which ranged over the last half million years, penetrating the emergence of anatomically modern humans (AMH). These selection time estimates were further supported by analyses of the genome sequences from three ancient AMHs and the Neanderthals. A series of brain function-related genes were found to carry signals of ancient selective sweeps, which may have defined the evolution of cognitive abilities either before Neanderthal divergence or during the emergence of AMH. Particularly, signals of brain evolution in AMH are strongly related to Alzheimer's disease pathways. In conclusion, this study reports a chronological atlas of natural selection in humans.

17: Recombination and lineage-specific mutations led to the emergence of SARS-CoV-2
Posted 18 Feb 2020

6,847 downloads bioRxiv evolutionary biology

Juan Ángel Patiño-Galindo, Ioan Filip, Mohammed AlQuraishi, Raul Rabadan

The recent outbreak of a new coronavirus (SARS-CoV-2) in Wuhan, China, underscores the need for understanding the evolutionary processes that drive the emergence and adaptation of zoonotic viruses in humans. Here, we show that recombination in betacoronaviruses, including human-infecting viruses like SARS-CoV and MERS-CoV, frequently encompasses the Receptor Binding Domain (RBD) in the Spike gene. We find that this common process likely led to a recombination event at least 11 years ago in an ancestor of the SARS-CoV-2 involving the RBD. As a result of this recombination event, SARS-CoV and SARS-CoV-2 share a similar genotype in RBD, including two insertions (positions 432-436 and 460-472), and alleles 427N and 436Y. Both 427N and 436Y belong to a helix that interacts with the human ACE2 receptor. Ancestral state analyses revealed that SARS-CoV-2 differentiated from its most recent common ancestor with RaTG13 by accumulating a significant number of amino acid changes in the RBD. In sum, we propose a two-hit scenario in the emergence of the SARS-CoV-2 virus whereby the SARS-CoV-2 ancestors in bats first acquired genetic characteristics of SARS-CoV by incorporation of a SARS-like RBD through recombination before 2009, and subsequently, the lineage that led to SARS-CoV-2 accumulated further, unique changes specifically in the RBD.

18: The hidden elasticity of avian and mammalian genomes
Posted 16 Oct 2016

6,335 downloads bioRxiv evolutionary biology

Aurélie Kapusta, Alexander Suh, Yang Hu

Genome size in mammals and birds shows remarkably little interspecific variation compared to other taxa. Yet, genome sequencing has revealed that many mammal and bird lineages have experienced differential rates of transposable element (TE) accumulation, which would be predicted to cause substantial variation in genome size between species. Thus, we hypothesize that there has been co-variation between the amount of DNA gained by transposition and lost by deletion during mammal and avian evolution, resulting in genome size homeostasis. To test this model, we develop a computational pipeline to quantify the amount of DNA gained by TE expansion and lost by deletion over the last 100 million years (My) in the lineages of 10 species of eutherian mammals and 24 species of birds. The results reveal extensive variation in the amount of DNA gained via lineage-specific transposition, but that DNA loss counteracted this expansion to various extents across lineages. Our analysis of the rate and size spectrum of deletion events implies that DNA removal in both mammals and birds has proceeded mostly through large segmental deletions (>10 kb). These findings support a unified 'accordion' model of genome size evolution in eukaryotes whereby DNA loss counteracting TE expansion is a major determinant of genome size. Furthermore, we propose that extensive DNA loss, and not necessarily a dearth of TE activity, has been the primary force maintaining the greater genomic compaction of flying birds and bats relative to their flightless relatives.

19: Genetic landscapes reveal how human genetic diversity aligns with geography
Posted 13 Dec 2017

5,783 downloads bioRxiv evolutionary biology

Benjamin Peter, Desislava Petkova, John Novembre

Summarizing spatial patterns in human genetic diversity to understand population history has been a persistent goal for human geneticists. Here, we use a recently developed spatially explicit method to estimate "effective migration" surfaces to visualize how human genetic diversity is geographically structured (the EEMS method). The resulting surfaces are "rugged", which indicates the relationship between genetic and geographic distance is heterogeneous and distorted as a rule. Most prominently, topographic and marine features regularly align with increased genetic differentiation (e.g. the Sahara desert, Mediterranean Sea or Himalaya at large scales; the Adriatic, inter-island straits in near Oceania at smaller scales). We also see traces of historical migrations and boundaries of language families. These results provide visualizations of human genetic diversity that reveal local patterns of differentiation in detail and emphasize that while genetic similarity generally decays with geographic distance, there have regularly been factors that subtly distort the underlying relationship across space observed today. The fine-scale population structure depicted here is relevant to understanding complex processes of human population history and may provide insights for geographic patterning in rare variants and heritable disease risk.

20: A SARS-CoV-2 vaccine candidate would likely match all currently circulating strains
Posted 27 Apr 2020

5,755 downloads bioRxiv evolutionary biology

Bethany Dearlove, Eric Lewitus, Hongjun Bai, Yifan Li, Daniel B Reeves, M. Gordon Joyce, Paul T. Scott, Mihret F. Amare, Sandhya Vasan, Nelson L. Michael, Kayvon Modjarrad, Morgane Rolland

The magnitude of the COVID-19 pandemic underscores the urgency for a safe and effective vaccine. Here we analyzed SARS-CoV-2 sequence diversity across 5,700 sequences sampled since December 2019. The Spike protein, which is the target immunogen of most vaccine candidates, showed 93 sites with shared polymorphisms; only one of these mutations was found in more than 1% of currently circulating sequences. The minimal diversity found among SARS-CoV-2 sequences can be explained by drift and bottleneck events as the virus spread away from its original epicenter in Wuhan, China. Importantly, there is little evidence that the virus has adapted to its human host since December 2019. Our findings suggest that a single vaccine should be efficacious against current global strains.

Competing Interest Statement: The authors have declared no competing interest.

