Research Article
CORONAVIRUS

Emergence of SARS-CoV-2 through recombination and strong purifying selection

Science Advances  01 Jul 2020:
Vol. 6, no. 27, eabb9153
DOI: 10.1126/sciadv.abb9153
  • Fig. 1 SARS-CoV-2 recombination with Pan_SL-CoV and Bat_SL-CoV.

    (A) SimPlot genetic similarity plot between SARS-CoV-2 Wuhan-Hu-1 and representative CoV sequences using a 400–base pair (bp) window at a 50-bp step and the Kimura two-parameter model. Phylogenetic trees of regions of disproportional similarities, showing high similarities between SARS-CoV-2 and ZXC21 (B) or GD/P1La (C), high genetic divergences of all Pan_SL-CoV sequences (D), and high similarities between GD/P1La and divergent Bat_SL-CoV sequences (E). All positions are relative to Wuhan-Hu-1. Red arrows indicate the discordant clustering relationship of SARS-CoV-2 or Pan_SL-CoV sequences with other CoV sequences. In (A), we use the ORF1a and ORF1b nomenclature consistent with the original publication of the Wuhan virus (3); however, the National Center for Biotechnology Information (NCBI) betacoronavirus reference sequences (see SARS-CoV-2, NC_045512.2, for an example) designate a single longer stretch called ORF1ab (from 266 to 21,555) that spans both 1a and 1b.

  • Fig. 2 Impact of SARS-CoV-2 recombination on coreceptor binding.

    (A) Amino acid sequences of the receptor binding motif (RBM) in the S gene among Sarbecovirus CoVs compared with Wuhan-Hu-1 (top). Dashes indicate identical amino acids, and dots indicate deletions. ACE2 critical contact sites are highlighted in blue, and two large deletions in green. (B) Phylogenetic tree analysis of amino acid sequences of RBM. Viruses with the ability to bind ACE2 form two distinct clusters (one including SARS_CoVs and the other including SARS_CoV-2s). Bat SL-CoVs with large deletions form another distinct cluster.

  • Fig. 3 Structure analysis of the RBM and ACE2 interface.

    (A) SARS-CoV and SARS-CoV-2 receptor binding domains (RBD). Human ACE2 in green (PDB 6M0J) at the top and the RBD of the S protein at the bottom; SARS-CoV S protein (PDB 2AJF) in red, and SARS-CoV-2 S protein (PDB 6M0J) in magenta with RBM in blue. All structure backbones are shown as ribbons with key residues at the interface shown as stick models, labeled using the same color scheme. (B) Impact of different RBM amino acids between SARS-CoV-2 and RaTG13 on ACE2 binding. (C) Impact of an amino acid at position 498 (Q in SARS-CoV-2, top; H in RaTG13, bottom) on ACE2 binding. Same color coding as in (A) with additional hydrogen bonding as light-blue lines. (D) Impact of two deletions on ACE2 binding interface in some bat SL-CoVs; positions indicated in yellow, and modeled structure with long deletion between residues 473 and 486 in light blue.

  • Fig. 4 Strong purifying selection after furin cleavage in S gene among SARS-CoV-2 and closely related viruses.

    (A) Phylogenetic tree (left) and highlighter plot (right) of sequences around the RBM and furin cleavage site compared with SARS-CoV-2 Wuhan-Hu-1 [nucleic acid (na) positions 22,541 to 24,391]. ACE2 RBM and furin cleavage site highlighted in light-gray boxes. Mutations compared with Wuhan-Hu-1 are light blue for synonymous and red for nonsynonymous. Dominance of synonymous mutations within group A compared with group B highlighted on the right. Position numbers are counted in number of nucleotides (NA) from the beginning of the region. (B) Cumulative plots of each codon average behavior for all pairwise comparisons for indels, synonymous (light blue), and nonsynonymous (red) mutations, by group. The abrupt slope change of the nonsynonymous curve in group A at around codon 368 (na 1104) is indicative of a shift in localized accumulations of nonsynonymous mutations after the furin cleavage site. Group B instead lacks this abrupt change in slope at the same position. Values of ω denote the average ratio of the rate of nonsynonymous substitutions per nonsynonymous site to the rate of synonymous substitutions per synonymous site (dN/dS) for each group and region. Position numbers are counted in number of amino acids (AA) from the beginning of the region. (C) Sequence dS/dN ratios compared with Wuhan-Hu-1 within codons 1 to 368 (na 1 to 1104; green) and codons 369 to 620 (na 1105 to 1893; dark blue) in group A and group B sequences.

  • Fig. 5 Strong purifying selection on complete and partial gene regions among SARS-CoV-2, RaTG13, and Pan_SL-CoV viruses.

    Purifying selection pressure on complete and partial genes among different viruses (red boxes), as evidenced by shorter branches in amino acid sequence trees compared with nucleic acid sequence trees. Distinct purifying selection patterns are observed among different viruses: (A) SARS-CoV-2, RaTG13, all Pan_SL-CoV, and Bat_CoV ZXC21 and ZC45; (B) SARS-CoV-2, RaTG13, and all Pan_SL-CoV sequences; and (C) SARS-CoV-2, RaTG13, and Pan_SL-CoV_GD. Cumulative plots of the average behavior of each codon for all pairwise comparisons for synonymous mutations, nonsynonymous mutations, and indels within each gene region. ω denotes the average ratio of the rate of nonsynonymous substitutions per nonsynonymous site to the rate of synonymous substitutions per synonymous site (dN/dS) for each group.

  • Fig. 6 Multiple recombination of SARS-CoVs with different Bat_SL-CoVs.

    (A) SimPlot genetic similarity plot between SARS-CoV GZ02 and SARS_SL-CoVs, using a 400-bp window at a 50-bp step and the Kimura two-parameter model. Group A CoVs (YN2018B, Rs9401, Rs7327, WIV16, and Rs4231) are shown in blue, group B CoVs (Rf4092, YN2013, Anlong-112, and GX2013) in orange, YNLF-34C in green, and outlier control HKU3-12 in red. Phylogenetic trees for high-similarity regions between GZ02 and YNLF-34C (B), group A (C), and group B (D). All positions are relative to Wuhan-Hu-1. Red arrows indicate the distinct clustering relationship of SARS-CoV sequences with other different CoV sequences.
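The similarity plots in Figs. 1A and 6A are described as sliding-window comparisons (400-bp window, 50-bp step) under the Kimura two-parameter model. The sketch below illustrates that kind of calculation for one pair of pre-aligned sequences. It is not the SimPlot implementation: the function names `k2p_distance` and `sliding_similarity`, the choice to report similarity as 1 minus the distance, the skipping of gapped or ambiguous sites, and the window-midpoint coordinates are all assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (an assumption of this sketch). Returns None when no sites are
    comparable or the distance is undefined (saturation).
    """
    transitions = transversions = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        return None
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return None
    return -0.5 * math.log(term1 * math.sqrt(term2))


def sliding_similarity(query, other, window=400, step=50):
    """Return (window midpoint, similarity) pairs along an alignment.

    Similarity is taken here as 1 - K2P distance (hypothetical choice);
    positions are 0-based alignment columns.
    """
    if len(query) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        d = k2p_distance(query[start:start + window],
                         other[start:start + window])
        points.append((start + window // 2, None if d is None else 1.0 - d))
    return points


if __name__ == "__main__":
    # Toy data only: a repeated motif and a copy with one transition.
    ref = "ACGT" * 150
    alt = "G" + ref[1:]
    for midpoint, similarity in sliding_similarity(ref, alt):
        print(midpoint, similarity)
```

A drop in similarity to one reference sequence that coincides with a rise in similarity to another, as in the regions marked in Figs. 1 and 6, is the kind of pattern such a scan is meant to expose. The candidate breakpoints would then still need to be checked with region-specific phylogenetic trees, as in panels (B) to (E).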

Supplementary Materials

    Emergence of SARS-CoV-2 through recombination and strong purifying selection

    Xiaojun Li, Elena E. Giorgi, Manukumar Honnayakanahalli Marichannegowda, Brian Foley, Chuan Xiao, Xiang-Peng Kong, Yue Chen, S. Gnanakaran, Bette Korber, Feng Gao

    This PDF file includes:

    • Figs. S1 to S10
    • Tables S1 to S3
