Fig. 1 FSF sharing patterns and makeup of cellular and viral proteomes. (A) Numbers in parentheses indicate the total number of proteomes that were sampled from Archaea, Bacteria, Eukarya, and viruses. (B) Barplots comparing the proteomic composition of viruses infecting the three superkingdoms. Numbers in parentheses indicate the total number of viral proteomes in each group. Numbers above bars indicate the total number of proteins in each of the three classes of proteins. VSFs are listed in Table 1. (C and D) FSF use and reuse for proteomes in each viral subgroup and in the three superkingdoms. Values given in logarithmic scale. Important outliers are labeled. Shaded regions highlight the overlap between parasitic cells and giant viruses.
Fig. 2 Spread of viral FSFs in cellular proteomes. (A) Violin plots comparing the spread (f value) of FSFs shared and not shared with viruses in archaeal, bacterial, and eukaryal proteomes. (B) Violin plots comparing the spread (f value) of FSFs shared with each viral subgroup in archaeal, bacterial, and eukaryal proteomes. Numbers on top indicate the total number of FSFs involved in each comparison. White circles in each boxplot represent group medians. Density trace is plotted symmetrically around the boxplots.
Fig. 3 Virus-host preferences and FSF distribution in viruses infecting different hosts. (A) The abundance of each viral replicon type that is capable of infecting Archaea, Bacteria, and Eukarya and major divisions in Eukarya. Virus-host information was retrieved from the National Center for Biotechnology Information Viral Genomes Project (119). Hosts were classified into Archaea, Bacteria, Protista (animal-like protists), Fungi, Plants (all plants, blue-green algae, and diatoms), Invertebrates and Plants (IP), and Metazoa (vertebrates, invertebrates, and humans). Host information was available for 3440 of the 3660 viruses that were sampled in this study. Two additional ssDNA archaeoviruses were added from the literature (129, 130). Numbers on bars indicate the total virus count in each host group. (B) Venn diagram shows the distribution of 715 (of 716) FSFs that were detected in archaeoviruses, bacterioviruses, and eukaryoviruses. Host information on the Circovirus-like genome RW_B virus encoding the “Satellite viruses” FSF (b.121.7) was not available. (C) Mean f values for FSFs corresponding to each of the seven Venn groups defined in (B) in archaeal, bacterial, and eukaryal proteomes. Values were averaged for all FSFs in each of the seven Venn groups. Text above bars indicates how many different viral subgroups encoded those FSFs.
Fig. 4 FSF distribution in the viral supergroup. (A) Total number of FSFs that were either shared or uniquely present in each viral subgroup. A seven-set Venn diagram makes explicit the 127 (2⁷ − 1) combinations that are possible with seven groups. (B) Ariadne’s threads give the most parsimonious solution to encase all highly shared FSFs between different viral subgroups. Threads were inferred directly from the seven-set Venn diagram. FSFs identified by SCOP css. (C) Number of FSFs shared in each viral subgroup with every other subgroup. Pie charts are proportional to the size of the FSF repertoire in each viral subgroup.
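The count of 127 regions in a seven-set Venn diagram can be checked by enumerating every non-empty combination of seven groups. The sketch below is illustrative only: it assumes the seven groups are the replicon types that appear in the Distribution columns of Tables 1 and 4, and the function name `nonempty_combinations` is hypothetical rather than part of the study's pipeline.

```python
from itertools import combinations

# Assumption: the seven viral subgroups are the replicon types
# listed in the Distribution columns of Tables 1 and 4.
SUBGROUPS = [
    "dsDNA", "ssDNA", "dsRNA", "plus-ssRNA",
    "minus-ssRNA", "dsDNA-RT", "ssRNA-RT",
]

def nonempty_combinations(groups):
    """Return every non-empty subset of `groups` as a tuple."""
    return [
        combo
        for size in range(1, len(groups) + 1)
        for combo in combinations(groups, size)
    ]

regions = nonempty_combinations(SUBGROUPS)
# Each region is one possible sharing pattern, e.g. ("dsDNA", "ssDNA").
print(len(regions), 2 ** len(SUBGROUPS) - 1)
```

Each tuple corresponds to one region of the diagram, that is, one pattern of FSF sharing among subgroups; the empty set is excluded because an FSF absent from all seven subgroups is not a viral FSF.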
Fig. 5 Phylogenomic analysis of FSF domains. (A) ToD describe the evolution of 1995 FSF domains (taxa) in 5080 proteomes (characters) (tree length = 1,882,554; retention index = 0.74; g1 = −0.18). The bar on top of ToD is a simple representation of how FSFs appeared in its branches, which correlates with their age (nd). FSFs were labeled blue for cell-only and red for those either shared with or unique to viruses. The boxplots identify the most ancient and derived Venn groups. Two major phases in the evolution of viruses are indicated in different background colors. Patterned area highlights the appearances of AV, BV, and EV soon after A, B, and E, respectively. FSFs are identified by SCOP css. (B) Viral FSFs plotted against their spread in viral proteomes (f value) and evolutionary time (nd). FSFs identified by SCOP css. (C) Distribution of ABEV FSFs in each viral subgroup along evolutionary time (nd). Numbers in parentheses indicate the total number of ABEV FSFs in each viral subgroup. White circles indicate group medians. Density trace is plotted symmetrically around the boxplots.
Fig. 6 Ancient history of RNA viral proteomes. (A) The length of Ariadne’s threads (colored lines) identifies FSFs that were shared by more than three viral subgroups. Filled circles indicate FSFs shared between two or three viral subgroups. Numbers next to each circle give the mean nd of FSFs shared by each combination. Numbers in parentheses give the range between the most ancient and the most recent FSFs that were shared by each combination. (B) Distribution of the most ancient (nd < 0.3) ABEV FSFs in evolutionary timeline (nd) for each viral subgroup. Numbers in parentheses indicate the total FSFs in each viral subgroup. White circles indicate group medians. A density trace is plotted symmetrically around the boxplots.
Fig. 7 Evolutionary relationships between cells and viruses. (A) ToP describing the evolution of 368 proteomes (taxa) that were randomly sampled from cells and viruses and were distinguished by the abundance of 442 ABEV FSFs (characters) (tree length = 45,935; retention index = 0.83; g1 = −0.31). All characters were parsimony informative. Differently colored branches represent BS support values. Major groups are identified. Viral genera names are given inside parentheses. The viral order “Megavirales” is awaiting approval by the ICTV and hence written inside quotes. Viral families that form largely unified or monophyletic groups are labeled with an asterisk. Virion morphotypes were mapped to ToP and illustrated with images from the ViralZone Web resource (131). No picture was available for Turriviridae. ᵃActinobacteria, Bacteroidetes/Chlorobi, Chloroflexi, Cyanobacteria, Fibrobacter, Firmicutes, Planctomycetes, and Thermotogae. (B) A distance-based phylogenomic network reconstructed from the occurrence of 442 ABEV FSFs in 368 randomly sampled proteomes (uncorrected P distance; equal angle; least-squares fit = 99.46). Numbers on branches indicate BS support values. Taxa were colored for easy visualization. Important groups are labeled. ᵇActinobacteria, Bacteroidetes/Chlorobi, Chloroflexi, Cyanobacteria, Deinococcus-Thermus, Fibrobacter, Firmicutes, and Planctomycetes. ᶜAmoebozoa and Chromalveolata.
Fig. 8 Evolutionary history of proteomes inferred from numerical analysis. (A) Plot of the first three axes of evoPCO portrays evolutionary distances between cellular and viral proteomes. The percentage of variability explained by each coordinate is given in parentheses on each axis. The proteome of the last common ancestor of modern cells (57) was added as an additional sample to infer the direction of evolutionary splits. ᵃIgnicoccus hospitalis, ᵇLactobacillus delbrueckii, ᶜCaenorhabditis elegans. (B) A distance-based NJ tree reconstructed from the occurrence of 442 ABEV FSFs in 368 randomly sampled proteomes. Each taxon was given a unique tree ID (tables S1 and S2). Taxa were colored for quick visualization.
- Table 1 VSFs and their distribution in the viral supergroup.
FSFs in boldface could be potential VSFs based on the criterion described in the text. FSFs were referenced by either SCOP ID or css. For example, the P-loop containing NTP hydrolase FSF is c.37.1, where “c” is the α/β class of secondary structure present in the protein domain, “37” is the fold, and “1” is the FSF.
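The css notation described in this legend can be split mechanically into its three levels. The following is a minimal sketch, not a tool used in the study: the `SCOPcss` type and `parse_css` function are hypothetical names, and only the class letter spelled out in the legend ("c" for α/β) is mapped to a description.

```python
from typing import NamedTuple

# Only the class letter explained in the legend is mapped here;
# other letters are returned without a description.
CLASS_NAMES = {"c": "alpha/beta"}

class SCOPcss(NamedTuple):
    scop_class: str   # secondary-structure class letter, e.g. "c"
    fold: int         # fold number within the class, e.g. 37
    superfamily: int  # fold superfamily (FSF) number within the fold, e.g. 1

def parse_css(css: str) -> SCOPcss:
    """Split a css string such as 'c.37.1' into class, fold, and superfamily."""
    scop_class, fold, superfamily = css.strip().split(".")
    return SCOPcss(scop_class, int(fold), int(superfamily))

p_loop = parse_css("c.37.1")
print(p_loop, CLASS_NAMES.get(p_loop.scop_class, "unmapped"))
```

Keeping fold and superfamily as integers lets entries such as c.23.10 sort after c.23.9, which plain string comparison would not do.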
| SCOP ID | SCOP css | Venn group | FSF description | Distribution |
| --- | --- | --- | --- | --- |
| 69070 | a.150.1 | V | Anti-sigma factor AsiA | dsDNA |
| 55064 | d.58.27 | V | Translational regulator protein regA | dsDNA |
| 48493 | a.120.1 | V | Gene 59 helicase assembly protein | dsDNA |
| 89433 | b.127.1 | V | Baseplate structural protein gp8 | dsDNA |
| 69652 | d.199.1 | V | DNA binding C-terminal domain of the transcription factor MotA | dsDNA |
| 56558 | d.182.1 | V | Baseplate structural protein gp11 | dsDNA |
| 49894 | b.28.1 | V | Baculovirus p35 protein | dsDNA |
| 160957 | e.69.1 | V | Poly(A) polymerase catalytic subunit–like | dsDNA |
| 51289 | b.85.5 | V | Tlp20, baculovirus telokin-like protein | dsDNA |
| 88648 | b.121.6 | V | Group I dsDNA viruses | dsDNA |
| 161240 | g.92.1 | V | T-antigen–specific domain–like | dsDNA |
| 118208 | e.58.1 | V | Viral ssDNA binding protein | dsDNA |
| 54957 | d.58.8 | V | Viral DNA binding domain | dsDNA |
| 51332 | b.91.1 | V | E2 regulatory, transactivation domain | dsDNA |
| 56548 | d.180.1 | V | Conserved core of transcriptional regulatory protein vp16 | dsDNA |
| 90246 | h.1.24 | V | Head morphogenesis protein gp7 | dsDNA |
| 47724 | a.54.1 | V | Domain of early E2A DNA binding protein, ADDBP | dsDNA |
| 57917 | g.51.1 | V | Zn binding domains of ADDBP | dsDNA |
| 49889 | b.27.1 | V | Soluble secreted chemokine inhibitor, VCCI | dsDNA |
| 89428 | b.126.1 | V | Adsorption protein p2 | dsDNA |
| 82046 | b.116.1 | V | Viral chemokine binding protein m3 | dsDNA |
| 158974 | b.170.1 | V | WSSV envelope protein-like | dsDNA |
| 47852 | a.62.1 | V | Hepatitis B viral capsid (hbcag) | dsDNA-RT |
| 111379 | f.47.1 | V | VP4 membrane interaction domain | dsRNA |
| 48345 | a.115.1 | V | A virus capsid protein alpha-helical domain | dsRNA |
| 69908 | e.35.1 | V | Membrane penetration protein mu1 | dsRNA |
| 75347 | d.13.2 | V | Rotavirus NSP2 fragment, C-terminal domain | dsRNA |
| 69903 | e.34.1 | V | NSP3 homodimer | dsRNA |
| 75574 | d.216.1 | V | Rotavirus NSP2 fragment, N-terminal domain | dsRNA |
| 58030 | h.1.13 | V | Rotavirus nonstructural proteins | dsRNA |
| 49818 | b.19.1 | V | Viral protein domain | dsRNA, minus-ssRNA, plus-ssRNA |
| 88650 | b.121.7 | V | Satellite viruses | ssDNA |
| 48045 | a.84.1 | V | Scaffolding protein gpD of bacteriophage procapsid | ssDNA |
| 50176 | b.37.1 | V | N-terminal domains of the minor coat protein g3p | ssDNA |
| 75404 | d.213.1 | V | VSV matrix protein | Minus-ssRNA |
| 118173 | d.293.1 | V | Phosphoprotein M1, C-terminal domain | Minus-ssRNA |
| 69922 | f.12.1 | V | Head and neck region of the ectodomain of NDV fusion glycoprotein | Minus-ssRNA |
| 101089 | a.8.5 | V | Phosphoprotein XD domain | Minus-ssRNA |
| 58034 | h.1.14 | V | Multimerization domain of the phosphoprotein from Sendai virus | Minus-ssRNA |
| 50012 | b.31.1 | V | EV matrix protein | Minus-ssRNA |
| 48145 | a.95.1 | V | Influenza virus matrix protein M1 | Minus-ssRNA |
| 143021 | d.299.1 | V | Ns1 effector domain–like | Minus-ssRNA |
| 161003 | e.75.1 | V | Flu NP-like | Minus-ssRNA |
| 160453 | d.361.1 | V | PB2 C-terminal domain–like | Minus-ssRNA |
| 101156 | a.30.3 | V | Nonstructural protein ns2, Nep, M1 binding domain | Minus-ssRNA |
| 160892 | d.378.1 | V | Phosphoprotein oligomerization domain–like | Minus-ssRNA |
| 56983 | f.10.1 | V | Viral glycoprotein, central and dimerization domains | Plus-ssRNA |
| 101257 | a.190.1 | V | Flavivirus capsid protein C | Plus-ssRNA |
| 103145 | d.255.1 | V | Tombusvirus P19 core protein, VP19 | Plus-ssRNA |
| 89043 | a.178.1 | V | Soluble domain of poliovirus core protein 3a | Plus-ssRNA |
| 110304 | b.148.1 | V | Coronavirus RNA binding domain | Plus-ssRNA |
| 101816 | b.140.1 | V | Replicase NSP9 | Plus-ssRNA |
| 140367 | a.8.9 | V | Coronavirus NSP7–like | Plus-ssRNA |
| 143076 | d.302.1 | V | Coronavirus NSP8–like | Plus-ssRNA |
| 144246 | g.86.1 | V | Coronavirus NSP10–like | Plus-ssRNA |
| 103068 | d.254.1 | V | Nucleocapsid protein dimerization domain | Plus-ssRNA |
| 117066 | b.1.24 | V | Accessory protein X4 (ORF8, ORF7a) | Plus-ssRNA |
| 143587 | d.318.1 | V | SARS receptor binding domain–like | Plus-ssRNA |
| 159936 | d.15.14 | V | NSP3A-like | Plus-ssRNA |
| 160099 | d.346.1 | V | SARS Nsp1–like | Plus-ssRNA |
| 140506 | a.30.8 | V | FHV B2 protein–like | Plus-ssRNA |
| 144251 | g.87.1 | V | Viral leader polypeptide zinc finger | Plus-ssRNA |
| 141666 | b.164.1 | V | SARS ORF9b–like | Plus-ssRNA |
| 55671 | d.102.1 | V | Regulatory factor Nef | ssRNA-RT |
| 56502 | d.172.1 | V | gp120 core | ssRNA-RT |
| 57647 | g.34.1 | V | HIV-1 VPU cytoplasmic domain | ssRNA-RT |
| 49749 | b.121.2 | EV | Group II dsDNA viruses VP | dsDNA |
| 103417 | e.48.1 | EV | Major capsid protein VP5 | dsDNA |
| 140713 | a.251.1 | EV | Phage replication organizer domain | dsDNA |
| 161008 | e.76.1 | EV | Viral glycoprotein ectodomain–like | dsDNA, minus-ssRNA |
| 110132 | b.147.1 | EV | BTV NS2-like ssRNA binding domain | dsRNA |
| 82856 | e.42.1 | EV | L-A virus major coat protein | dsRNA |
| 140809 | a.260.1 | EV | Rhabdovirus nucleoprotein–like | Minus-ssRNA |
| 101399 | a.206.1 | EV | P40 nucleoprotein | Minus-ssRNA |
| 55405 | d.85.1 | EV | RNA bacteriophage capsid protein | Minus-ssRNA |
| 68918 | a.140.4 | BV | Recombination endonuclease VII, C-terminal and dimerization domains | dsDNA |
| 50017 | b.32.1 | BV | gp9 | dsDNA |
| 58046 | h.1.17 | BV | Fibritin | dsDNA |
| 56826 | e.27.1 | BV | Upper collar protein gp10 (connector protein) | dsDNA |
| 161234 | g.91.1 | BV | E7 C-terminal domain–like | dsDNA |
| 140919 | a.263.1 | BV | DNA terminal protein | dsDNA |
| 89064 | a.179.1 | BV | Replisome organizer (g39p helicase loader/inhibitor protein) | dsDNA |
| 160570 | d.368.1 | BV | YonK-like | dsDNA |
| 51327 | b.90.1 | BV | Head binding domain of phage P22 tailspike protein | dsDNA |
| 141658 | b.163.1 | BV | Bacteriophage trimeric proteins domain | dsDNA |
| 64210 | d.186.1 | BV | Head-to-tail joining protein W, gpW | dsDNA |
| 51274 | b.85.2 | BV | Head decoration protein D (gpD, major capsid protein D) | dsDNA |
| 159865 | d.186.2 | BV | XkdW-like | dsDNA |
| 101059 | a.159.3 | BV | B-form DNA mimic Ocr | dsDNA |
| 58091 | h.4.2 | BV | Clostridium neurotoxins, “coiled-coil” domain | dsDNA |
| 47681 | a.49.1 | BV | C-terminal domain of B transposition protein | dsDNA |
| 58059 | h.2.1 | BV | Tetramerization domain of the Mnt repressor | dsDNA |
| 54328 | d.15.5 | BV | Staphylokinase/streptokinase | dsDNA |
| 64465 | d.196.1 | BV | Outer capsid protein sigma 3 | dsRNA |
| 57987 | h.1.4 | BV | Inovirus (filamentous phage) major coat protein | ssDNA |
| 160940 | e.66.1 | BEV | Api92-like | dsDNA |
| 160459 | d.362.1 | BEV | BLRF2-like | dsDNA |
| 109859 | a.214.1 | BEV | NblA-like | dsDNA |
| 54334 | d.15.6 | BEV | Superantigen toxins, C-terminal domain | dsDNA |
| 51225 | b.83.1 | BEV | Fiber shaft of virus attachment proteins | dsDNA, dsRNA |
| 49835 | b.21.1 | BEV | Virus attachment protein globular domain | dsDNA, dsRNA |
| 50203 | b.40.2 | BEV | Bacterial enterotoxins | dsDNA, ssDNA |
| 111474 | h.3.3 | BEV | Coronavirus S2 glycoprotein | dsDNA, plus-ssRNA |
| 56831 | e.28.1 | BEV | Reovirus inner layer core protein p3 | dsRNA |
| 109801 | a.30.5 | AV | Hypothetical protein D-63 | dsDNA |
| 161229 | g.90.1 | ABV | E6 C-terminal domain–like | dsDNA |
| 74748 | a.154.1 | ABV | Variable surface antigen VlsE | dsDNA |
| 143602 | d.321.1 | ABEV | STIV B116-like | dsDNA |
| 58064 | h.3.1 | ABEV | Influenza hemagglutinin (stalk) | dsDNA, minus-ssRNA |

- Table 2 Significantly enriched “biological process” GO terms in (66 + 43) VSFs (FDR < 0.01).
| GO ID | GO term | Z score | P | FDR |
| --- | --- | --- | --- | --- |
| GO:0044415 | Evasion or tolerance of host defenses | 14.56 | 4.01 × 10⁻⁶ | 3.00 × 10⁻⁵ |
| GO:0050690 | Regulation of defense response to virus by virus | 14.56 | 4.01 × 10⁻⁶ | 3.00 × 10⁻⁵ |
| GO:0044068 | Modulation by symbiont of host cellular process | 13.8 | 5.72 × 10⁻⁶ | 3.00 × 10⁻⁵ |
| GO:0052572 | Response to host immune response | 13.14 | 7.86 × 10⁻⁶ | 3.02 × 10⁻⁵ |
| GO:0002832 | Negative regulation of response to biotic stimulus | 12.57 | 1.05 × 10⁻⁵ | 3.02 × 10⁻⁵ |
| GO:0052255 | Modulation by organism of defense response of other organism involved in symbiotic interaction | 12.57 | 1.05 × 10⁻⁵ | 3.02 × 10⁻⁵ |
| GO:0051805 | Evasion or tolerance of immune response of other organism involved in symbiotic interaction | 12.57 | 1.05 × 10⁻⁵ | 3.02 × 10⁻⁵ |
| GO:0019048 | Modulation by virus of host morphology or physiology | 12.06 | 1.36 × 10⁻⁵ | 3.53 × 10⁻⁵ |

- Table 3 FSFs involved in capsid/coat assembly processes in viruses.
FSFs that are completely absent in cellular proteomes are presented in boldface. Several other FSFs also have negligible f values in cells.
| SCOP ID | SCOP css | FSF description | Viral lineage | f value in cells |
| --- | --- | --- | --- | --- |
| 82856 | e.42.1 | L-A virus major coat protein | BTV-like | 0.00025 |
| 56831 | e.28.1 | Reovirus inner layer core protein p3 | BTV-like | 0.00019 |
| 48345 | a.115.1 | A virus capsid protein alpha-helical domain | BTV-like | 0 |
| 56563 | d.183.1 | Major capsid protein gp5 | HK97-like | 0.2352 |
| 103417 | e.48.1 | Major capsid protein VP5 | HK97-like | 0.00006 |
| 88633 | b.121.4 | Positive stranded ssRNA viruses | Picornavirus-like | 0.00364 |
| 88645 | b.121.5 | ssDNA viruses | Picornavirus-like | 0.00099 |
| 88650 | b.121.7 | Satellite viruses | Picornavirus-like | 0 |
| 88648 | b.121.6 | Group I dsDNA viruses | Picornavirus-like | 0 |
| 49749 | b.121.2 | Group II dsDNA viruses VP | PRD1/adenovirus-like | 0.00031 |
| 47353 | a.28.3 | Retrovirus capsid dimerization domain–like | Other/unclassified | 0.00407 |
| 47943 | a.73.1 | Retrovirus capsid protein, N-terminal core domain | Other/unclassified | 0.00123 |
| 47195 | a.24.5 | TMV-like viral coat proteins | Other/unclassified | 0.00099 |
| 57987 | h.1.4 | Inovirus (filamentous phage) major coat protein | Other/unclassified | 0.00068 |
| 51274 | b.85.2 | Head decoration protein D (gpD, major capsid protein D) | Other/unclassified | 0.00049 |
| 64465 | d.196.1 | Outer capsid protein sigma 3 | Other/unclassified | 0.00006 |
| 55405 | d.85.1 | RNA bacteriophage capsid protein | Other/unclassified | 0.00006 |
| 48045 | a.84.1 | Scaffolding protein gpD of bacteriophage procapsid | Other/unclassified | 0 |
| 47852 | a.62.1 | Hepatitis B viral capsid (hbcag) | Other/unclassified | 0 |
| 101257 | a.190.1 | Flavivirus capsid protein C | Other/unclassified | 0 |
| 50176 | b.37.1 | N-terminal domains of the minor coat protein g3p | Other/unclassified | 0 |
| 103068 | d.254.1 | Nucleocapsid protein dimerization domain | Other/unclassified | 0 |

- Table 4 FSFs shared by different viral subgroups.
| SCOP ID | SCOP css | FSF description | Distribution |
| --- | --- | --- | --- |
| 56672 | e.8.1 | DNA/RNA polymerases | dsDNA, dsRNA, dsDNA-RT, ssRNA-RT, minus-ssRNA, plus-ssRNA |
| 52540 | c.37.1 | P-loop containing nucleoside triphosphate hydrolases | dsDNA, dsRNA, ssDNA, plus-ssRNA |
| 53335 | c.66.1 | S-Adenosyl-l-methionine–dependent methyltransferases | dsDNA, dsRNA, ssDNA, minus-ssRNA, plus-ssRNA |
| 53098 | c.55.3 | Ribonuclease H–like | dsDNA, ssRNA-RT, ssDNA, minus-ssRNA |
| 88633 | b.121.4 | Positive stranded ssRNA viruses | dsDNA, dsRNA, minus-ssRNA, plus-ssRNA |
| 57850 | g.44.1 | RING/U-box | dsDNA, minus-ssRNA, plus-ssRNA |
| 51283 | b.85.4 | dUTPase-like | dsDNA, dsDNA-RT, ssRNA-RT |
| 56112 | d.144.1 | Protein kinase–like (PK-like) | dsDNA, dsRNA, ssRNA-RT |
| 54768 | d.50.1 | dsRNA binding domain–like | dsDNA, dsRNA, plus-ssRNA |
| 54001 | d.3.1 | Cysteine proteinases | dsDNA, minus-ssRNA, plus-ssRNA |
| 52266 | c.23.10 | SGNH hydrolase | dsDNA, minus-ssRNA, plus-ssRNA |
| 58100 | h.4.4 | Bacterial hemolysins | dsDNA, dsRNA, ssDNA |
| 49818 | b.19.1 | Viral protein domain | dsRNA, minus-ssRNA, plus-ssRNA |
| 57756 | g.40.1 | Retrovirus zinc finger–like domains | dsDNA, dsDNA-RT, ssRNA-RT |
| 50044 | b.34.2 | SH3 domain | dsDNA, dsRNA, ssRNA-RT |
| 57924 | g.52.1 | Inhibitor of apoptosis (IAP) repeat | dsDNA, plus-ssRNA |
| 50249 | b.40.4 | Nucleic acid binding proteins | dsDNA, ssDNA |
| 53041 | c.53.1 | Resolvase-like | dsDNA, ssDNA |
| 55550 | d.93.1 | SH2 domain | dsDNA, ssRNA-RT |
| 55464 | d.89.1 | Origin of replication binding domain, RBD-like | dsDNA, ssDNA |
| 56399 | d.166.1 | ADP ribosylation | dsDNA, ssDNA |
| 100920 | b.130.1 | Heat shock protein 70 kD (HSP70), peptide binding domain | dsDNA, plus-ssRNA |
| 47413 | a.35.1 | Lambda repressor–like DNA binding domains | dsDNA, ssDNA |
| 69065 | a.149.1 | RNase III domain–like | dsDNA, plus-ssRNA |
| 46785 | a.4.5 | Winged helix DNA binding domain | dsDNA, ssDNA |
| 53448 | c.68.1 | Nucleotide-diphospho-sugar transferases | dsDNA, dsRNA |
| 57997 | h.1.5 | Tropomyosin | dsDNA, dsRNA |
| 54236 | d.15.1 | Ubiquitin-like | dsDNA, ssRNA-RT |
| 47954 | a.74.1 | Cyclin-like | dsDNA, ssRNA-RT |
| 90229 | g.66.1 | CCCH zinc finger | dsDNA, minus-ssRNA |
| 103657 | a.238.1 | BAR/IMD domain–like | dsDNA, ssRNA-RT |
| 53067 | c.55.1 | Actin-like ATPase domain | dsDNA, plus-ssRNA |
| 47794 | a.60.4 | Rad51 N-terminal domain–like | dsDNA, ssDNA |
| 143990 | d.336.1 | YbiA-like | dsDNA, plus-ssRNA |
| 55811 | d.113.1 | Nudix | dsDNA, dsRNA |
| 51197 | b.82.2 | Clavaminate synthase–like | dsDNA, plus-ssRNA |
| 53756 | c.87.1 | UDP-glycosyltransferase/glycogen phosphorylase | dsDNA, dsRNA |
| 81665 | f.33.1 | Calcium ATPase, transmembrane domain M | dsDNA, plus-ssRNA |
| 52949 | c.50.1 | Macro domain–like | dsDNA, plus-ssRNA |
| 53955 | d.2.1 | Lysozyme-like | dsDNA, dsRNA |
| 49899 | b.29.1 | Concanavalin A–like lectins/glucanases | dsDNA, dsRNA |
| 48371 | a.118.1 | ARM repeat | dsDNA, plus-ssRNA |
| 51126 | b.80.1 | Pectin lyase–like | dsDNA, plus-ssRNA |
| 47598 | a.43.1 | Ribbon-helix-helix | dsDNA, ssDNA |
| 50494 | b.47.1 | Trypsin-like serine proteases | dsDNA, plus-ssRNA |
| 55144 | d.61.1 | LigT-like | dsDNA, plus-ssRNA |
| 81296 | b.1.18 | E set domains | dsDNA, plus-ssRNA |
| 161008 | e.76.1 | Viral glycoprotein ectodomain–like | dsDNA, minus-ssRNA |
| 90257 | h.1.26 | Myosin rod fragments | dsDNA, dsRNA |
| 57501 | g.17.1 | Cystine-knot cytokines | dsDNA, ssRNA-RT |
| 54117 | d.9.1 | Interleukin 8–like chemokines | dsDNA, dsRNA |
| 58069 | h.3.2 | Virus ectodomain | ssRNA-RT, minus-ssRNA |
| 50630 | b.50.1 | Acid proteases | dsDNA-RT, ssRNA-RT |
| 47459 | a.38.1 | HLH, helix-loop-helix DNA binding domain | dsDNA, ssRNA-RT |
| 50939 | b.68.1 | Sialidases | dsDNA, minus-ssRNA |
| 55166 | d.65.1 | Hedgehog/DD peptidase | dsDNA, ssDNA |
| 51225 | b.83.1 | Fiber shaft of virus attachment proteins | dsDNA, dsRNA |
| 49835 | b.21.1 | Virus attachment protein globular domain | dsDNA, dsRNA |
| 111474 | h.3.3 | Coronavirus S2 glycoprotein | dsDNA, plus-ssRNA |
| 55658 | d.100.1 | L9 N-domain–like | dsDNA, dsDNA-RT |
| 55895 | d.124.1 | Ribonuclease Rh–like | dsDNA, plus-ssRNA |
| 52972 | c.51.4 | ITPase-like | dsDNA, plus-ssRNA |
| 57959 | h.1.3 | Leucine zipper domain | dsDNA, ssRNA-RT |
| 50203 | b.40.2 | Bacterial enterotoxins | dsDNA, ssDNA |
| 48208 | a.102.1 | Six-hairpin glycosidases | dsDNA, ssDNA |
| 50022 | b.33.1 | ISP domain | dsDNA, ssRNA-RT |
| 58064 | h.3.1 | Influenza hemagglutinin (stalk) | dsDNA, minus-ssRNA |
Supplementary Materials
Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/1/8/e1500527/DC1
Text S1. Phylogenetic assumptions and models.
Fig. S1. FSF use and reuse for proteomes in each viral subgroup and for free-living cellular organisms.
Fig. S2. Distribution of FSFs in each of the seven Venn groups defined in Fig. 3B along the evolutionary timeline (nd).
Fig. S3. Spread of abe core FSFs in viral subgroups.
Fig. S4. Evolutionary relationships within the viral subgroup.
Fig. S5. Evolutionary relationships between cells and viruses.
Table S1. List of viruses sampled in this study.
Table S2. List of cellular organisms sampled in this study.
Table S3. VSFs and their spread in cellular (X) proteomes.
Table S4. FSF use and reuse values for all proteomes.
Table S5. List of FSFs corresponding to each of the seven Venn groups defined in Fig. 3B.
Table S6. FSFs mapped to structure-based viral lineages.
Table S7. Significantly enriched “biological process” GO terms in EV FSFs (FDR < 0.01).
References (132–137)