Research Article | SYSTEMS BIOLOGY

Universal scaling across biochemical networks on Earth

Science Advances  16 Jan 2019:
Vol. 5, no. 1, eaau0149
DOI: 10.1126/sciadv.aau0149
  • Fig. 1 Enzyme diversity of ecosystems across Earth.

    Shown is the geographical distribution of the 5587 ecosystems in our study, colored by the number of different enzyme functional classes [enzyme commission (EC) numbers] encoded in sampled metagenomes (data from JGI). Despite large variation in enzyme diversity and in which enzymes are present in each ecosystem, all sampled ecosystems conform to the same scaling behavior for biochemical diversity and topology as a function of biochemical network size (see Fig. 3).

  • Fig. 2 Three alternative scenarios for how biochemical network structure might be similar or dissimilar across levels of organization.

    For each scenario, illustrative plots show examples of the scaling behavior of some network property as a function of network size, where each data point corresponds to the measure for a single instance of a network. In the first (A), biochemistry does not exhibit common network structure across levels, and different properties emerge at different levels. In the second (B), biochemistry has a common network structure across all levels, but this structure is also shared by random chemical networks. In the final scenario (C), biochemistry has shared structure across all levels, which is different from that of random chemical networks. Our results are consistent with this third scenario. They indicate universal organizing principles that recur across biological levels and are unique to biology (not shared by random chemistry), which we show arise from the network structure of common reactions shared across life on Earth.

  • Fig. 3 Common scaling laws describe biochemical networks across levels of organization.

    Scaling of biochemical measures for individuals (left column) and ecosystems (right column) shares the same functional form for biochemical diversity (enzyme and reaction diversity) and for topological measures. Shown from top to bottom are the following: (A) Number of reactions (NR) and number of enzyme classes (NEC). (B) Average shortest path (<l>) and average clustering coefficient (<C>). All measures are shown as a function of the size of the largest connected component (LCC), NCompounds. Ecosystems include metagenomes (red) and the biosphere-level network (Earth icon). Fits for each dataset (solid lines) are shown with 95% confidence intervals (dashed lines). For reference, shown in light gray are data for all biochemical networks (individuals, ecosystems, and biosphere). Additional measures are shown in fig. S3, and scaling for bipartite networks is shown in fig. S4.
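As an illustration of the topological measures named in Fig. 3B, the following is a minimal sketch of how the LCC size, the average shortest path, and the average clustering coefficient could be computed for an undirected compound network. The adjacency-dictionary representation, the function names, and the toy network are assumptions made for this example; they are not taken from the article's own pipeline.

```python
from collections import deque

def largest_connected_component(adj):
    """Return the node set of the largest connected component (undirected graph)."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

def average_shortest_path(adj, nodes):
    """Mean BFS distance over all ordered pairs of distinct nodes in a connected set."""
    total, pairs = 0, 0
    for s in nodes:
        dist, queue = {s: 0}, deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in nodes and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def average_clustering(adj, nodes):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbors count as 0."""
    coeffs = []
    for u in nodes:
        nbrs = [v for v in adj[u] if v in nodes and v != u]
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        # Count edges that exist between pairs of neighbors of u.
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nbrs[j] in adj[nbrs[i]])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs) if coeffs else 0.0

# Hypothetical toy network: a triangle A-B-C with D attached to C, plus a separate pair E-F.
adj = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C"}, "E": {"F"}, "F": {"E"},
}
lcc = largest_connected_component(adj)
avg_path = average_shortest_path(adj, lcc)
avg_clust = average_clustering(adj, lcc)
print(len(lcc), round(avg_path, 3), round(avg_clust, 3))  # 4 1.333 0.583
```

Restricting both measures to the LCC keeps the average shortest path well defined, since every pair of nodes inside a connected component is reachable.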

  • Fig. 4 Scaling laws distinguish biochemical networks from random networks across levels of organization.

    Shown are random reaction networks created by sampling biochemical reactions from a flat distribution (left column), frequency-sampled random reaction networks created by sampling reactions based on the frequency distribution observed across all organisms (center column), and random genome networks (right column). Merged networks composed of individuals include bacteria only (light blue), archaea only (dark blue), eukarya only (blue-green), and all domains combined (purple). (A) Scaling of biochemical diversity. Diversity measures and fit are as described in Fig. 3. For reference, all real biochemical network data from Fig. 3 are shown in light gray. Additional measures are shown in fig. S5. (B) Scaling of network structure. Measure and fit descriptions match those described in Fig. 3. For reference, all real biological networks from Fig. 3 are shown in light gray. Additional measures are shown in fig. S5, and scaling for bipartite networks is shown in fig. S6. We found that random reaction networks do not recover the same fit functions as real biological networks for assortativity and clustering, whereas frequency-sampled random reaction networks and random genome networks only differed for assortativity, but nonetheless were statistically distinguishable from real biochemical networks for some measures.
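The three kinds of random network compared in Fig. 4 can be sketched roughly as follows. This is only an illustration under stated assumptions: genomes are represented as sets of hypothetical reaction identifiers, frequency sampling is done without replacement, and a random genome network is taken to be the union of randomly chosen genomes. The article's Materials and Methods define the actual procedures.

```python
import random
from collections import Counter

def reaction_frequencies(genomes):
    """Count in how many genomes each reaction appears."""
    counts = Counter()
    for reactions in genomes:
        counts.update(set(reactions))
    return counts

def sample_flat(all_reactions, n, rng):
    """Draw n distinct reactions, each equally likely (flat distribution)."""
    return set(rng.sample(sorted(all_reactions), n))

def sample_by_frequency(counts, n, rng):
    """Draw n distinct reactions, weighted by observed frequency, without replacement."""
    pool = dict(counts)
    chosen = set()
    while len(chosen) < n and pool:
        items = sorted(pool)
        weights = [pool[r] for r in items]
        pick = rng.choices(items, weights=weights, k=1)[0]
        chosen.add(pick)
        del pool[pick]  # remove so the same reaction is not drawn twice
    return chosen

def merge_random_genomes(genomes, k, rng):
    """Union of the reaction sets of k randomly chosen genomes."""
    merged = set()
    for reactions in rng.sample(genomes, k):
        merged |= set(reactions)
    return merged

# Hypothetical toy data: three genomes with made-up reaction identifiers.
genomes = [{"R1", "R2", "R3"}, {"R1", "R2"}, {"R1", "R4"}]
counts = reaction_frequencies(genomes)
rng = random.Random(0)
flat = sample_flat(counts.keys(), 2, rng)
weighted = sample_by_frequency(counts, 2, rng)
merged = merge_random_genomes(genomes, 2, rng)
print(sorted(flat), sorted(weighted), sorted(merged))
```

In this toy data, R1 appears in all three genomes, so frequency sampling tends to include it more often than flat sampling does.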

  • Fig. 5 Scaling laws for individuals and ecosystems are statistically distinguishable for some network and catalytic diversity measures.

    Shown are the results of a permutation test to determine whether properties of biochemical networks constructed from individual genomes scale differently than those constructed from metagenomes (ecosystems). For each network measure, the test statistic is shown as a vertical dashed line, while the null distribution is shown as a solid line (see “Fitting network measure scaling and permutation tests” section in Materials and Methods for more details). Blue squares indicate that the scaling behavior is indistinguishable between levels of organization, while green squares show measures that can distinguish scaling of individuals from that of ecosystems.
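The general shape of such a permutation test can be sketched as below. Both the power-law (log-log linear) fit and the choice of test statistic, the absolute difference in fitted slopes, are assumptions made for illustration; this caption does not specify them, and the statistic actually used is described in the article's Materials and Methods. All data values are hypothetical.

```python
import math
import random

def loglog_slope(points):
    """Least-squares slope of log(y) versus log(x) for (x, y) pairs."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Return (observed statistic, p-value) for a difference in log-log slopes."""
    rng = random.Random(seed)
    observed = abs(loglog_slope(group_a) - loglog_slope(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle group labels by permuting the pooled points.
        rng.shuffle(pooled)
        stat = abs(loglog_slope(pooled[:n_a]) - loglog_slope(pooled[n_a:]))
        if stat >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)

# Hypothetical (network size, measure) pairs with scaling exponents 1.0 and 1.5.
individuals = [(x, 2.0 * x ** 1.0) for x in (10, 20, 40, 80, 160)]
ecosystems = [(x, 2.0 * x ** 1.5) for x in (15, 30, 60, 120, 240)]
observed, p_value = permutation_test(individuals, ecosystems)
print(round(observed, 3), p_value)
```

The observed statistic corresponds to the vertical dashed line in each panel, and the collection of shuffled statistics corresponds to the null distribution.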

  • Fig. 6 The biosphere-level chemical reaction network.

    The biosphere-level network was constructed from the union of all 22,559 genomic networks in our study. Each panel shows the same biosphere-level network, with nodes (representing compounds) in white and edges (representing their connections) in gray. Node size indicates degree within the network. Colors indicate biochemical compounds used in (A) all three domains of life (yellow), (B) archaea only (pink), (C) eukarya only (green), and (D) bacteria only (blue). Although many more chemical compounds are shared across all three domains than are unique to each, the organization of these compounds into biochemical networks was distinct for each domain based on statistical testing, which shows (E) that catalytic diversity and biochemical network topology can predict evolutionary domain. Shown is the estimated prediction accuracy (y axes) for each measure and each domain. The colors of each bar indicate prediction accuracy of a given measure for a particular domain: Red is comparable to random guessing (y ≤ 33% accuracy); yellow is better than random but not completely predictive (33% < y ≤ 67%); green is predictive of domain (67% < y). The horizontal line indicates 80% prediction accuracy.

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/1/eaau0149/DC1

    Section S1. Network representations of catalyzed biochemical reaction

    Section S2. Topological measures

    Fig. S1. Percentage of nodes in the LCC of a network versus the size of its LCC.

    Fig. S2. Reaction knockout for unipartite networks.

    Fig. S3. Additional network measures for individuals and ecosystems show universal scaling across levels.

    Fig. S4. Scaling of bipartite network structure for individuals and ecosystems.

    Fig. S5. Additional network measures for randomly sampled individuals and randomly sampled reactions.

    Fig. S6. Scaling of bipartite network structure for randomly sampled individuals and randomly sampled reactions.

    Fig. S7. Distributions of network sizes for each domain and across levels of organization.

    Fig. S8. Biochemical diversity and network topology measures for parsed datasets.

    Fig. S9. Biochemical diversity and network topology measures for domain-weighted frequency-sampled random reaction networks.

    Table S1. Percentage of networks in each dataset with x% of nodes in the LCC.

    Table S2. Distinguishability of individuals and ecosystems, and ecosystems and random genome networks.

    Data file S1. Scaling parameters for topological measures with 95% confidence intervals.

    Data file S2A. Summary of measured network properties, by domain.

    Data file S2B. Summary of measured network properties, by levels (parsed data only).

    Data file S2C. Summary of measured network properties, by levels (parsed data excluded).

    References (73–75)

    The PDF file also includes legends for data files S1 to S2C.

    Other Supplementary Material for this manuscript includes the following:

    • Data file S1 (.csv format). Scaling parameters for topological measures with 95% confidence intervals.
    • Data file S2A (.csv format). Summary of measured network properties, by domain.
    • Data file S2B (.csv format). Summary of measured network properties, by levels (parsed data only).
    • Data file S2C (.csv format). Summary of measured network properties, by levels (parsed data excluded).
