Multiple integrated metabolic strategies allow foraminiferan protists to thrive in anoxic marine sediments

See allHide authors and affiliations

Science Advances  26 May 2021:
Vol. 7, no. 22, eabf1586
DOI: 10.1126/sciadv.abf1586


Oceanic deoxygenation is increasingly affecting marine ecosystems; many taxa will be severely challenged, yet certain nominally aerobic foraminifera (rhizarian protists) thrive in oxygen-depleted to anoxic, sometimes sulfidic, sediments uninhabitable to most eukaryotes. Gene expression analyses of foraminifera common to severely hypoxic or anoxic sediments identified metabolic strategies used by this abundant taxon. In field-collected and laboratory-incubated samples, foraminifera expressed denitrification genes regardless of oxygen regime with a putative nitric oxide dismutase, a characteristic enzyme of oxygenic denitrification. A pyruvate:ferredoxin oxidoreductase was highly expressed, indicating the capability for anaerobic energy generation during exposure to hypoxia and anoxia. Near-complete expression of a diatom’s plastid genome in one foraminiferal species suggests kleptoplasty or sequestration of functional plastids, conferring a metabolic advantage despite the host living far below the euphotic zone. Through a unique integration of functions largely unrecognized among “typical” eukaryotes, benthic foraminifera represent winning microeukaryotes in the face of ongoing oceanic deoxygenation.


The extent and intensity of oxygen-depleted waters are expanding in the oceans (1, 2). The biological and ecological impacts of this ongoing “deoxygenation” are not fully realized, but it is clear that organisms in affected habitats will need to adapt to these ongoing environmental changes or face extirpation or even extinction. The stress and loss of taxa in newly hypoxic or anoxic zones will alter and compromise the efficiency of local food webs and could have marked impacts on the carbon cycle and other intimately linked cycles (e.g., nitrogen and metals). At this time, it is a prerequisite to decipher biological mechanisms underlying the resilience and adaptation to deoxygenation to predict the biological responses to the ongoing oxygen loss and the ensuing impacts on oceanic biogeochemical cycling, which are paramount to predictive global climate modeling.

Foraminifera, among the most abundant and diverse protists (Rhizaria), are essential organisms within the ocean and its sediments (3), ranging from high latitude to tropical regions. Further, foraminifera can comprise most of the eukaryotic biomass in deep-sea settings. Certain species of benthic foraminifera are highly abundant in oxygen-depleted, sulfidic sediments where they can be ~2 orders of magnitude more abundant than foraminiferal assemblages of oxygenated sediments (4, 5). Some benthic foraminifers store and respire nitrate (NO3) and reportedly perform complete denitrification (6). Estimates suggest that these foraminifera account for up to 70% of total denitrification in some sediments (6, 7), making these microbial eukaryotes important players in anoxic settings and in marine biogeochemical cycling of nitrogen. Furthermore, experimental results showed that environmental hydrogen peroxide increased adenosine triphosphate concentration in some benthic foraminifera from oxygen-depleted sediments (8). Symbioses are widely reported in foraminifera inhabiting oxygen depleted to anoxic sediments, or the redoxcline, some hosting symbiotic bacteria (4, 9, 10), and others sequestering diatom chloroplasts (kleptoplasts) (1113). Furthermore, an unidentified organelle-like structure, the so-called electron opaque bodies common to most benthic foraminifera (14), can assimilate nitrate, ammonium, and, perhaps, sulfate (15, 16), thereby making foraminifera cogs in carbon (C), nitrogen (N), and sulfur (S) cycling.

Foraminifera have large repetitive genomes with hundreds of megabases, complicating attempts to produce quality genome assemblies and reliable metabolic predictions (17, 18). Key genes in the denitrification pathway were identified in foraminiferal congeners Globobulimina turgida and Globobulimina auriculata (19) and recently from metatranscriptome analyses of bulk sediment containing foraminifera (20). The phylogeny of foraminiferal denitrification genes appears to differ from the known denitrification genes in fungi, the only other eukaryote known to perform denitrification (21). However, with few draft genomes and transcriptomes, our knowledge on the potential metabolic capacity of foraminifera living in anoxic and hypoxic sediments remains largely uncharted territory.

We investigated the metabolic adaptations of two highly successful benthic foraminiferal species inhabiting hypoxic or anoxic-sulfidic sediments of the Santa Barbara Basin (SBB) (CA, USA). To identify their respective metabolic adaptations, we sequenced libraries of host-enriched mRNA (transcriptomes) and libraries of the host plus any prokaryotic associates (holobiont; metatranscriptomes) from field-collected specimens and after laboratory incubations under different geochemical conditions. The kleptoplastidic Nonionella stella Cushman & Moyer, 1930 typically occurs in anoxic and sulfidic sediments, with abundances of >200 individuals cm−3 of sediment (fig. S1A) (5). Bolivina argentea Cushman, 1926 (fig. S1B) typically inhabits hypoxic sediments with abundances of >35 individuals cm−3 of sediment (5), lacks endobionts, and has chloroplasts that are usually digested instead of sequestered (7).

Our transcriptome analyses of both foraminiferal species identified previously unknown candidate genes involved in anaerobic energy metabolism and ammonium (NH4+) assimilation or recycling. In addition, a comprehensive analysis of denitrification genes expressed in our samples provided phylogenetic inference of a putative oxygenic denitrification pathway. Further, a functional N. stella kleptoplasty was inferred by the expression of genes related to a complete diatom plastid genome, although hosts live far below the euphotic zone. This metabolic variety suggests that at least some species of this diverse protistan group will withstand severe deoxygenation and likely play major roles in oceans affected by climate change.


All the selected specimens for RNA isolation of both N. stella and B. argentea had distinctively colored cytoplasm, which is often used to establish foraminiferal viability (13, 15), within all but the final one to two test chambers at the end of each 3-day treatment. Treatments consisted of different oxygen regimes (oxic, hypoxic, anoxic, and euxinic) with or without amendment of nitrate (NO3) and hydrogen peroxide (H2O2). Concentrations of these amendments were low to undetectable at the end of incubations (table S1).

For each species, transcriptomes and metatranscriptomes were obtained and compared among different replicates from all treatments (table S2); however, the N. stella from the hypoxic treatment with both amendments had only one replicate because the RNA yield from other replicates was too low for library preparation. In both field-collected and experiment specimens, N. stella was typically encased in a “cocoon” of inorganic debris; this cocoon was much more resilient (i.e., difficult to remove) in specimens from the aerated treatment. Cocoons in specimens from other treatments were easily removed via gentle sieving and/or via a very fine sable brush. In general, the cytoplasm of N. stella from aerated treatments appeared lighter in color compared to the typical dark greenish brown of conspecifics from hypoxic, anoxic, and euxinic treatments. Generally, cytoplasm of B. argentea was lighter, tending toward yellowish-green, than that of N. stella (fig. S1, A and B).

Kleptoplast identity and functionality inferred from chloroplast-encoded and nuclear genes

To evaluate the integrity, identity, and potential function of kleptoplasts in N. stella (from a 567-m water depth), we measured its transcriptional level of chloroplast-encoded genes. Data are presented in a comparative context to B. argentea, which is known to digest chloroplasts (7). By mapping short reads from our metatranscriptomes (i.e., holobiont) to reference chloroplast genomes from the diatoms Skeletonema pseudocostatum, Odontella sinensis, Chaetoceros simplex, and Thalassiosira oceanica, results demonstrated that the kleptoplasts in N. stella are closely related to diatom plastids, with the plastid genome from S. pseudocostatum recruiting the most reads. These diatom plastid genomes were nearly completely covered by transcripts in N. stella, while only sparse, fragmented mapping was observed from transcripts of B. argentea (Fig. 1A). To further validate the expression of plastid-encoded genes, we investigated each gene’s detection, defined as the fraction of a gene’s nucleotides receiving any coverage from transcriptome reads. A total of 128 of the 139 (92%) S. pseudocostatum plastid genes had ≥50% of their length detected in at least one N. stella metatranscriptome, while 98 of 139 (71%) had ≥80% of their length detected. Essential plastid-encoded genes like ribulose bisphosphate carboxylase large and small chains (rbcL and rbcS), photosystem proteins I and II, and ribosomal genes were among those that received more than 50% detection in N. stella metatranscriptomes (Fig. 1A).

Fig. 1 Expression of plastids and plastid-related genes from N. stella and B. argentea metatranscriptomes.

(A) Mapping of the N. stella and B. argentea metatranscripts onto the chloroplast genomes of four diatoms: C. simplex, O. sinensis, T. oceanica, and S. pseudocostatum. Rings in dark purple and dark blue show the maximum detection of each gene in any sample of N. stella and B. argentea, respectively. Rings in pink and lighter blue indicate the mean of each gene’s log10-transformed coverage across all N. stella and B. argentea samples. The outermost ring displays gene category for select genes: RP, ribosomal proteins (gray); PS, photosystem proteins (orange); RuBisCo (green). (B) Expression of transcripts for plastid-encoded genes, rbcL and rbcS, and nuclear-encoded GAPDH, PGK, and FCP in N. stella and B. argentea, across the different oxygen treatments. Circle size is proportional to the abundance of the genes and displayed in counts per million (CPM) (log2). In each circle, the pie chart colors divide each gene’s total coverage according to the transcripts’ high-level taxonomic categorization.

High expression of the rbcL and rbcS genes in N. stella kleptoplasts were observed on the basis of the metatranscriptome mapping to de novo assembled genes; in contrast, low or no expression was observed in B. argentea (Fig. 1B). Such differences were not due to intersample variations, as rbcL and rbcS were consistently expressed in N. stella at levels similar to the central carbon metabolism genes such as glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and phosphoglycerate kinase (PGK), while they were undetectable or substantially less expressed than central metabolism genes in B. argentea. Key genes that function in both central carbon metabolism and the Calvin-Benson-Bassham cycle (GAPDH and PGK) were highly expressed in all transcriptomes. Among these transcripts, a small but detectable fraction of the PGK transcripts from N. stella samples were taxonomically assigned to diatoms (Fig. 1B). Furthermore, the fucoxanthin-chlorophyll binding protein (FCP), a plastid-localized chlorophyll-carotenoid protein encoded by the nuclear genome in diatoms, was expressed in metatranscriptomes from N. stella but not in B. argentea (Fig. 1B).

Assimilation of ammonium to form amino acids

Host-enriched transcriptome data revealed low or undetectable expression of eukaryotic assimilatory nitrate reductase proteins in all samples from N. stella and B. argentea, respectively. However, we found that the species use different metabolic pathways to incorporate ammonium into organic nitrogen in the form of glutamate. N. stella expresses the ammonium assimilation proteins glutamine synthetase (GS) and glutamate 2-oxo-glutarate aminotransferase (GOGAT), also known as glutamate synthase, a known route to assimilate ammonium in photosynthetic eukaryotes and fungi (Fig. 2A) (22, 23). We identified several forms of eukaryotic GS genes in both species, with the most abundant GS clusters having their best match (66 to 75% amino acid identity) to Oomycetes. N. stella also expressed the NADH (nicotinamide adenine dinucleotide phosphate)–dependent GOGAT (NADH-GOGAT), which catalyzes the reaction of glutamine with 2-oxoglutarate to form two molecules of glutamate: One serves as a substrate in the GS reaction, while the other is available for storage or metabolism (24). Phylogenetic analysis including both NADH-GOGAT and Fd-GOGAT (ferredoxin-dependent GOGAT) from diatoms, plants, fungi, and bacteria indicates that N. stella NADH-GOGAT forms an independent clade more closely related to Oomycetes than to diatoms (fig. S2). No other foraminiferan GOGAT was found in our National Center for Biotechnology Information (NCBI) database search.

Fig. 2 Key metabolic pathways and their expression levels in N. stella and B. argentea.

(A) Schematic representation of metabolic pathways in the foraminifera N. stella and B. argentea based on gene expression data from the host-enriched transcriptome. Identified proteins are displayed in grey boxes. N. stella has two proteins associated with ammonium assimilation: GS, glutamine synthetase and NADH-GOGAT, NADH-dependent glutamate synthase. B. argentea has glutamine dehydrogenase (GDH), which can be used for ammonium recycling or assimilation to glutamate. Enzymes in the denitrification pathway are NRT2; NNP, nitrate/nitrite porter; pNR; NirK, nitrite reductase; Nod, nitric oxide dismutase; NorZ, nitric oxide reductase. The identified proteins in H2-producing anaerobic energy metabolism are PFOR and Fe-hydrogenase, [FeFe]-hydrogenase. (B) Heatmap showing expression of select transcript clusters associated with functions illustrated in (A). Each cell’s color shows the average log2-tranformed CPM of each cluster (columns) in the different treatments (rows). “Amended” samples were treatments resupplied with nitrate and/or peroxide to mimic in situ conditions (see Materials and Methods); “no amendment” samples lacked replenishment.

Expression analysis of N. stella transcriptomes shows that both GS and GOGAT were expressed in all samples, including preserved shipboard samples and those incubated under different geochemical conditions, varying oxygen regimes (aerated, hypoxic, anoxic, and euxinic), and supplied with or without nitrate and hydrogen peroxide (Fig. 2B). Neither gene was significantly differentially expressed on the basis of amendment (i.e., whether nitrate and H2O2 were resupplied or not), but GS had a statistically significant differential expression by oxygen condition [adjusted (adj.) P < 0.05]. However, the effect was small in terms of log2 fold change (LFC), being within 0.25 LFC of the baseline shipboard samples (Table 1).

Table 1 N. stella and B. argentea differential expression of genes under different oxygen regimes.

The LFC and contrast measured by the adjusted P-value column is relative to the average expression measured in shipboard samples.

View this table:

In B. argentea, the expression of NADH-GOGAT was remarkably lower compared to N. stella, indicating that the GS-GOGAT pathway is unlikely operating in B. argentea (Fig. 2B). In contrast, glutamate dehydrogenase (GDH), an essential gene in glutamate metabolism (25), was consistently expressed at high levels in all B. argentea transcriptomes and metatranscriptomes compared to N. stella (Fig. 2, A and B).


In both N. stella and B. argentea transcriptomes, homologs of the nitrate transporter (NRT2), putative nitrate reductase (pNR), nitrite reductase (NirK) and nitric oxide reductase (Nor) were identified and abundantly expressed across experimental conditions and shipboard samples (Fig. 2B). Multiple nitrate transporter genes were highly expressed in both species and across all treatments, including shipboard-isolated samples. Phylogenetic analyses revealed that both species expressed at least two distinct forms of nitrate transporter (fig. S3). The first is a homolog to the eukaryotic nitrate transporter NRT2 (22). Phylogenetic analysis of our foraminiferal species and Globobulimina spp. (19) cluster together and branch within a clade including NRT2 sequences of other photosynthetic eukaryotes (fig. S3).

A second detected nitrate transporter was identified as homologous to NarK/NarU, a nitrate/nitrite transporter belonging to the nitrate/nitrite porter (NNP) family commonly found in denitrifying bacteria and archaea but few eukaryotes (26). Phylogenetic analysis placed the NNP protein sequences from both our species and Globobulimina together in one group, closely related to archaea (fig. S3).

Expression analysis showed that both NNP and NRT2 expression were consistently expressed in N. stella and B. argentea across all treatments (P > 0.05; Fig. 2B and Table 1), suggesting that NNP is likely to play a major role for transmembrane nitrate/nitrite transport for anaerobic respiration (i.e., denitrification).

We detected the expression of a gene annotated as sulfite oxidase (SO), which encodes a molybdenum cofactor binding protein. A homolog of this gene has been identified in Globobulimina spp. by Woehle et al. (19) as a pNR. In our species and Globobulimina spp., however, the SO gene lacks the flavin adenine dinucleotide (FAD)/NAD(P)H binding domains, an essential electron carrier domain for the reduction of nitrate to nitrite; thus, its role in nitrate reduction is unsupported. Alternatively, we suggest that intracellular nitrate may be reduced to nitrite by a previously unidentified pNR. Expression of this pNR was detected in both of our species; it is an SO-related enzyme containing all five expected nitrate reductase domains: the cytochrome b5-like heme domain, a central molybdenum cofactor binding domain called MOSC (molybdenum cofactor sulfurase) fused with an iron-sulfur cluster, the FAD and NADH binding oxidoreductase domains (Fig. 3A). However, this previously unidentified pNR transcript encodes subdomains in an atypical order relative to other eukaryotic nitrate reductases (Fig. 3B). This atypical organization is unlikely to be an assembly error, as the pNR was present in a single node from at least one RNAspades assembly (fig. S4; data S1). This pNR sequence was expressed in all samples (Fig. 2B) but was slightly down-regulated in all experimental treatments relative to shipboard-preserved samples (adj. P < 0.01, LFC < 1; Table 1).

Fig. 3 Phylogenetic and functional domain analysis of proteins involved in denitrification in N. stella and B. argentea.

(A) The domain structure of N. stella and B. argentea putative nitrate reductase protein as identified by Pfam domain search, with Cyt-b5, cytochrome b5-like heme/steroid binding domain (green) present in the N terminus followed by MOSC N-terminal beta barrel domain (blue), central MOSC domain, molybdenum cofactor binding domain (yellow), and then FAD and nicotinamide adenine dinucleotide (NAD) oxidoreductase binding domains in C terminus (purple and red, respectively). (B) The domain structure of canonical eukaryotic nitrate reductase protein, with Oxidored_molyb being the oxidoreductase molybdopterin binding domain (orange); Mo-co_dimer, Mo-co oxidoreductase dimerization (light blue); central Cyt-b5 (green); and then FAD and NAD oxidoreductase binding domains in C terminus (purple and red, respectively). (C) Maximum likelihood tree showing the placement of putative foraminiferal Nod-like sequences in the context of other Nor–related functions. Leaves corresponding to the putative foraminiferal sequences are bolded. Branch numbers show bootstrap support above 50. The tree is rooted with bacterial NorB. Scale bar, 0.1% sequence divergence. (D) A multiple sequence alignment showing the quinol binding sites and catalytic sites of NorZ, Nod, and the putative foraminiferal Nod-like proteins. Positions critical for the function of NorZ are highlighted in blue, and positions of conserved mutations from the canonical NorZ function are highlighted in red.

Dissimilatory NirKs, sharing 80% amino acid sequence identity to Globobulimina spp. NirK, were identified in both N. stella and B. argentea. Phylogenetic analysis showed that all foraminiferal species’ NirK transcripts clustered together in one distinct clade distant from other eukaryotes (i.e., fungi; fig. S5). There was no significant difference in NirK expression among our treatments for both species (adj. P > 0.05; Fig. 2B and Table 1).

We identified five Nor homologs, four with abundant mapping of raw reads from either N. stella or B. argentea and one lacking obvious mapping. Phylogenetic analysis showed that the Nor from both species clustered with a putative Nor in Globobulimina spp. (19), representing a cluster distinct from the conventional cytochrome c–dependent nitric oxide reductase (NorB; also called the NorB subunit of NorBC) or quinol-dependent nitric oxide reductase (qNor; also called NorZ) (Fig. 3C). The closest homolog to the foraminiferal Nor included experimentally verified genes that perform nitric oxide dismutation with nitric oxide dismutase (Nod) (27), transforming nitric oxide to both oxygen and dinitrogen, bypassing the intermediary N2O. Further, alignment of foraminiferal Nor proteins to known NorB, NorZ, and Nod references revealed that the foraminiferal Nor lacked a quinol binding site but shared conserved mutations in the catalytic site with a subclade of Nod and Nod-like proteins (Fig. 3D). Therefore, the foraminiferal Nor proteins likely function as Nod, although biochemical analyses are necessary for verification. In contrast, all our putative nitrous oxide reductase (Nos) proteins, responsible for the final step of denitrification (i.e., converting N2O to N2), were close homologs with proteins of bacterial origin (19, 20), suggesting either that Nos was acquired recently from bacteria through horizontal gene transfer or that it is not expressed by the foraminifera but by foraminiferal-associated bacteria.

Anaerobic energy metabolism

N. stella and B. argentea express characteristic proteins for anaerobic energy metabolism including pyruvate-ferredoxin oxidoreductase (PFOR) and [FeFe]-hydrogenase (2830) (Fig. 2, A and B). Domain-level analysis revealed that the N. stella PFOR is fused with a flavodoxin plus FAD and nicotinamide adenine dinucleotide (NAD) binding domains at the C terminus (Fig. 4A), a characteristic feature of PFOR in many anaerobic and facultative anaerobic protists (31). Phylogenetic analysis confirms that our PFOR sequences from both species are eukaryotic and closely related to that of the facultatively anaerobic marine polychaete Capitella teleta and the anaerobic protistan parasite Blastocystis (Fig. 4B).

Fig. 4 Phylogenetic and functional domain analysis of the anaerobic metabolic protein pyruvate ferredoxin/flavodoxin oxidoreductase.

(A) The domain structure of N. stella and B. argentea PFOR by Pfam domain search, with PFOR-N, Pyruvate ferredoxin/flavodoxin oxidoreductase thiamine diP-bdg (green); PFOR, pyruvate ferredoxin/flavodoxin oxidoreductase core II (blue); 4Fe-4S, an NADH hydrogenase (purple); TTP, thiamine pyrophosphate enzyme, C-terminal (dark red); Flavodoxin (yellow); and then FAD and NAD oxidoreductase binding domains in C terminus (orange and red, respectively). (B) Phylogenetic tree of eukaryotic and bacterial PFOR protein inferred from maximum likelihood analysis. The tree illustrates how PFOR protein sequences from N. stella and B. argentea are included in the eukaryotic clade and closely related to Blastocystis (protist) and C. teleta (metazoan). The numbers along the branches represent bootstrap values. Nodes with statistical support of >50 are shown. Tree rooted with eukaryotic mitochondrial pyruvate dehydrogenase. Scale bar, 0.8% sequence divergence.

The phylogenetic analysis of foraminiferal [FeFe]-hydrogenases showed that they branch with [FeFe]-hydrogenases from other protists such as Acanthamoeba castellanii and unclassified Thermotogales bacteria (fig. S6) but distantly related from Narf-like protein, a non–hydrogen-producing gene (32). Furthermore, analyses of functional domains within [FeFe]-hydrogenase genes showed that our foraminiferal hydrogenase architecture is most similar to the type M3 [FeFe]-hydrogenase with N-terminal NADH-quinone oxidoreductase subunit G (NuoG) consisting of FeS clusters linked to a central H-cluster catalytic domain and a C-terminal iron hydrogenase small subunit (SSU) domain (fig. S7) (32, 33). The central H-cluster domain is essential for H+ binding and is the critical distinction between known [FeFe]-hydrogenases and Narf-like proteins (33, 34). The amino acid sequence alignment of our [FeFe]-hydrogenase revealed conserved H-cluster binding motifs, confirming that these sequences appear to be bona fide [FeFe]-hydrogenases (fig. S8).

In the host dataset, PFOR and [FeFe]-hydrogenase expression was detectable in both foraminifera in all samples (Fig. 2B). Statistically significant differential expressions were seen for N. stella across oxygen regime but below an LFC of 1 (adj. P < 0.05; Table 1). N. stella expression was the highest in shipboard samples and euxinic treatments, perhaps due to its similarity to the in situ environmental milieu. On the other hand, these genes were not differentially expressed across oxygen conditions for B. argentea (adj. P > 0.05).


Both species of benthic foraminifera were resilient to all experiment regimes, spanning from well-aerated to euxinic conditions. While our experiment treatments lasted only 3 days, the prevalence of these species in their respective collection sites demonstrates their ability to thrive in redoxcline sediments. Given that the foraminiferal community, including these two species, dominates the biovolume of the SBB eukaryotic benthic ecosystem (4), it is likely that this protistan taxon will play a major role in response to expansion of low oxygen to anoxic benthic habitats.

Our metatranscriptome analysis revealed that all plastid-encoded transcripts in N. stella were derived from diatoms (Fig. 1A), likely belonging to the genus Skeletonema, a result consistent with Grzymski et al.’s (12) findings that were based on plastid 16S ribosomal RNA gene sequences. The consistently high expression of some key plastid-encoded genes like rbcL, rbcS, and photosystem proteins in N. stella confirms plastid integrity (Fig. 1B). While the kleptoplasts could perhaps assimilate inorganic carbon in situ, our specimens were preserved after brief exposure to light, so transcription of light-regulated genes may be a methodological artifact. Previous isotopic labeling analyses to demonstrate photosynthetic activity of foraminiferal kleptoplasts were inconclusive: A close relative to N. stella also collected from an aphotic habitat lacked photosynthetic activity and exhibited light-induced kleptoplast degradation (13); another study demonstrated detectable photosynthetic activity (11). Our results support transcriptional viability, suggesting the importance of kleptoplasty to the trophic strategy of SBB N. stella, which typically lacks food vacuoles with evidence of digestion products (fig. S1) and thrives at water depths exceeding 560 m, far below the euphotic zone (5).

Horizontal gene transfer may also play a role in kleptoplasty. The consistent expression of diatom-related genes encoding FCP, across all our replicates and treatments for N. stella, but not B. argentea, strongly suggests that FCP in N. stella is not an artifact or contaminant (Fig. 1B). The activity of fucoxanthin was previously reported in SBB N. stella using a different method (12); it was hypothesized then that plastid proteins and fucoxanthin remain stable for over 1 year due to slow turnover rates in anoxic environments and extremely low irradiance at >550-m depth (12, 35). As relatively few additional diatom genes were detected at similar breadth and depth of coverage, it is unlikely that the presence of undigested but active diatom nuclei could explain the presence of FCP.

Therefore, our observations point toward a horizontal transfer of the FCP gene from the nucleus of an engulfed diatom to the N. stella nucleus. Horizontal gene transfer of plastid-related genes from the nucleus of the photosynthetic eukaryote to the nucleus of the secondary eukaryotic host (eukaryote-eukaryote symbiosis) is widespread throughout diverse lineages of eukaryotes, ultimately leading to secondary and tertiary endosymbiosis events (36). Hence, there is a mounting body of evidence for kleptoplasty as a major evolutionary step toward the establishment of permanent algal endosymbionts (37). However, examples of kleptoplasty that support the evolutionary transition toward permanent endosymbiosis have been documented in species restricted to the photic zone such as in dinoflagellates that retain haptophyte chloroplasts and the symbiosis between the sea slug Elysia chlorotica and Vaucheria litorea plastids (3840). The nuclear-encoded FCP in N. stella represents new evidence supporting this hypothesis beyond the photic zone. In summary, retaining chloroplasts in a functional state and the detection of nuclear-encoded plastid-directed genes of diatom origin in N. stella further support the hypothesis that N. stella actively uses kleptoplasts to survive in the euxinic SBB seafloor, far below the euphotic zone (12).

Analyses of transcriptomes indicate that both species likely assimilate ammonium through different pathways and that neither performs nitrate assimilation (Fig. 2, A and B). These findings contradict previous expectations alluding to kleptoplasts as nitrate and ammonium assimilatory organelles in kleptoplastidic foraminifera (12, 13). Transmission electron microscopy–nanoscale secondary ion mass spectrometry (TEM-nanoSIMS) isotopic studies of other foraminiferal species demonstrated that the assimilation of both nitrate and ammonium localize to “electron-opaque bodies” rather than kleptoplasts (13, 16). In N. stella, ammonium assimilation is likely via the GS-GOGAT pathway, a route for assimilating ammonium produced from nitrate reduction in photosynthetic eukaryotes and fungi (22, 23). Our phylogenetic analyses illustrate that N. stella GOGAT belongs to NADH-GOGAT (22, 41) (fig. S2). NADH-GOGAT can also be localized to chloroplasts as in some plants (24), while in diatoms and fungi, it is localized to the mitochondria (42, 43). Thus, we expect the GS-GOGAT pathway in N. stella to occur outside kleptoplasts. The activity of GS-GOGAT is driven by the high affinity of GS for ammonium and glutamate (41, 44). As noted, because N. stella vacuoles are typically “empty” rather than filled with food particles (fig. S1, C and D), reassimilation of ammonium or urea could play a major role in sustaining N. stella’s overall nitrogen demand. In contrast, B. argentea had low-to-absent expression of GS-GOGAT but high expression of mitochondrial GDH (Fig. 2B). GDH has low affinity for ammonium and is involved with regulating the flux between the Krebs cycle and nitrogen metabolism, rather than ammonium assimilation (43). Possible roles of GDH in ammonium assimilation in heterotrophic nonsymbiotic species similar to B. argentea were predicted in a recent study (45).

The consistent expression of genes related to anaerobic respiration and energy metabolism (denitrification and pyruvate oxidation via PFOR) suggests that they are essential for sustaining the cellular activity of our two foraminiferal species regardless of oxygen concentration. Even foraminifera incubated under aerated (i.e., oxygenated) conditions expressed most of these genes as highly as they did under anoxic incubations. Our findings are in line with those of Orsi et al. (20), which demonstrated a significant increase in expression of genes involved in protein synthesis, intracellular protein trafficking, and modification of the cytoskeleton in anoxic sediments bearing foraminifera, whereas genes assigned to energy production and metabolism were consistently transcribed across different oxygen levels. Our analysis confirms that these specific pathways are consistently expressed, bolstering support for the hypothesis of their constitutive expression.

Biogeochemical studies have shown that some benthic foraminifera can perform complete denitrification (6, 7, 19, 20). Although nitrate was depleted at the end of our incubations (table S1), it is established that benthic foraminifera store nitrate and that the nitrate turnover rate is on the order of a few days (46). Denitrification in foraminifera can be mediated by endosymbiotic bacteria (9) or performed by the host (7, 19). We show that both of our foraminifera express two key denitrification proteins, NirK and Nor, and two types of nitrate/nitrite transporters (Fig. 2). Our phylogenetic and domain structure analyses showed that the denitrification proteins from our foraminifera and Globobulimina spp. (19) group together, forming a lineage distinct from bacteria, archaea, and fungi (Fig. 3C and figs. S3 and S5). This grouping suggests that a long-diverged denitrification pathway may be common to redoxcline foraminifera, with denitrification genes likely acquired from ancestral bacteria and/or archaea early in foraminiferal evolution and independently from other eukaryotes (19).

Extensive searches of our data revealed that neither species expressed the typical eukaryotic NR. The MOSC-containing protein in foraminifera differs from the previously described eukaryotic NRs (4749), yet it has all the essential domains for nitrate reduction, with domain composition similar to the NADPH nitrate reductase but with different organization (Fig. 3A). In addition, gene expression of the MOSC-containing protein was comparable to other denitrification genes. Thus, here, we describe a previously unidentified pNR homolog in foraminifera and provide its complete sequence (data S1). Future biochemical studies will be required to determine its specific role.

We detected multiple copies of the Nor gene in both species, with an expression level 10-fold higher than the expression of NirK. The high expression of Nor suggests biological relevance to these foraminifera. Phylogenetic and domain structure analysis showed that foraminiferal Nor are more closely related to Nod than to qNor and NorB (Fig. 3, C and D) (50), favoring its potential role in oxygenic denitrification. However, despite the relative similarity between our putative foraminiferal Nor and known Nod genes, there remains substantial sequence divergence. This sequence divergence—combined with overall uncertainty in the relationships between the qNor, Nor-related, and Nod genes—necessitates future dedicated studies. In particular, isotope labeling studies are needed to identify and localize the end products, whether NO is reduced to N2O, a potent greenhouse gas, or directly to N2 and O2. As previously reported in bacterial methanotrophs (27, 50), being committed to oxygenic denitrification is preferable to avoid the detrimental effects of switching electron acceptors if the respiratory chain were not simultaneously activated (51, 52). On the other hand, acetylene inhibition experiments in foraminifera argue against oxygenic denitrification (53) because acetylene blocks the conversion of N2O to N2, revealing N2O as a common intermediate within foraminiferal cells, indicative of Nor-type denitrification. Together, the genetic basis of dissimilatory nitrate reduction is highly unusual in N. stella and B. argentea, as exemplified by a domain-level rearrangement in a pNR and divergent NirK sequences.

Eukaryotes that are adapted to prolonged periods of anoxia require energy through anaerobic energy metabolism. The expression of anaerobic fumarate reduction and ethanol-producing fermentation pathways, for example, has recently been reported by Orsi et al. (20). Besides concurring with these observations, we identified a previously unrecognized anaerobic energy generation pathway in N. stella and B. argentea that involves PFOR and [FeFe]-hydrogenases (Figs. 2 and 4 and fig. S6). This metabolism has been described in eukaryotes that have hydrogenosomes and mitosomes or in anaerobic H2-producing mitochondria, currently referred to as mitochondrion-related organelles (MROs) (54). The facultative anaerobic energy metabolism and hydrogen production via PFOR and [FeFe]-hydrogenase were also associated with chloroplasts in diatoms and green algae when exposed to anoxia, a process deemed “dark fermentation” (30). Such fermentation may be assumed to occur in N. stella kleptoplasts. The fact that B. argentea performs the same fermentation but does not host intact chloroplasts suggests that the metabolism is not localized in or supported by kleptoplasts.

In both N. stella and B. argentea, the predicted domain structure of the [FeFe]-hydrogenase is most similar to those of Stygiella incarcerate (Jakobid flagellate), Naegleria gruberi (amoeba/flagellate), and Acanthamoeba, belonging to the M3-type [FeFe]-hydrogenase, which is markedly distinct from the trimeric [FeFe]-hydrogenase in ciliates. The latter has two additional FeS clusters at the C terminus, a single thioredoxin-like [2Fe2S] and [4Fe4S]-cluster motif corresponding to NuoF (fig. S7). The N. gruberi [FeFe]-hydrogenase has been experimentally shown to be active, to produce molecular hydrogen even when grown under aerobic conditions, and to localize to the cytosol, as similarly noted for Giardia and Entamoebae (55). On the basis of the similarity of [FeFe]- hydrogenases of N. stella and B. argentea to N. gruberi, we suggest with caution their ability to produce hydrogen gas. Further, the observation that both PFOR and [FeFe]-hydrogenase were expressed across all tested conditions highlights their capacity to perform anaerobic energy metabolism under conditions with different oxygen concentrations (Fig. 2B). However, the localization of both anaerobic metabolism genes whether in the mitochondria or the cytoplasm and the production of molecular hydrogen by the [FeFe]-hydrogenase remain unclear. Now, there are no data on the foraminiferal mitochondrial genome or the function of electron opaque bodies (14). However, ultrastructural studies have long shown that mitochondria are present in foraminifera, albeit sometimes with atypical structure to the classic eukaryotic mitochondria (14). Recent studies have shown that the MROs in free-living anaerobic and microaerophilic protists are not discrete categories but rather represent a functional continuum between mitochondria and mitosomes (56).

Documentation of active anaerobic metabolic pathways in our two species suggests that these benthic foraminifera have H2-producing mitochondria, which have an electron transport chain but use an electron acceptor other than oxygen (54, 57). Previous findings showing that two other foraminifera (allogromiids) survive incubation with cyanide and salicylhydroxamic acid, inhibitors of mitochondrial complex III and IV (58), further support the assertion that some foraminiferal mitochondria function anaerobically.

In summary, our gene expression analysis of two benthic foraminiferal species, thriving in hypoxic and anoxic sediments, demonstrated great metabolic versatility. These foraminifera are well poised to play major roles in response to oceanic deoxygenation, resulting in expansion of oxygen-depleted seafloor sediments. In addition, the multiple adaptive metabolic strategies identified are changing our classical view on the evolution and diversity of eukaryotes, which hypothesizes that the rise of oxygen in Earth’s system led to the acquisition of oxygen-respiring mitochondria. These foraminifera represent a model of “typical” mitochondrial-bearing eukaryotes, yet they are adapted to and can thrive in anoxic environments, often with high levels of sulfide; respire nitrate over oxygen (53); and produce energy using the same metabolism performed by anaerobic eukaryotes. Benthic foraminifera represent truly successful microbial eukaryotes with diverse and sophisticated metabolic adaptive strategies that we are just beginning to discover.


Field sampling

Samples were obtained from 3 to 5 May 2018 using the R/V Robert Gordon Sproul. Because the oxygen concentration varies considerably in the depths of the SBB [e.g., (59)], a CTDO2 cast was typically the first sampling event at each site. Bottom waters were collected from Niskin bottles deployed on the CTDO2 rosette. An Ocean Instruments (San Diego, CA) MC800 multicorer obtained sediments for geochemical sampling at fine scale (e.g., pore-water oxygen profiling and dissolved inorganic carbon). As soon as the multicorer was on deck, the multicores were taken into the environmental room, set near bottom-water temperature (7°C), and processed as required.

A Soutar box corer obtained copious amounts of foraminiferal-bearing surface sediments. Immediately after recovering the boxcorer, its lid was opened, and large-bore plastic tubing was used to siphon off most overlying water. When only a thin layer of seawater remained, a custom-made three-sided spatula was used to collect the surface ~1 to 2 cm of sediment, which was placed into a clean zipper-lock plastic bag. Sediment-laden bags were immediately taken into the darkened environmental van, where the sediment slurries were dispensed into high-density polyethylene (HDPE) plastic bottles to a depth of approximately 2 cm and then filled with bottom water so no overlying air was entrapped. Surface sediment from each boxcore was divided among approximately 20 bottles; these were maintained in the environmental room in the dark until transport to Woods Hole Oceanographic Institution (WHOI) on blue ice in coolers. At WHOI, samples were kept in the dark in coolers inside an environmental room set at 7°C until use.

For each species, sediments from one boxcore were used for each experimental treatment. For N. stella, sediments were collected on 5 May from a water depth of 567 m (34° 19.293 N, 120° 03.472 W); B. argentea were collected on 3 May from a water depth of 428.5 m (34° 18.088 N, 119° 54.557 W). Also, some foraminiferal cells from both species were picked from sieved sediment on shipboard immediately after sediment collections (details presented below).

Laboratory-based manipulative experiment

Experiment set up. The experimental design included four treatments, each in a separate chamber: aerated, hypoxic, anoxic, and euxinic (anoxic + sulfidic). Before the cruise, the hardware for the experiment was plumbed and poised. The configuration was somewhat similar to that described by others (60), where the atmosphere inside a closed humid chamber was controlled with sensors linked to controllers to maintain oxygen and/or pCO2 (ProO2 and ProCO2 systems, BioSpherix, Parish, NY, USA) using compressed gases. For treatments lacking oxygen (i.e., anoxia and euxinia), the oxygen controllers were set to 0 and controlled with 100% N2 and pCO2 was set to 400 parts per million (ppm) using 1% CO2 in N2. To produce hypoxia, a ProO2 sensor was attached to 100% N2 and the controller set at 2% atmospheric O2 (37.5 μM O2) as well as with a ProCO2 controller set to 400 ppm pCO2. Controllers are rated to ±2% after initial calibration by the manufacturer before experiment initiation. It was established previously that the seawater in food-grade plastic (HDPE) containers inside C-chambers equilibrated with the chamber atmosphere within about 40 hours (60). The aerated treatment used premixed 400 ppm CO2 in air without BioSpherix controllers.

Food-grade HDPE plastic tubs housed the foraminiferal-bearing sediments during the experiment. Two days before the experiment, agar was poured into each tub in a layer of ~4 mm. In the aerated, hypoxic, and anoxic treatments, agar was not amended. The anoxic agar was purged with N2 to remove O2. The agar of the euxinic treatment was also purged with N2 and amended with hydrogen sulfide at 50 μM final concentration. Once agar was solidified, natural seawater was added to each tub, which was placed in the appropriate incubator for equilibration for ~48 hours. Each chamber (treatment) contained 20 tubs: 12 tubs for N. stella (3 tubs for each subtreatment) and 8 tubs for B. argentea (2 tubs for each subtreatment). Tubs in each chamber, other than the aerated treatment, were subject to one of four subtreatments [amendments of H2O2 (20 μM), NO3 (40 μM), H2O2 + NO3, no amendments]. Amendments were added to some subtreatments to ensure sufficient supply to these compounds during the incubations, because evidence exists that foraminifera use these compounds. The hydrogen peroxide subtreatment incubations were included on the basis of a previous experimental study (8) that indicated significant increase in adenosine 5′-triphosphate production when N. stella and another SBB benthic foraminiferal species were incubated with 16 μM H2O2. The nitrate amendment concentration was based on historical values and bottom water values at the time of collection (~41 μM in bottom waters of the B. argentea site and ~ 24 μM at the N. stella site).

For N. stella, 40 ml of select bulk sediments was transferred from the appropriate sample bottle to each tub. Because of the relatively large size and lower abundances of B. argentea, we presorted/picked the greenish cytoplasm-bearing specimens and transferred them to treatment tubs containing 40 ml of the appropriate sediments to ensure sufficient specimens in each replicate. During incubations, hydrogen peroxide was spiked into appropriate treatments daily for a final concentration of 20 μM. To accommodate these additions, the chamber was opened for less than 1 min; chamber atmospheres readjusted within a few minutes. Each treatment incubated 3 days, ensuring sufficient numbers would remain viable for analysis. Each day of the incubations, a water sample was removed from one tub of each treatment for nitrate analysis (table S1). Before treatment terminations, another water sample was taken for H2O2 analysis (table S1).

Postincubation. After 3 days, tubs were transferred on ice to an ultrahigh-purity N2-flushed glove bag containing a stereomicroscope as appropriate to treatment. Nitrogen atmosphere was not used for processing of the aerated treatment. The euxinic treatment had considerable H2S remaining at the end of the incubation, evidenced by a noticeable smell throughout processing. The anoxic treatment for N. stella also smelled slightly of sulfide during processing, presumably because these sediments were organic-rich. Bottom waters were deoxygenated by bubbling with ultrahigh–purity nitrogen gas for about 10 min to remove oxygen, except for aerated treatments. Sediments were briefly sieved gently over a 90-μm screen with the chilled bottom waters. Live foraminifera were isolated while maintaining chilled conditions using blue ice inside the glove bag. Two glove bag–microscope set ups were used to minimize processing time.

Cell sorting, RNA extraction, and library preparation. Foraminifera were picked using a pulled Pasteur pipette; pools (n = 25) were washed three times in deoxygenated and autoclaved 0.2-μm filtered seawater. Each pool was preserved in 100 μl of DNA/RNA Shield (Zymo Research) and frozen as soon as possible at −80°C. The number of cells per replicate was predetermined on the basis of evaluation tests conducted before initiating the incubation experiments. We found that 25 cells yielded RNA of more than 10 ng/μl, an optimal concentration for both the transcriptomic and metatranscriptome kits that were used. RNA was isolated from each pool (sample) using Quick-RNA Microprep (Zymo Research). Total RNA was quantitated with a Qubit Fluorometer, RNA quality was assessed by Nanodrop, and RNA integrity was determined by Bioanalyzer (Agilent 2100) and TapeStation (Agilent 2200). A total of 137 samples including both species passed quality control and were used for RNA library preparation; the number of samples and replicates per subtreatment are shown in table S2.

We prepared two types of libraries. The mRNA polyA library was prepared with the Universal Plus mRNA-Seq Kit (NuGEN). PolyA selection was initially done to enrich for host RNA, followed by RNA fragmentation and double-stranded complementary DNA (cDNA) generation using oligo(dT) priming, followed by end repair to generate blunt ends and adaptor ligation. The mRNA samples (n = 75) were prepared using a unique 8-base barcode for each subtreatment, after which we performed strand selection based on size using AMPure XP magnetic beads (for recovery of amplicons >100 base pair) and polymerase chain reaction (PCR) amplification (PCR cycle number was determined by a quantitative PCR test run). Overall, 11 to 15 rounds of PCR amplification cycles were performed for each sample. Final libraries were quantified using both Bioanalyzer and TapeStation. A total of 75 libraries were sequenced using two complete flow cell runs on an Illumina NovaSeq platform, producing approximately 8 billion reads.

The second type of library was prepared with the Trio RNA-Seq Kit (NuGEN). Here, cDNA was generated from total RNA (host + bacteria) using random and oligo (dT) primers. Libraries were then amplified using single-primer isothermal amplification and prepared with double-stranded cDNA fragmentation, end repair to generate blunt ends, and adaptor ligation. Last, we used depletion probes custom designed by NuGEN to deplete foraminiferal ribosomal RNA. After removing the unwanted ribosomal transcript, libraries were amplified and quantified by Bioanalyzer and TapeStation. In all, 62 total RNA samples were prepared using unique eight-base barcodes and sequenced on two flow cells of Illumina NextSeq, achieving approximately 1.9 billion reads.

Data processing. Reads were quality checked using TrimGalore (, which uses the adaptor trimming tools Cutadapt ( and FastQC ( Quality-controlled fastq files of each species were concatenated into one fastq file per library preparation to perform coassembly of different samples. Because of the large number of N. stella samples in each dataset, N. stella samples were separately concatenated by oxygen regime (hypoxia, anoxia, euxinic, aerated, and shipboard). De novo transcriptome assembly was performed for each species using Trinity v2.8.4 to generate contiguous transcript sequences (contigs) (61). For N. stella, the contigs from the subsetted coassemblies were concatenated. Transcript abundance per sample was quantified with Salmon 0.12.0 (62). The abundance profiles from short-read multimapping were then used in RapClust (63) to identify and group contigs representing different fragments of the same gene. As RapClust essentially identifies different contigs with near-identical abundance patterns, this step also worked to remove unwanted duplicates and near-duplicates introduced by the multiple N. stella coassemblies.

Transcripts were then taxonomically and functionally annotated by a number of annotation tools. Coding regions were identified in the contigs with TransDecoder (61) and extracted for annotation. InterProScan, a wrapper tool that uses multiple annotation systems, was used on the nucleotide sequences to predict Pfam and TIGRFAM annotations. Coding regions were also translated to amino acid sequences with the EMBOSS tool Transeq and subsequently annotated with EggNOG-mapper v2 to provide IDs for the enzyme commission; Kyoto Encyclopedia of Genes and Genomes gene, pathway, and module; and NCBI Cluster of Orthologous Groups. Relatedness to bacterial or eukaryotic genes was identified by top hits using the EggNOG predictions and a subsequent NCBI Basic Local Alignment Search Tool (BLAST) search against NCBI’s nonredundant protein database. Conserved protein domains were identified by applying the Subfamily Protein Architecture Labeling Engine (SPARCLE) on the NCBI conserved domain database. Phylogenetic trees were reconstructed for further identifying the evolutionary relationship between key functional genes in our foraminifera and reference genes of bacterial, archaeal, and eukaryotic origins.

Identification and phylogenies of key proteins. We searched our host and holobiont transcriptomic databases for encoded amino acid sequences involved in ammonium assimilation (GS and GOGAT), denitrification (NNP, NRT2, Nor, NirK, and NosZ) and anaerobic energy metabolism (PFOR, [FeFe]-hydrogenase). Reference sequences were obtained from NCBI RefSeq nonredundant proteins database, EggNOG, Uniprot, and published genome and transcriptome of Globobulimina sp. (19). Putative Nor and NirK genes were identified on the basis of annotations using the EggNOG database. Amino acid sequences were clustered at 95% identity using CD-HIT (Cluster Database at High Identity with Tolerance) for the identification of representative sequences in each cluster. Reference proteins for the phylogenetic reconstruction of the NirK family were obtained by searching all complete bacterial, archaeal, fungal, and protozoa genomes downloaded from the NCBI RefSeq database (64) on 10 December 2019 using a hidden markov model (HMM) constructed from 41 NirK collected through manual curation of reference literature using HMMER 3.1. NorB, NorZ, and Nod reference sequences were collected on the basis of previous phylogenetic studies of these proteins (50, 65). For phylogenetic reconstruction of GOGAT, NNP, NRT2, PFOR, and [FeFe] hydrogenase, and guided with previous phylogenic studies, we searched and collected available proteins sequences from NCBI RefSeq nonredundant proteins database from representative eukaryotes, bacteria, and archaea when available. Sequence alignments were performed with MUSCLE 3.8.31. Phylogenetic reconstructions were performed with RAxML 8.2 using 100 bootstraps using the GAMMA model of rate heterogeneity with empirical base frequencies and the WAG (Whelan and Goldman) substitution model. To check for potential misassembly of nitrate reductase, we reassembled each sample with rnaSPAdes (66) and used Bandage (67) to visualize the assembly graphs of contigs corresponding to the pNR gene.

Differential gene expression analysis. To assess transcription changes under different subtreatments, abundances of individual contigs (assembled by Trinity) were mapped with the Salmon tool and combined by RapClust. Cluster abundances were normalized with TMM (trimmed mean of M values) and compared with edgeR (68) with limma (69) using voom (70) to estimate the mean/variance relationship of genes across samples. Differential expression was determined using a linear model (lmFit function in edgeR). Here, we analyzed data from samples incubated either without amendments or amended with both nitrate and hydrogen peroxide. The statistical model was set up as two separate contrasts, one to test the effect of oxygen condition, ignoring the effect of amendments, and the other to test for amendments regardless of oxygen condition. This approach was selected as the sampling reality precluded certain combinations, i.e., the aerated and shipboard sampling did not allow for nitrate or peroxide amendments. As a result, a fully crossed design was impossible, so oxygen and amendments were separately considered. The oxygen contrast used the shipboard samples as the control, i.e., differential expression is determined relative to the shipboard expression. As they were fixed immediately upon collection, the shipboard samples are as close as methodologically possible to the original source populations. A significance threshold of 0.05 was used on the adj. P values after applying a Benjamin-Hochberg correction for false discovery rate due to multiple hypothesis testing. For visualization of the expression across treatments, heatmaps were constructed using counts per million normalization of transcript counts.

Short-read mapping to chloroplast genomes. Plastid expression in N. stella and B. argentea was measured by mapping quality filtered transcriptome reads with bowtie2 onto reference plastid genomes downloaded from NCBI GenBank for S. pseudocostatum (GenBank accession no. MK372941), C. simplex (GenBank accession no. KJ958479, Thalassiosira pseudonana (GenBank accession no. NC_014808.1), and O. sinensis (GenBank accession no. Z67753.1). The read mapping produced coverage values for each reference genome by counting the number of times that a nucleotide was included in a short-read alignment to the reference. The gene-level coverage was defined as the average nucleotide’s coverage in that gene, and a nonzero coverage value indicates that the gene was detected in the transcriptome. The coverage and detection information were visualized with anvi’o (71).


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank the captain, crew, and science parties of the R/V Robert Gordon Sproul cruises SP1703, SP1718, and SP1811; C. Cavanaugh for insightful conversations on symbiosis; and S. Bowser for comments on an earlier manuscript version. We thank two anonymous reviewers for the helpful comments on an earlier version of the manuscript. We also thank J. Couget of NuGEN for advice on Illumina RNA library preparation kits. Funding: This project was funded by the U.S. NSF IOS 1557430 and 1557566. H.L.F. acknowledges support from the Swedish Research Council VR (grant number 2017-04190). Author contributions: J.M.B., V.P.E., Y.Z., and C.M.H. conceived of the study. J.M.B., D.J.B., C.M.H., S.D.W., and H.L.F. collected the samples. F.G., J.M.B., and D.J.B. performed the experiment. F.G. processed meta/transcriptome samples. F.G., D.R.U., C.P., Y.Z., H.L.F., C.M.H., and S.D.W. analyzed the data. F.G., D.R.U., J.M.B., and Y.Z. composed the manuscript. All authors gave feedback on the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials, as well as the following repositories: All transcriptome raw sequence and metadata from the 137 samples presented in this study are deposited at the NCBI under BioProject accession no. PRJNA714124. The raw transcriptome reads are deposited in the NCBI SRA under accession nos. SRR13960851 to SRR13960987. The final processed files for each transcriptome’s assembled sequences, predicted functions, and abundances per sample are uploaded at figshare (DOI 10.6084/m9.figshare.14183567). All codes are available at Additional data related to this paper may be requested from the authors.

Stay Connected to Science Advances

Navigate This Article