Research ArticleIMMUNOLOGY

Shared transcriptional profiles of atypical B cells suggest common drivers of expansion and function in malaria, HIV, and autoimmunity

See allHide authors and affiliations

Science Advances  26 May 2021:
Vol. 7, no. 22, eabg8384
DOI: 10.1126/sciadv.abg8384


Chronic infectious diseases have a substantial impact on the human B cell compartment including a notable expansion of B cells here termed atypical B cells (ABCs). Using unbiased single-cell RNA sequencing (scRNA-seq), we uncovered and characterized heterogeneities in naïve B cell, classical memory B cells, and ABC subsets. We showed remarkably similar transcriptional profiles for ABC clusters in malaria, HIV, and autoimmune diseases and demonstrated that interferon-γ drove the expansion of ABCs in malaria. These observations suggest that ABCs represent a separate B cell lineage with a common inducer that further diversifies and acquires disease-specific characteristics and functions. In malaria, we identified ABC subsets based on isotype expression that differed in expansion in African children and in B cell receptor repertoire characteristics. Of particular interest, IgD+IgMlo and IgDIgG+ ABCs acquired a high antigen affinity threshold for activation, suggesting that ABCs may limit autoimmune responses to low-affinity self-antigens in chronic malaria.


For most acute infectious diseases in humans, a single exposure results in life-long protective antibody-dependent immunity through the acquisition of long-lived plasma cells (PCs) and memory B cells (MBCs) (1). In contrast, some pathogens, including the malaria-causing parasite, Plasmodium falciparum, and the AIDS-causing virus, HIV, fail to induce immune responses that promote the efficient development of long-lived humoral immunity (2, 3). Rather, these infections are chronic as in the case of HIV or for malaria involving frequent reinfection in areas of high malaria transmission, as immunity is rarely established to parasite liver infections. Evidence is mounting that chronic infections have a significant impact on the human B cell compartment that may strongly influence many aspects of the outcomes of these deadly diseases. However, at present, we do not have a comprehensive view of the phenotypic or functional heterogeneity within the human B cell compartment in healthy individuals or of the full impact of malaria or HIV on this compartment.

Malaria is a deadly infectious disease claiming nearly 450,000 lives yearly in Africa alone, mostly among young children under the age of five (4). P. falciparum–specific antibodies play a central role in malaria immunity (5). However, the acquisition of long-lived protective antibodies in children born in malaria-endemic Africa is extraordinarily slow, requiring years of repeated parasite exposure, leaving these children susceptible to severe malaria and death (6). Evidence is accumulating that in addition to the slow acquisition of antibody immunity in malaria, malaria also alters the composition of the B cell compartment. Of particular interest is a large expansion of an unusual B cell subpopulation in individuals living in malaria-endemic Africa termed as atypical B cells (ABCs), initially referred to as atypical MBCs (7). This B cell population expressed high levels of a variety of inhibitory receptors, appeared to be transcriptionally distinct from naïve B cells and conventional MBCs by microarray analysis, and was initially characterized as hyporesponsive to B cell receptor (BCR) cross-linking by soluble anti–Ig (immunoglobulin) (8). However, recent studies showed ABCs were fully responsive to membrane-associated antigens through mechanisms that resulted in the exclusion of inhibitory receptors from the BCR immune synapse (9). Although these studies provided evidence that ABCs in malaria could respond to antigens, the ultimate function of that response remains to be determined.

Malaria may also affect the B cell compartment by shaping the B cell repertoire, although this possibility remains to be explored in depth. Our earlier analyses of the V gene repertoires of naïve B cells, ABCs, and MBCs found that the variable region of immunoglobulin heavy chain (VH) and variable region of immunoglobulin light chain (VL) repertoires of classical MBCs and ABCs were remarkably similar in V gene usage, rate of somatic hypermutation (SHM), and VH CDR3 length and composition (10). Using an accurate and high-coverage Ig sequencing method, we found unexpected high levels of SHM in infants as young as 3 months (11). Antibody lineage analysis showed that SHM also increased in both infants and young children upon febrile malaria. We also showed that levels of the inherently autoreactive VH4-34 IgG were significantly increased in children with acute febrile malaria but not in malaria-immune individuals (12), suggesting a dynamic relationship between autoreactivity and the acquisition of immunity to malaria. Clearly, it will be important to understand how malaria affects the B cell repertoire and whether the ABC repertoire that expands in malaria shows signs of antigen-driven selection.

It is of interest that B cell subpopulations with characteristics similar to, but not identical to, those of ABCs in malaria have been shown to expand in other chronic infectious diseases (2), perhaps best studied in high viremic HIV infections (13), as well as in autoimmune diseases, notably systemic lupus erythematosus (SLE) (14). The similarities of ABCs in different diseases are based on a relatively small number of markers, and to our knowledge, RNA sequencing (RNA-seq) data have not been compared between ABC populations. Nevertheless, these observed similarities suggest the intriguing possibility that ABCs may represent a separate lineage with a common inducer in different chronic infections and possibly in autoimmunity. However, given the distinct clinical pictures in different chronic diseases, B cells that arise in such a lineage would be predicted to acquire distinct functions and antigen-driven repertoires depending on the disease environment.

To better understand the impact of chronic infection on the B cell compartment, we performed in-depth transcriptional analyses of B cells obtained from the peripheral blood of adults and children living in malaria-endemic Africa, HIV-infected individuals, and healthy U.S. adults. We first carried out bulk RNA-seq to identify unique transcriptional profiles of naïve B cells, ABCs, and classical and activated MBCs sorted using standard markers and, in so doing, validated a previously unknown set of cell surface markers, cytokines, and transcription factors (TFs) associated with the ABC subset. Subsequent unbiased single-cell RNA-seq (scRNA-seq) of total B cells revealed greater heterogeneity within B cell subsets of individuals chronically exposed to P. falciparum than was previously appreciated. We observed distinct activation states in naïve and classical MBCs, as well as an expansion of ABCs in malaria-exposed individuals that were not present in unexposed donors. These ABCs formed a transcriptionally distinct cluster, and trajectory analysis revealed differentiation from naïve B cells and modulation by interferon-γ (IFN-γ) signaling. We found that ABC clusters in malaria showed significant similarities to ABCs in HIV and in autoimmune diseases. This unexpected observation provided support for the hypothesis that ABCs represent a separate lineage with a common initiator in different chronic diseases. Focusing on ABCs in malaria, we identified novel heterogeneity in isotype expression and in expression of the ABC markers, Tbet and CD11c, that defined three subpopulations. These subpopulations differed in their response to acute febrile malaria in children, showed different levels of SHM in their VH repertoires, and had distinct BCR characteristics that indicated that the subpopulations arose under antigen-driven pressure. Of particular interest, IgD+IgMlo and IgDIgG+ ABCs acquired an unusually high antigen affinity threshold for activation, suggesting that they may serve to limit responses to low-affinity self-antigens in chronic malaria.


ABCs, naïve B cells, and classical MBCs in chronic malaria express distinct gene signatures

To better understand the impact of chronic malaria on the B cell compartment, we carried out bulk RNA-seq on B cell subsets obtained from three adults living in malaria-endemic Mali. To do so, purified mature peripheral blood B cells (CD19+CD20+CD10) were sorted into four subsets on the basis of their expression of CD21 and CD27 as previously described (8): naïve B cells (CD21+CD27), classical MBCs (CD21+CD27+), activated MBCs (CD21CD27+), and ABCs (CD21CD27) (fig. S1A). Unique gene signatures were identified on the basis of comparisons of differentially expressed genes (DEGs) between subsets (table S2 and Materials and Methods) and comparison between gene signatures of each subset (Fig. 1 and fig. S1B). Focusing on ABCs, we determined that the transcriptional signature contained 176 DEGs, 112 of which were up-regulated and 64 were down-regulated and encoded proteins that fell into a variety of functional subsets including cell surface markers, cytokines, and TFs, as well as proteins involved in adhesion and motility, cytoskeleton regulation, signaling, and metabolism (Fig. 1A). The largest number of DEGs encoded TFs and cell surface proteins (Fig. 1A), suggesting that ABCs may have a unique transcriptional program and distinct functions mediated through cell surface receptors. The DEGs in the ABC gene signature encoding cell surface markers, TFs and cytokines, and signaling molecules are displayed as heatmaps (Fig. 1B). Within the cell surface markers, our ABC DEGs corroborated previously identified ABC markers of chronic malaria–exposed individuals including FCRL5, LILRB2 (CD85D), ITGAX (CD11c), and CD72 (2, 7, 8, 15). We validated by flow cytometry the cell surface expression of CD72 and five DEGs encoding novel cell surface–expressed proteins of interest as possible regulators of B cell responses, namely, CD79A, ITGB2 (integrin β-2), CD55 (which was down-regulated in ABCs), CD63, and CD84 (Fig. 1C; quantified from multiple Malian donors in fig. S2A).

Fig. 1 Atypical B cells express a distinct transcription signature including genes encoding cell surface markers, TFs, and signaling molecules.

(A) Distribution of up-regulated and down-regulated DEGs in sorted atypical B cells (ABC gene signature) compared to sorted naïve, classical MBCs (cMBCs), and activated MBC subsets (acMBCs) by category. (B) The DEGs in the ABC gene signature are shown in heatmaps comparing genes encoding cell surface markers, TFs, cytokines, and components of signaling pathways. Each column represents one donor of three (subject IDs: 730, 731, and 736; table S1). Donor 3 had fewer than 500 acMBCs, resulting in a low gene count, and was removed from the acMBC analysis. Genes that were previously shown to be associated with ABCs or were validated in this study are indicated by arrows. Color scale indicates relative gene expression in each subset compared to the other three subsets. (C) The expression levels of the indicated gene products quantified by flow cytometry by either cell surface or intracellular staining of naïve B cells (gray), cMBCs (black), and ABCs (red) from one Malian adult donor (subject ID: 704; table S1). (D) Left: Relative expression level of IL-10 mRNA assessed in naïve B cells, cMBCs, and ABCs from three Malian donors (subject IDs: 581, 583, and 700; table S1) by quantitative reverse transcription polymerase chain reaction (qRT-PCR) and depicted as log2 fold changes for the indicated comparisons. Right: Representative histogram of surface expression of IL-10R in the indicated B cell subsets from one Malian donor (subject ID: 706; table S1). (E) Relative expression levels of NR4A2, TOX2, and TCF7 mRNA was assessed in naïve B cells, cMBCs, and ABCs from three Malian donors by qRT-PCR (subject IDs: 581, 583, and 700; table S1).

CD79a was shown to be highly expressed on the surface of ABCs (Fig. 1C) as compared to naïve B cells and classical MBCs. The BCR is composed of a membrane form of Ig (mIg) heavy chain (IgH) associated with either Igλ or κ light chains to form the mIg. Two transmembrane signaling proteins CD79a and CD79b associate with the mIg to form a complex that consists of mIg and one chain each of CD79a and CD79b (1mIg:1CD79a:1CD79b) (16). In ABCs, Igλ and Igκ were expressed at levels that resulted in a mIg:CD79a ratio of approximately 1:2 (fig. S2B). The unusual 1:2 ratio suggests either that the BCR structure is unusual in ABCs containing only CD79a chains or that the BCR contains both CD79a and CD79b chains and additional CD79a chains are associated with cell surface receptors other than mIg. We also identified several additional novel cell surface markers that were up-regulated in ABCs. These included the integrin β chain that plays a role in cell-cell adhesion and regulates signaling via cell surface receptors (17): CD72, a B cell surface receptor that negatively regulates B cell activation (18); CD63, a member of the tetraspanin family that mediates signal transduction events on B cell surfaces (19); and CD84, a member of the signaling lymphocyte activation molecule (SLAM) family that regulates receptor-mediated signaling (20). CD55, also known as complement decay-accelerating factor, that regulates the complement cascade on B cell surfaces was down-regulated in ABCs (Fig. 1C and fig. S2A) (21). We also validated by flow cytometry an increase in the intracellular expression of Ezrin in ABCs as compared to naïve B cells and classic MBCs (Fig. 1C and fig. S2A). Ezrin plays an important role in regulating BCR signaling through changes in the cytoskeleton and germinal center (GC) B cells that exhibit high antigen affinity thresholds for activation engage antigens through novel pod-like plasma membrane structures that are highly enriched in Ezrin (22).

Concerning cytokine DEGs in the ABC gene signature, pathway analysis suggested enrichment of genes in the interleukin-10 (IL-10) anti-inflammatory signaling pathway in ABCs. We consequently validated that ABCs expressed more IL-10 as compared to naïve B cells and classical MBCs by quantitative reverse transcription polymerase chain reaction (qRT-PCR) and higher IL-10 receptor encoded by IL10R by flow cytometry (Fig. 1D). This finding suggests that ABCs may play a role in controlling inflammatory responses during febrile malaria.

Concerning DEGs encoding TFs, ABCs are defined, in part, by their expression of Tbet, TBX21 (23), which was up-regulated in the ABC gene signature (Fig. 1B). In addition to Tbet, our analysis identified four new differentially expressed TFs in ABCs, namely, ZBTB32, NR4A2, TOX2, and TCF7. We validated by flow cytometry the up-regulation of ZBTB32 that plays a role in regulating MBC responses to immunization with protein antigens (Fig. 1C and fig. S2A). It was recently shown in mouse models that ZBTB32 can restrict antibody levels in chronic viral infections but not in acute infection (24). Up-regulation of NR4A2 and TOX2 and down-regulation of TCF7 (25) are associated with T cell exhaustion, and these were validated for ABCs by qRT-PCR (Fig. 1E). However, three key cell surface markers of T cell exhaustion—namely, PD1, TIM3, and LAG3 (26)—were not differentially expressed on the surface of ABCs (fig. S2C). Although ABCs appear to have some features associated with T cell exhaustion, our recent studies provided evidence that they are not functionally exhausted (9). We showed that although ABCs responded poorly to BCR cross-linking in solution, ABCs were fully responsive to membrane-associated antigen-mimicking antigen presentation by follicular dendritic cells.

Together, the RNA-seq of the four major B cell subsets in chronic malaria provided distinct gene signatures characterizing each subset. Of particular interest was the identification of a variety of new DEGs in ABCs pointing to their unique transcriptional state in chronic infections.

scRNA-seq reveals heterogeneity in naïve B cell and MBC subsets in malaria

To further dissect the transcriptional changes in the B cell subpopulations, we carried out scRNA-seq of B cells purified by negative selection from the peripheral blood of three malaria-exposed Malian adults and three unexposed, healthy U.S. adults. The scRNA-seq data for both Malaria-exposed and healthy samples are shown as a t-distributed stochastic neighbor embedding (tSNE) plot after data integration and batch correction (Materials and Methods and Fig. 2A, I) and as separate tSNE plots for samples from malaria-exposed donors (Fig. 2A, II) and healthy donors (Fig. 2A, III). Nine clusters were identified (0 to 8). Clusters 6, 7, and 8 contained very few cells (163) that were likely non–B cell contaminants and were consequently excluded from further analysis. Four clusters, 0 to 3, contained B cells from both malaria-exposed and healthy donors (approximately 33 to 53% from malaria-exposed donors) (Fig. 2B). In contrast, clusters 4 and 5 were highly enriched in B cells from malaria-exposed donors (81 and 80%, respectively) (Fig. 2B).

Fig. 2 scRNA-seq analysis of B cell from malaria-exposed donors reveals heterogeneity in B cell subsets.

(A) scRNA-seq data from peripheral blood B cells from three malaria-exposed Malian adults (subject IDs: 729, 732, and 745; table S1) and three healthy U.S. adults (subject IDs: US_1, US_2, and US_3; table S1) were integrated with batch correction for a total of 4453 cells (shown as individual dots) and displayed by tSNE. (I) Combined data from Malian and U.S. donors. Data from (I) separated into Malian donors (II) (1968 cells) and U.S. donors (III) (2485 cells). (B) Total number of cells per cluster and the percentage of cells in each cluster from Malian versus U.S. donors for clusters 0 to 5. (C) tSNE plots of 4290 cells combined from Malian and U.S. donors after removing 163 cells in clusters 6, 7, and 8 onto which the following gene signatures were mapped and shown as contour maps, reflecting the region of cells with signature scores above the mean: (I) Naïve B cell. (II) cMBC. (III) acMBC. (IV) ABC. Color scale reflects the expression score of each signature, with red and blue depicting the highest and lowest signature scores, respectively. (D) Heatmap showing the expression of the top DEGs in each cluster identified in (C). Genes cited in the results section text are indicated by asterisks. (E) Expression levels of IGHD, IGHM, and IGHG1 are shown for each of the clusters in (D). Color scales indicate average gene expression per cell.

We then projected the unique gene signatures identified from bulk RNA-seq of the four B cell subsets (Fig. 1) onto the scRNA-seq data and generated gene signature scores for each cluster (Fig. 2C, fig. S3A, and Materials and Methods). The projections captured clusters enriched for the gene signature of each subset, namely, naïve B cells, classical MBCs, activated MBCs, and ABCs (Fig. 2C). A heatmap comparing the top DEGs in each cluster is given (Fig. 2D). The naïve B cell gene signature were mapped to two distinct clusters, 0 and 2, termed naïve 1 and 2, respectively, indicating an unexpected heterogeneity in naïve B cells (Fig. 2C, I). Top DEGs in the naïve 1 cluster included the Activator Protein 1 (AP1) TF FOS and genes involved in major histocompatibility complex (MHC) class II antigen processing and presentation including MHCII genes (HLA-DRB1, HLA-DQA1, and HLA-DRA) and CD74 (Fig. 2D). B cells in the naïve 2 cluster, in contrast, highly expressed genes such as NFKBIA that encodes IκB, a suppressor of the nuclear factor κB (NF-κB) pathway, and PELI1, an E3 ligase that associates with NF-κB–inducing kinase to degrade it, thereby inhibiting noncanonical NF-κB activation (27). This cluster also expressed the highest levels of JUN and MAP3K8. Tumor necrosis factor (TNF) and Toll-like receptor (TLR) signals induce MAP3K8 phosphorylation that, in turn, leads to the activation of the MEK-ERK [mitogen-activated protein kinase (MAPK) kinase–extracellular signal–regulated kinase], c-Jun N-terminal kinase, and NF-κB pathways (28). Together, these genes play roles in TLR signaling, MyD88 cascade initiation, and TNF receptor–associated factor 6–mediated activation of NF-κB and MAPK pathways. We hypothesized that differences between the naïve B cell clusters were driven by activation. To test this possibility, we projected a gene signature for early transcriptional changes that occur within 2 hours of BCR engagement onto these clusters (29). The BCR activation signature scored significantly higher (P < 2.2 × 10−16) in the naïve 2 cluster as compared to the naïve 1 cluster (fig. S3B), indicating that B cells in the naïve 2 cluster were more activated than those in the naïve 1 cluster. In terms of BCR isotype expression as expected for naïve B cells, both clusters expressed the unswitched IgH chain genes IGHD and IGHM, although these were not among the top DEGs for these clusters and only low levels of class switched IgH chain genes (IGHG1-4 and IGHA1-2) (Fig. 2E and fig. S3C). Quantification of the isotype expression showed that B cells in the naïve 1 cluster expressed somewhat more IGHM than IGHD, in contrast to B cells in the naïve 2 cluster that expressed similar levels of IGHM and IGHD (fig. S3C). In addition, naïve 1 and, to a lesser extent, naïve 2 expressed high signature scores when mapped to previously published data of cells treated in vitro with IL-6 (fig. S4A and Materials and Methods) (30).

The classical MBC gene signature was mapped to cluster 3 and to a portion of B cells in cluster 1 (Fig. 2C, II). Cells in cluster 1 that did not map to the classical MBC gene signature also failed to map to the other three gene signatures, suggesting that these cells may be in transition or part of a yet to be characterized subset. We termed this cluster classical MBC-like B cells. Classical MBCs and classical MBC-like B cell clusters showed distinct gene expression profiles (Fig. 2D). In terms of BCR isotype expression, both clusters predominately expressed switched Ig isotype genes including IGHG1-4 and IGHA1-2 (Fig. 2E). However, both classical MBC and classical-like MBC clusters also contained un–class-switched B cells that expressed more IGHM as compared to IGHD, suggesting marginal zone–like B cells within these clusters (fig. S3C) (31). The classical MBC cluster up-regulated S100A4, a gene we previously showed bound to P. falciparum merozoite surface protein 1 that showed anti-inflammatory activity in vitro (32). Classical MBC-like B cells highly expressed GPR183 that encodes Epstein-Barr virus–induced molecule 2, a regulator of B cell localization to GCs and extrafollicular PC responses (33).

B cells in cluster 4, which scored highly in the activated MBC gene signature (Fig. 2C, III), were significantly enriched in samples from malaria-exposed donors (~80%). This cluster expressed the highest levels of genes encoding class-switched IgH chains (including IgG1-4 and IgA1-2), IgL chains (both κ and λ), J chain (a PC-specific protein that regulates polymerization of secreted IgA and IgM), and MZB1 (a marginal zone/B1-B cell specific protein that promotes IgM assembly). Cluster 4 also expressed higher levels of IGHM relative to IGHD and high levels of TFs associated with PC differentiation (34). These include PRDM1 that encodes the PC differentiation TF Blimp-1 and IRF4 that promote PC differentiation. This cluster also expresses low levels of PAX5 that suppresses J chain transcription. Further, these cells mapped to the gene signature of cells treated in vitro with IL-21, IL-4, and IL-6 (fig. S4A and Materials and Methods). These results suggest that cells in the activated MBC cluster are poised to differentiate into PCs.

The ABC gene signature was mapped to cluster 5 of the scRNA-seq data, composed primarily of B cells from malaria-exposed donors (Fig. 2C, IV). FCRL5 and the newly identified ABC marker ITGB2 encoding integrin β-2 were highly expressed in the ABC cluster (Fig. 2D). In addition, as compared to other clusters, the ABC cluster was mapped highly to a previously published signature of B cells treated in vitro with IFN-γ (fig. S4A and Materials and Methods) (15). B cells in cluster 5 were unusual in expressing higher levels of IGHD as compared to IGHM in addition to class-switched IGHG1 (Fig. 2E and fig. S3C). Analysis of sorted IgD+ ABCs by qRT-PCR confirmed that IgD+ ABCs expressed lower levels of IGHM as compared to naïve B cells and unswitched classical MBCs (fig. S4B). To explore the heterogeneity within the ABCs, we isolated and reanalyzed these cells, revealing three distinct clusters, clusters I, II, and III (fig. S4C and Materials and Methods). IGHD, IGHM, and IGHG2 were among the top 10 DEGs expressed by these clusters (fig. S4D), suggesting that the heterogeneity within cluster 5 may be driven in part by Ig isotype expression. Cluster I showed a high relative expression of IGHD but low IGHM, and cluster II showed the highest expression of IGHG2 (fig. S4D) and highly expressed ITGB7 mRNA (fig. S4E), a gene up-regulated in the ABC gene signature. Clusters I and II had a larger number of B cells that expressed the ABC marker ITGAX (CD11c) as compared to cluster III (fig. S4E). In addition, the IGHG2-expressing cells in cluster II highly expressed ITGB7 mRNA (fig. S4E), a gene up-regulated in the ABC gene signature. Last, cluster III consisted of more naïve-like cells that were un-class switched and expressed both IGHD and IGHM but had fewer cells that expressed IGHG2 (fig. S4D). Thus, the ABC cluster appeared to be heterogeneous, as defined by isotype expression.

Trajectory analysis shows ABCs as a terminally differentiated IFN-γ–driven state

To better understand which B cell states could ultimately differentiate into ABCs, we carried out scRNA-seq with parallel BCR sequencing (Materials and Methods) of B cells from two additional Malian adults, donors 303 and 478. Of the two samples, we captured more cells (6143 cells) for donor 303, with a larger expansion of ABCs (6.7%) and BCR data available for nearly 89% of all cells compared to donor 478. We therefore chose donor 303 for further analysis, although scRNA-seq data have been made available for both donors. We clustered the cells on the basis of gene expression and using the gene signatures identified by bulk RNA-seq of malaria-exposed donors (Fig. 1 and Materials and Methods) and identified three naïve B cell clusters, three classical MBC clusters, a single activated MBC, and a single ABC cluster (Fig. 3A and fig. S5A). The relatively large number of cells allowed us to use the gene expression data to create a pseudotime trajectory with naïve B cells given the value of 0 (Materials and Methods) (Fig. 3B). The cells were well distributed along the trajectory and formed three distinct branches (branches I, II, and III) with one major branch point (3) and three minor branch points (1, 2, and 4) (Fig. 3B). Cells from the three naïve B cell clusters were found along branch I (Fig. 3C), with naïve clusters 3 and 4 forming the terminus of branch I and B cells from naïve cluster 0 spread out along branch I, potentially indicating multiple differentiation/activation states. Of the classical MBC clusters, cells from MBC cluster 2 formed the terminus of branch II. MBC cluster 6 was more abundant along branch I in contrast to MBC cluster 1 cells that distributed along branches II and III and the major branch point 3. Activated MBCs were mostly limited to the minor branch points 1 and 2 on branch I, and the ABCs formed the terminus of branch III, indicating that they are indeed a terminally differentiated cell state (Fig. 3C). A projection of an IFN-γ signature onto the trajectory showed enrichment along branch III toward the ABCs (Fig. 3D). The general V and J gene usages in both Ig heavy and light chains were similar across the different clusters, aside from the expanded activated MBC (cluster 7) (fig. S5, B and C). Expanded clones were detected along branches I and II with increased expansion in the activated MBC cluster and decreased expansion in the ABCs (Fig. 3E). We then identified identical clones along the pseudotime trajectory (Fig. 3F). As expected, most identical clones were detected along the branches with high enrichment in the activated MBC branch. Many cells along the branches, as well as few B cells from the termini of branches I and II, shared clones with ABCs, indicating that different states in naïve B cells and classical MBCs may be responsible for giving rise to the ABCs cells. Branch point 3 seemed to be a major determinant of cell fate decision, and when we compared the DEGs in the cells at branch point 3 to all the other cells, we found a high expression of GPR183, TNFRSF13B indicating that these were transcriptionally similar to the MBC-like cells identified in Fig. 2, suggesting that these cells are able to differentiate into ABCs. When we analyzed the DEGs of this cluster compared to the other MBC clusters (2 and 6), we found that they expressed higher levels of CD79A, NR4A2, SYK, and IL10RA that were each identified as highly expressed in bulk RNA-seq of ABCs. This cluster also expressed JUNB and MAP3K8, indicating a likely activated phenotype.

Fig. 3 Trajectory analysis of scRNA-seq data for B cells from malaria-exposed Malian adults.

(A) scRNA-seq data from peripheral blood B cells of one Malian adult (subject ID: 303; table S1) showing 6143 cells as individual dots. Data are displayed by tSNE with each cluster colored differently and labeled on the basis of gene signatures displayed in fig. S5A. (B) Trajectory analysis of cells shown in (A) using the naïve B cell gene signature (table 2) as the root of the trajectory. Color indicates pseudotime from 0 to 16, and dot size indicate pseudotime from 0 to 16. Three branches (I, II, and III) are described in the text. (C) Distribution of cells in each cluster is shown along the trajectory path. Top: Naïve B cell clusters 0, 4, and 3. Middle: Classical MBC clusters 1, 2, and 6. Bottom: Activated MBC cluster 7 and ABC cluster 5. (D) Cells are colored on the basis of the IFN-γ signature score used in fig. S4A. (E) BCR clones that were expanded are shown in red along the trajectory path. Clones that are unique or not expanded are in gray. (F) Identical clones are traced in black lines along the trajectory path. Cells are colored on the basis of the ABC signature. Color bar and dot size indicate the signature score of the ABC gene signature.

scRNA-seq reveals similar B cell heterogeneities in malaria and HIV

ABCs are well characterized in HIV-infected individuals (35, 36). To explore the transcriptional similarity of ABCs in HIV and malaria, we performed scRNA-seq on B cells from three HIV-infected individuals not receiving antiretroviral therapy who were either recently infected (<1 year) or experienced slow disease progression (table S1 and Materials and Methods). The data were analyzed in a similar manner to the malaria-exposed dataset (Fig. 2). The integrated data from both HIV-infected and healthy donors are shown as a tSNE plot (Fig. 4A, I), and B cells originating from HIV-infected or healthy donors are shown separately (Fig. 4A, II and III). Contaminating non–B cell clusters 8, 10, and 11 were removed, and of the remaining nine clusters, clusters 6 and 7 contained the highest percent of B cells from HIV-infected individuals (81 and 70%, respectively) (Fig. 4B). To annotate the clusters, we projected gene signatures generated from bulk RNA-seq of sorted B cell subsets from HIV-infected individuals and identified clusters for each subset (Fig. 4C, I to IV). The three naïve B cell clusters 1, 3, and 4 (Fig. 4C, I) DEGs associated with B cell activation, such as Jun, JunB, Fos, IER2, and NFKBIA shown in a heatmap (Fig. 4D), indicating that, as was the case for the malaria dataset, heterogeneity in the naïve B cell clusters may be driven by levels of activation. When mapping the BCR activation signature to the three naïve B cell clusters, we found that the activation signature scores increased across each naïve B cell cluster shown in Fig. 4A (BCR activation scores are shown in fig. S6A). Classical MBC clusters 2, 0, and 5 (Fig. 4C, II) DEGs in a similar fashion to the clusters from malaria-exposed donors (Fig. 2) and activated MBCs, cluster 7, (Fig. 4C, III) highly expressing Ig genes (Fig. 4D). The ABC cluster (Fig. 4C, IV) expressed ITGB2 (Fig. 4D), which was also highly expressed in the bulk RNA-seq data from B cells from HIV-infected donors.

Fig. 4 B cells from HIV-infected individuals are heterogeneous and have transcriptionally similar ABCs to malaria-exposed donors.

(A) (I) scRNA-seq data from three HIV-infected individuals (subject IDs in table S1) and three healthy U.S. donors (subject IDs in table S1) were integrated for a total of 4506 cells displayed by tSNE. Data from (I) separated into HIV-infected (II) (2021 cells) and healthy U.S. donors (III) (2485 cells). (B) Total number and percentage of cells per cluster from HIV-infected versus healthy U.S. donors. (C) tSNE plot from (A) onto which gene signatures from bulk RNA-seq of sorted B cells from HIV-infected donors were mapped and shown as contour maps of regions with signature scores above the mean: (I) Naïve B cell. (II) cMBC. (III) acMBC. (IV) ABC. Red and blue depict the highest and lowest gene expression signature scores, respectively. (D) Heatmap showing the top DEGs in each cluster in (C). Genes cited in Results are indicated by arrows. Color scale depicts average gene expression per cell. (E) tSNE plot showing integrated data from HIV-infected donors in blue and Malian donors in orange with the ABC cluster (identified in fig. S5) gated. (I) HIV-infected donors. (II) Malaria-exposed donors. Cells are colored on the basis of the gene signature to which they mapped. Numbers in brackets refer to cluster number in the malaria data (Fig. 2C) or HIV data (Fig. 3C). (F) Gene signatures of ABCs from SLE (I), rheumatoid arthritis (RA) (II), or common variable immunodeficiency (CVID) (III) were mapped onto the malaria scRNA-seq data from Fig. 2A.

ABCs express similar transcriptional programs across chronic diseases

Given the similar heterogeneity across the B cell subsets between HIV-infected and malaria-exposed donors, we integrated and reclustered the HIV and malaria scRNA-seq data, so as to retain their previous annotation (Materials and Methods). Twelve clusters were obtained (fig. S6B) of which clusters 9 and 11 were non–B cells and were removed. The data were well integrated between HIV and malaria (Fig. 4E, I). To get at the true similarities between cells that originated from two distinct disease states, we then presented the original cell annotations based on the separate malaria or HIV analysis (Figs. 2C and 4C). The ABCs from HIV-infected and malaria-exposed donors formed a single well-merged cluster (Fig. 4E, II and III), indicating shared transcriptional programs across the two diseases. ABCs may thus represent a unique lineage that arises as a result of common drivers in these two chronic disease environments. Thus, it appears that the underlying microenvironment in both of these disease states generated a subset of cells that was not only similar by the few expressed cell surface markers that define ABCs but by their global transcriptomic profile. However, it is expected that the disease-specific environments in malaria and HIV infection will be reflected in gene expression differences between ABCs in each cluster. To this end, we compared the DEGs between ABC cluster 5 from malaria (Fig. 2A) and ABC cluster 6 from HIV infection (Fig. 4A). Malaria ABCs expressed higher levels of NR4A1 and down-regulated BATF. In contrast, HIV ABCs showed the opposite trend (fig. S6C). NR4A1 expression is rapidly induced by B cell activation through the BCR, and in the absence of costimulation from T cells, it represses BATF (basic leucine zipper transcription factor, ATF-like) and MYC (37). NR4A1, along with NR4A2, were also highly up-regulated in the ABC gene signature from bulk RNA-seq (Fig. 1, B and E). Malaria ABCs also highly expressed IRF8. Interferon regulatory factor 8 (IRF8) plays a role in maintaining peripheral tolerance and B cell anergy (38). HIV ABCs, on the other hand, highly expressed CD69, SELL (which encodes CD62L), and CD83, indicating a more activated cell state. SELL was down-regulated in the gene signature of malaria ABCs (Fig. 1B). These differences may relate to, yet to be defined, unique roles played by these cells in their specific disease environment.

In addition to HIV (13), expansion of cells with phenotypes resembling ABCs also occurs in autoimmune diseases. To compare these cells in autoimmune disease to ABCs in malaria, we generated transcriptional gene signatures of ABCs in individuals with SLE (Fig. 4F, I) (14), rheumatoid arthritis (RA) (Fig. 4F, II), and common variable immunodeficiency (CVID) (Fig. 4E, III) (39) from published datasets (14). The gene signatures from these diseases showed a high degree of overlap with the ABC cluster in the scRNA-seq data from Malian-exposed donors (Fig. 4F), indicating transcriptional similarity of this subset across different disease settings.

ABCs in malaria are composed of different subpopulations based on Ig isotype expression

To more thoroughly characterize the heterogeneity among ABCs in malaria-exposed individuals, we first validated the observation that the ABC cluster expressed a substantial amount of unswitched Ig isotype IGHD and class-switched IGHG1 but less IGHM (Fig. 2E and fig. S3C). To do so, we quantified by flow cytometry the levels of IgM, IgD, and IgG on CD19+CD20+CD10 mature B cells from a cohort of 14 Malian children. These B cells were further gated on the basis of their CD21 and CD27 expression as naïve B cells, classical MBC, ABCs, and activated MBCs (Fig. 5A). Representative flow data for one child and quantification for the 14 children are shown. As activated MBCs were less than 4% of the total B cells, they were not included in further analyses. Classical MBCs contained three populations, unswitched MBCs (IgD+IgM+ or IgD+IgMlo) (Fig. 5B, I) that were 40 and 5% of the MBCs, respectively (Fig. 5B, III), and switched IgDIgM MBCs (Fig. 5B, I) that were ~50% of MBCs (Fig. 5B, III), as previously reported (8). A large fraction of the IgDIgM classical MBCs expressed IgG (Fig. 5B, II), and IgDIgMIgG+ cells were approximately 60% of IgD IgM classical MBCs (Fig. 5A, IV). Within naïve B cells, we also identified these same three subpopulations (Fig. 5C, I and II) but at very different frequencies: IgD+IgM+ (60%), IgD+IgMlo (35%), and IgDIgM (5%) (Fig. 5C, III), as also previously reported (40). Most of the small number of IgDIgM naïve cells expressed IgG (Fig. 5C, IV). Of the ABC population, IgD+IgM+ and IgD+IgMlo were ~25% and 40 to 60%, respectively, and IgDIgM B cells comprised another ~40 to 60% of the cells (Fig. 5D, I and IV), although the percentage of the class-switched ABC population was significantly higher than unswitched ABCs. As compared to naïve B cells and classical MBCs, the ABCs had the highest expansion of IgD+IgMlo cells. The IgD+IgMlo ABCs expressed low levels of intracellular IgM (fig. S4F), suggesting that the low surface levels of IgM were not the result of internalization of IgM into the cell. In addition, we analyzed the IgD+IgM+ and IgD+IgMlo subpopulations for the expression of the ABC markers Tbet and CD11c. Only approximately 50% of IgD+IgM+ B cells expressed these markers (Fig. 5E, II), whereas nearly 100% of IgD+IgMlo B cells expressed Tbet and CD11c (Fig. 5E, III). Of the IgDIgM ABCs, more than 70% were switched to IgG (Fig. 5E, I and III) and nearly all of the IgDIgG+ ABCs also expressed high levels of Tbet and CD11c (Fig. 5E, II). This was consistent with the scRNA-seq data, which showed that IgD+IgMlo and IgG+ ABCs clusters expressed higher levels of ITGAX than cluster 2 (fig. S4D). We assessed the cell surface levels of integrin ß7 by flow cytometry on the newly identified ABCs subpopulations and found that consistent with the scRNA-seq results (fig. S4D), integrin ß7 expression was indeed higher on IgDIgG+ ABCs as compared to IgD+IgM+ and IgD+IgMlo ABCs (fig. S4G). In summary, ABCs were composed of three distinct subpopulations identified by their expression of Ig isotype, of which the IgD+IgMlo subpopulation showed the largest expansion.

Fig. 5 Heterogeneity in ABCs defined by Ig isotype expression.

(A) Flow cytometry analysis of B cells purified from peripheral blood of 14 Malian children gated as CD19+CD20+CD10cells (subject IDs indicated in table S1). A representative contour plots for one child (subject ID: 523; table S1) are shown. B cell subsets were further gated on the basis of CD21 and CD27 expression as naïve B cells (CD21+CD27), classical MBCs (cMBCs) (CD21+CD27+), and atypical B cells (ABCs) (CD21CD27). (B) (I) cMBCs were stained with antibodies specific for cell surface IgD, IgM, and IgG. (II) IgDIgM cells from (I) were assessed for IgG expression. (III) Percentages of IgD+IgM+, IgD+IgMlo, and IgDIgM cells within the cMBCs. (IV) Percentage of IgDIgG+ cMBCs in IgDIgM cMBCs. (C) (I to IV) Naïve B cells were stained and analyzed as in (B). (D) (I) Total CD21CD27 ABCs were assessed for IgD and IgM expression. IgD+IgM+ ABCs (II) and IgD+IgMlo ABCs (III) were assessed for expression levels of Tbet and CD11c. (IV) Percent of IgD+IgM+, IgD+IgMlo, and IgDIgM subsets in total CD21CD27 ABCs. (E) (I) Expression of IgG in IgDIgM ABCs. (II) Expression levels of Tbet and CD11c in IgDIgG+ ABCs. (III) Percent of IgG+ cells in IgD ABCs (*P < 0.005, **P < 0.001, and ***P < 0.0001). Comparisons of population percentages in (B) to (E) were tested for statistical significance by one-way analysis of variance (ANOVA) with Brown-Forsythe test. FSC-W, forward scatter-width; SSC-W, side scatter-width.

ABC subpopulations are differentially expanded in febrile malaria and vary in their affinity thresholds for antigen activation

Focusing further on the ABC subpopulations in malaria-exposed individuals, we determined by flow cytometry that IgD+IgMlo and IgDIgG+ ABCs represented 4.3 and 9.5%, respectively, of total Malian adult B cells and both populations were negligible in samples from unexposed U.S. donors, approximately 1% (Fig. 6A). In malaria-endemic Mali, most children are susceptible to symptomatic febrile malaria until their teens, whereas most adults are resistant, despite being susceptible to infection (41). Although the percent of total B cells that were ABCs was not significantly different in children (8 to 12 years of age) and adults (Fig. 6B), the IgD+IgMlo ABC subpopulation represented a significantly higher portion of ABCs in malaria-susceptible children as compared to adults (Fig. 6B). In contrast, the IgDIgG+ ABC subpopulation composed a larger portion of ABCs in malaria-resistant adults as compared to children (Fig. 6B).

Fig. 6 IgD+IgMlo ABCs are expanded in children upon acute febrile malaria and have high antigen affinity thresholds for activation.

(A) Percentages of IgD+IgMlo atypical B cell (ABC) (green) and IgDIgG+ ABC (orange) in total B cells of 12 Malian adult donors (subject IDs indicated in table S1) and U.S. donors (subject IDs indicated in table S1). (B) Left: Percent of CD21CD27 ABCs of total B cells in 15 Malian children (subject IDs indicated in table S1) and the 12 Malian adults in (A). Right: Percent of IgD+IgMlo (green) and IgDIgG+ ABCs (orange) in the same 15 Malian children and 12 Malian adults. (C) (I) The percent of ABC at healthy baseline (HB), upon the onset of febrile malaria (Mal), and 7 days after antimalarial treatment (7 dpt) for the Malian children in (B). Each dot represents one child. Percent of IgD+IgMlo ABCs (II), IgDIgG+ ABCs (III), and IgD+IgM+ ABCs (IV) of total ABC B cells at HB, Mal, and 7 dpt. (D) Recruitment of BCR (I), phosphorylated Igα (pIgα) (II), and phosphorylated PI3K (pPI3K) (III) to the immune synapse of B cells stimulated with either high- or low-affinity membrane-bound anti-κ. Each dot represents a single cell, and colors represent different B cell subsets (indicated on the x axis). (*P < 0.01, **P < 0.005, ***P < 0.001, and ****P < 0.0001; ns, not significant). Comparisons were tested for statistical significance by either one-way ANOVA with Kruskall-Wallis test in (A) and (B), Tukey’s range test in (C), or unpaired t test for each pair consisting of high (hi)– and low (lo)–affinity antigens (D).

In Mali, the malaria transmission season is clearly demarcated with transmission beginning in the rainy season in May when mosquitos breed and ends in December when the rains end (42). From January to April, there are no mosquitos and no malaria transmission, although children carry asymptomatic infections. We analyzed the peripheral blood B cells of the 15 children (analyzed in Fig. 6B) who were enrolled in a large longitudinal observational cohort in Mali (41). Peripheral blood was obtained from children in May, after 6 months of no malaria (healthy baseline), upon their first case of febrile malaria at which time they received fast-acting antimalarial drugs, and 7 days posttreatment (7 dpt). Although the percent of ABCs within the total B cell population did not vary over the course of the malaria season (from baseline to febrile malaria to 7 dpt) (Fig. 6C, I), the percent of IgD+IgMlo ABCs increased from baseline upon febrile malaria and returned to baseline 7 dpt (Fig. 6C, II). The numbers of IgDIgG + ABCs did not change from healthy baseline to febrile malaria but appeared to decline upon treatment to levels lower than baseline (Fig. 6C, III). In contrast, the percent of IgD+IgM+ ABCs did not change upon febrile malaria but increased 7 dpt (Fig. 6C, IV). These results suggest that IgD+IgMlo and IgDIgG+ ABCs are activated in children with febrile malaria and may play a role in controlling the acute disease. In contrast, IgD+IgM+ ABCs expanded upon resolution of the disease.

We next explored the response of these ABC subpopulations to antigen stimulation in vitro with particular focus on the affinity threshold for antigen-dependent activation. We recently provided evidence that antigen affinity thresholds are an intrinsic feature of B cells in at least two distinct differentiated states, namely, naïve B cells and GC B cells (22). Using the same experimental approach, we measured the accumulation of BCRs (Fig. 6D,I) and levels of phospho-Igα (Fig. 6D, II) and phospho–phosphatidylinositol 3-kinase (pPI3K) (Fig. 6D, III) in the immune synapse of B cells activated on planar lipid bilayers (PLBs) that contained either high- or low-affinity antibodies specific for the BCR κλ light chains, as surrogate antigens. The most significant differences between responses to high- versus low-affinity antigens were observed for IgD+IgMlo and IgDIgG+ ABCs (Fig. 6D, I to III). In both cases, B cells were less able to respond to low- versus high-affinity antigen, indicating a high antigen affinity threshold for activation. In contrast, IgD+IgM+ naïve B cells, IgD+IgM+ unswitched MBCs, and IgD+IgM+ ABCs showed little or no difference in their response to low- versus high-affinity antigens. Together, these data suggest that the expansion of IgD+IgMlo and IgDIgG+ ABCs may selectively reduce responses to low-affinity antigens.

IgD+IgMlo and IgD+IgM+ ABCs have biased Ig heavy-chain gene usage and distinct CDR3 properties

The observations that IgDIgG+ and IgD+IgMlo ABCs are expanded in malaria and have higher affinity thresholds for responses to membrane-associated antigens suggests that these subsets may harbor unique antibody repertoires selected by chronic malaria exposure and regulated via affinity thresholds. To explore this possibility, we performed a systematic analysis of the immune repertoires of ABCs from three children living in malaria-endemic Mali. Analysis of the V and D gene usage of the Ig heavy chain in ABC subsets and classical MBCs showed biased V and D gene usage between unswitched and switched ABCs (Fig. 7A and fig. S7C). IgD+IgMlo ABCs more frequently used distal heavy-chain joining region (JH) genes that are preferentially used by autoreactive B cells (43) including JH06-01, as compared to naïve B cells, IgDIgG+ ABCs, and classical MBCs (fig. S7, A and B). The expansion of ABCs and skewing of VH, DH, and JH gene usage among ABC subsets and classical MBCs suggested a role for antigen-driven selection in directing development of these subpopulations.

Fig. 7 Analysis of Ig repertoire features in B cell subsets from Malian children.

(A) VH usage between naïve, IgD+IgM+ atypical B cells (ABC), IgD+IgMlo ABC, switched IgDIgG+ ABC, and classical MBC (cMBC) are shown in a heatmap depicting fold change differences in frequency relative to naïve B cells. Frequencies of VH sequences comprising the naïve B cell compartment are given on the left. An asterisk denotes the VH genes referred to in the text. Data are from three Malian children (subject IDs: 554, 562, and 566; table S1) (B) A comparison of somatic hypermutation in VH between naïve, IgD+IgM+ ABC, IgD+IgMlo ABC, switched IgDIgG+ ABC, and cMBC analyzed by one-way ANOVA with Kruskall-Wallis test. (C) Heavy-chain CDR3 features of charge and amino acid length analyzed by one-way ANOVA with Kruskall-Wallis test. (D) Representative Kidera factor (KF) analysis of CDR3 from indicated B cell subsets, including KF2 (side-chain preference), KF3 (extended structure preference), KF7 (flat extended preference), and KF10 (surrounding hydrophobicity). (E) Comparison of Shannon index of diversity calculated from total CDR3 amino acid sequences. (F) The percentage of heavy-chain CDR3 peptide sequences shared of IgD+IgM+ ABC, IgD+IgMlo ABC, and IgDIgG+ ABC subsets. (*P < 0.05, **P < 0.01, and ***P < 0.0001).

We determined the levels of SHM in sorted naïve, classical MBC, and ABC subpopulations and found that unswitched B cell subsets (which include naïve B cells, IgD+IgM+ ABCs, and IgD+IgMlo ABCs) had lower percentages of SHMs as compared to switched B cells (which include classical MBCs and IgDIgG+ ABCs). However, IgD+IgMlo ABCs contained a significantly higher percentage of SHMs as compared to naïve B cells and IgD+IgM+ ABCs (Fig. 7B), further supporting a role for antigen-driven selection. Given the association between B cell autoreactivity and malaria, we measured physicochemical characteristics in their CDR3s and found that IgD+IgM+ ABCs expressed more positively charged CDR3s, as compared to IgD+IgMlo ABCs that expressed longer CDR3s (Fig. 7C and fig. S7E), suggesting that ABCs in malaria expressed BCRs with features associated with autoreactivity and polyreactivity. No significant differences in CDR3 hydrophobicity were observed (fig. S7D). In contrast, IgDIgG+ ABCs, on average, expressed a negatively charged CDR3 and had shorter length CDR3 and were more similar to classical MBCs (Fig. 7C). We then undertook Kidera factor (KF) analysis to evaluate additional physicochemical properties and found that the BCRs of IgDIgG+ ABCs were significantly different from the repertoires of unswitched ABCs but were similar to the classical MBC repertoire (Fig. 7D and fig. S7F). In contrast, helix/bend preference (KF1), side-chain preference (KF2), hydrophobicity (KF4), flat extended preference (KF7), occurrence in α chain (KF8), and surrounding hydrophobicity (KF10) were significantly different between IgD+IgM+ and IgD+IgMlo ABCs (Fig. 7D and fig. S7F), providing additional evidence that their repertoires contained distinct antigen-binding characteristics.

Last, to understand clonal relationships between the IgD expressing ABCs and other mature B cell subsets, we quantified CDR3 diversity and overlap. Naïve B cells had the highest BCR repertoire diversity, indicating the least antigen-driven narrowing of the repertoire, as compared to classical MBCs and IgDIgG+ ABCs, which had less diverse repertoires (Fig. 7E). IgD+IgM+ and IgD+IgMlo ABCs had the least diversity of all B cell subpopulations (Fig. 7E and fig. S7G), again suggesting a role for antigen selection for B cell differentiation into these subsets. Analysis of shared CDR3 sequences revealed that IgD+IgM+ and IgD+IgMlo ABCs shared more sequences between them (1.2%) than shared with naïve B cells, less than 0.1% for both (Fig. 7F). Switched IgDIgG+ ABCs exhibited low CDR3 sequence overlap with naïve, IgD+IgM+ ABCs, and classical MBCs. Strikingly, unswitched IgD+IgMlo and switched IgDIgG+ ABCs contained absolutely no overlap (Fig. 7F). Together, these data provide evidence that IgD+IgM+ and IgD+IgMlo ABCs are clonally distinct from naïve, switched ABCs, and classical MBCs and that their B cell repertoires exhibit distinct physicochemical features that may result from the selective pressures exerted by repeated exposures to malaria.


It was only relatively recently that we gained an appreciation that pathogens that cause chronic diseases in humans have a powerful impact on the composition of the B cell compartment (2). The expansion of ABCs in individuals with chronic infections was one of the first well-documented pathogen-driven alterations of the human B cell compartment. Here, we carried out an in-depth analysis of gene transcription of B cells at a single-cell level to access the extent to which chronic malaria and HIV infections affected the B cell compartment. We first used bulk RNA-seq of four B cell subsets (naïve, ABCs, and classical and activated MBCs) sorted using conventional markers to identify unique transcriptional profiles for these subsets. In so doing, we validated a new set of cell surface markers, cytokines, and TFs associated with ABCs that may provide insight into their function, heterogeneity, origins, and potential contributions to health and disease.

We reported the results of the first scRNA-seq comparisons of the B cell compartment in malaria-exposed and HIV-infected individuals. Our results showed that beneath the well-documented B cell subsets in malaria and HIV infection, defined by the expression of CD21 and CD27, namely, naïve B cells, classical MBCs, activated MBCs, and ABCs, a second layer of heterogeneity existed. The data not only revealed distinct B cell clusters in each of the major subsets but also identified clusters that did not share gene signatures with any of the four subsets, suggesting an even greater degree of heterogeneity yet to be explored. Within naïve B cells, we identified multiple clusters that showed different extents of activation. In both malaria and HIV, naïve B cells expressed a gene signature associated with BCR-driven activation. We also found evidence of IL-6 exposure in naïve B cell clusters in malaria. IL-6 induces CD4+ T cell and B cell activation and is well studied in malaria (44). Trajectory analysis of malaria scRNA-seq data provided evidence that the multiple naïve and classical MBC clusters may represent differentiating cell states that characterize intermediate cell types between naïve B cells and terminally differentiated cells, including class-switched classical MBCs and ABCs. A recent study provided additional new information concerning the composition of malaria-specific isotype-switched B cells in malaria-exposed Kenyan children and adults showing that P. falciparum-specific B cells were distributed in three different subpopulations, defined as ABCs, MBCs, and CXCR3+ MBCs (45). Total isotype-switched B cells from two malaria exposed adults and two unexposed adults were also analyzed by scRNA-seq, and the results provided evidence of heterogeneity within ABCs (three clusters), MBCs (three clusters), and activated B cells (two clusters) similar to our description here of new heterogeneities in all B cell subpopulations including naïve B cells that were not included in the Sutton et al. (45) analyses. A pseudotime trajectory was also provided, showing that MBCs and ABCs independently differentiated from naïve B cells. In contrast, our pseudotime trajectory clearly showed that both MBCs and ABCs differentiated from common naïve B cell precursors in a process driven by IFN-γ. Further studies are needed to provide a mechanistic understanding of how these distinct activation states and cytokine exposure in naïve B cells lead to differentiation to different cell fates and affect immunity.

Using the transcriptional information from the ABCs identified in scRNA-seq of malaria and HIV infection, we then compared the transcriptional profiles of ABCs across a spectrum of disease settings. Cells with phenotypes resembling but not identical to ABCs in malaria have been reported in other chronic infections and in autoimmune diseases and have been thought to be unresponsive (2), anergic (39), or exhausted (35). Despite such differences, we found that malaria ABCs were notably similar to those in HIV, SLE, RA, and CVID. ABCs from malaria exposure and HIV infection integrated into a single cluster that was equally represented by both diseases. These findings suggest that ABCs may represent a separate lineage with common drivers active in chronic immune environments, leading to the expansion of cells with shared but not identical transcriptional programs. The significant difference in the transcriptional programs described for ABCs in malaria and in HIV likely reflects differences in a variety of clinical features of the diseases including the disease environments and infection dynamics (adult HIV donors were actively infected in contrast to malaria donors who were not parasitemic at the time of analyses). Further characterization of the granularity and functionality of these B cells in other chronic infectious diseases and autoimmune diseases may be achieved through scRNA-seq–based profiling.

Our novel analysis of the transcriptional profiles of ABCs highlighted the importance of cytokines produced during malaria infection in regulating the function and development of ABCs. Although the differentiation of ABCs was induced by IFN-γ, as observed by scRNA-seq trajectory analysis, the malaria ABC gene signature itself included genes encoding the anti-inflammatory cytokine IL-10 and the IL-10 receptor, which we validated by qRT-PCR and flow cytometry, respectively. The malaria-driven expansion of IL-10–expressing ABCs suggests the possibility that they may play a protective role during acute febrile malaria. Similarly, a switch in the cytokines expressed by CD4+ T cell pre- and postfebrile malaria from predominately proinflammatory (including IFN-γ) to anti-inflammatory (including IL-10) was shown earlier in individuals from the same longitudinal cohort from which we drew Malian B cell donors (46). These earlier data provided evidence that upon onset of febrile malaria, CD4+ T cells produced IFN-γ to induce inflammation necessary to limit parasite growth and then switch to IL-10 production to minimize the damage of inflammation to the host.

Characterization of the isotype distribution of ABCs in malaria revealed a previously unappreciated heterogeneity in the ABC compartment. Ig isotype-based heterogeneity has been previously observed in total B cells (47) and in various human and mouse B cell compartments including ABC in human malaria (48, 49) and an FCRL5+ B cell compartment (50) observed in Plasmodium chabaudi–infected mice. However, these studies did not fully explore the function and dynamics of these B cell subpopulations. We found that within the ABC subset, three distinct subpopulations existed, defined as IgD+IgMlo, IgD+IgM+, and IgDIgG+. We followed these subpopulations longitudinally in Malian children beginning at the end of the dry season, during which there was no malaria transmission, into the wet season with the onset of their first case of febrile malaria, and then 7 days after antimalarial treatment. Although the total number of ABCs did not change over time, the percent of IgD+IgMlo ABCs increased with the onset of febrile malaria and then decreased to premalaria levels by 7 dpt. Of interest, in RA and CVID, which are associated with an expanded CD21−/lo population transcriptionally similar to ABCs in malaria, IgD+IgMlo cells make up to 30% of the CD21−/lo subset, as compared to only 5% in the CD21−/lo population in healthy controls (39).This suggests that the expansion of IgD+IgMlo ABCs may play a role in clinical disease and understanding the biology of these cells in chronic infections and autoimmunity may be crucial to understanding the acquisition of immunity.

The distinct BCR repertoires of the heterogeneous ABC compartment revealed potential implications for the origins of these subpopulations. The B cell repertoire expressed by IgD+IgMlo ABCs accumulated more SHMs as compared to naïve B cells and IgD+IgM+ ABCs but far fewer SHMs as compared to switched IgDIgG+ ABCs and classical MBCs, suggesting different degrees of participation in GC reactions. This observation also showed that unswitched ABCs are more heterogeneous in their levels of SHM than our earlier studies using bulk Ig-seq indicated (8). Recently, comprehensive confocal microscopy and quantitative imaging studies provided evidence that TbethiCD19hi ABCs in the lymph nodes of chronically infected HIV-viremic individuals distinctly accumulated outside GCs and expressed reduced GC-homing receptors (36). In addition, HIV-specific B cells were enriched in the TbethiCD19hi ABCs. These cells were shown to be related to GC B cells by BCR-based phylogenetic linkage but had lower BCR mutation frequencies and reduced HIV-neutralizing capacity, consistent with diminished participation in GC-mediated affinity selection. Our trajectory analysis revealed that classical MBCs and ABCs in malaria may be derived from a similar precursor and suggested the critical importance of the inflammatory cues that may influence their differentiation. Results presented here and published earlier (8) are consistent with these findings, which ABCs and classical MBCs shared similar replication histories and repertoire characteristics, suggesting that they are derived from a common precursor but undergo different antigen-driven differentiation pathways in malaria.

Systematic analysis of the B cell repertoire in children exposed to malaria provided additional support that IgD+IgM+ and IgD+IgMlo ABCs represent unique subsets with features distinct from naïve, switched ABCs, and classical MBCs. IgD+IgM+ and IgD+IgMlo ABCs exhibited marked VH gene usage and CDR3 features, and shared antibody sequences between naïve and unswitched ABCs were rare. In addition, CDR3 characteristics associated with autoreactivity and polyreactivity were found specifically in unswitched ABCs. For example, IgD+IgM+ ABCs were enriched in antibodies having positively charged CDR3, a property common to antibodies reactive against negatively charged self-antigens including double-stranded DNA (51). Furthermore, IgD+IgM+ ABCs exhibited biased usage of VH4-34, which confers inherent autoreactivity to B cell surface and nucleosomal self-antigens (52). In addition, our previous analysis of inherently autoreactive VH4-34–expressing IgG+ B cells and antibodies in children and young adults in malaria-endemic Mali suggested a window of time during which autoreactive B cells were activated in the setting of acute febrile malaria that waned in adulthood as immunity to malaria was acquired (12). On the basis of these findings and the distinct repertoire features of IgD+IgM+ and IgD+IgMlo ABCs, we suggest that malaria-derived antigens may modulate the development of unswitched ABCs with autoreactive potential in a manner dependent on the inflammatory milieu.

Perhaps the most remarkable feature of the IgD+IgMlo and IgDIgG+ ABCs is their high antigen affinity threshold for activation. Our recent studies provided evidence that affinity thresholds for BCR-induced activation of B cells is an intrinsic feature of the differentiated state of B cells (22). We showed that GC B cells had high antigen affinity thresholds for activation likely to facilitate the selection of high-affinity B cells during the GC process of affinity maturation. In the case of IgD+IgMlo and IgDIgG+ ABCs, we suggest that the high-affinity threshold for antigen-driven activation may function to limit the B cell response to low-affinity self- or pathogen-derived antigens during chronic infections accompanied by chronic antigen exposure. In summary, extensive transcriptional analysis identified unique subpopulations of B cells that expand in malaria and HIV and provided a wealth of information concerning the nature and function of these subpopulations that may contribute to the development of therapies to treat chronic infections and vaccines to prevent them.


Ethics statement

The studies were approved by the Ethics Committee of the Faculty of Medicine, Pharmacy, and Dentistry at the University of Sciences, Technique, and Technology of Bamako and the Institutional Review Board of the National Institute of Allergy and Infectious Diseases, National Institutes of Health (NIH). Written informed consent was obtained from participants, parents, or guardians of participating children before inclusion in the Mali study.

Donor samples from Mali and study site

The field study was conducted in Kalifabougou, Mali where intense P. falciparum transmission occurs from June to December each year. The cohort design has been described in detail elsewhere (41). Briefly, 695 healthy children and adults aged 3 months to 25 years were enrolled in an ongoing cohort study in May 2011. Children with hemoglobin level < 7 g/dl, axillary temperature of 37.5°C, acute systemic illness, underlying chronic disease, or using antimalarial or immunosuppressive medications in the past 30 days were excluded from the study. The present study was done on 31 children aged 3 to 12 years who had venous blood samples collected at their healthy uninfected baseline before the malaria season, during acute infection, and following treatment 7 days after malaria of their first acute malaria episode of the ensuing 6-month malaria season. Clinical malaria episodes were detected through active and passive surveillance and were defined as 2500 asexual parasites/μl, an axillary temperature of 37.5°C, and no other cause of fever discernible by physical exam. Malian adults aged 17 to 26 years were included in this study as well. Their clinical symptomology, wherever available, are given in table S1. All individuals with symptoms of malaria and any level of parasitemia detected by microscopy were treated according to the Malian National Malaria Control Program guidelines.

HIV-infected donor samples

Peripheral blood samples were obtained from three HIV-infected adult donors enrolled in NIH Institutional Review Board–approved protocol 02-I-0202 after providing written informed consents. Their clinical features are given in table S1.

U.S. healthy donors

Peripheral blood samples from healthy U.S. adult donors enrolled in NIH protocol no. 99-CC-0168 were used as healthy controls (table S1). Demographic and travel history data were not available from these anonymous donors, but prior P. falciparum exposure is unlikely.

Flow cytometry

For cell surface staining and sorting, cryopreserved peripheral blood mononuclear cells (PBMCs) were rapidly thawed in RPMI supplemented with 10% fetal bovine serum (FBS) and stained with indicated antibodies (table S3) in phosphate-buffered saline (PBS) with 1% bovine serum albumin and Fixable Near-IR Dead cell stain dye (Thermo Fisher Scientific) to exclude dead cells. Cells stained for sorting were kept on ice until sorted on a FACSAria cell sorter (BD Biosciences) and collected in PBS containing 1% FBS. To stain for intracellular TFs, cells were fixed and permeabilized with eBioscience FoxP3/Transcription Factor Staining Buffer Set (Thermo Fisher Scientific), and subsequent staining was carried out with antibodies diluted in permeabilization buffer. All data were collected on a BD Fortessa cell analyzer (BD Biosciences), and data analysis was done with FlowJo software (Tree Star).

RNA-seq sample processing

PBMCs from Malian adults were sorted for B cells as CD19+CD20+CD10Dump (CD3, CD14, CD16, and CD56) and further into subsets: naïve (CD21+CD27), classical MBCs (CD21+CD27+), activated MBCs (CD21CD27+), and ABCs (CD21CD27). Sorted cells were centrifuged at 4°C and resuspended in 300 μl of RLT buffer and stored at −80°C. Samples were thawed at room temperature, and 300 μl of RLT/BME was added. An aliquot of 0.5 ml of lysate was taken and passed through a Qiagen QIAshredder column to fragment genomic DNA (gDNA) in the sample. DNAs and RNAs were extracted using a Qiagen AllPrep 96 kit as described by the manufacturer (Qiagen, Valencia, CA), except each sample was treated with 27 U of deoxyribonuclease I (Qiagen, Valencia, CA) for 15 min at room temperature during extraction to further remove gDNA during RNA extraction. RNA was quantitated using the VICTOR X3 Multilabel Plate Reader (PerkinElmer, Santa Clara, CA) and Quant-iT RiboGreen RNA reagent (Molecular Probes Inc., Eugene, OR). RNA quality was also assessed using Agilent’s 2100 Bioanalyzer and the RNA Pico 6000 Kit (Agilent Technologies, Santa Clara, CA). cDNA synthesis was completed using the SMART-Seq v4 Ultra Low Input RNA Kit for Sequencing (Takara Bio USA Inc., Mountain View, CA) according to the manufacturer’s protocol. Using the guidelines for cDNA amplification, all samples underwent eight cycles except for the activated MBC samples, which were amplified for 12. The cDNA was brought up to 130 μl in volume and sheared on the Covaris LE220 (Covaris Inc., Woburn, MA) using the shearing parameters for gDNA targets of 200–base pair (bp) products with an extension of the 175-s shear time to 300 s [peak incident power (PIP), 450; 30% DF, 200 bursts/cycle, 5 min]. The sheared cDNA was purified with Agencourt AMPure XP beads (Beckman Coulter, Indianapolis, IN) and quantitated on the Qubit fluorometer (Thermo Fisher Scientific) using the high-sensitivity picogreen assay. Approximately 600 pg was used as input into the Low Input Library Prep Kit v2 (Clontech/Takara Bio USA Inc., Mountain View, CA) and prepared following the manufacturer’s protocol. Final next-generation sequencing (NGS) libraries were assessed on a Bioanalyzer 2100 using High Sensitivity chips (Agilent Technologies, Santa Clara, CA) and quantified using the Kapa Quant Kit for Illumina sequencing (Kapa Biosystems Inc., Wilmington, MA). The libraries were normalized to 0.5 nM, pooled equally, and prepared for sequencing on the NextSeq (Illumina, San Diego, CA) following the user manual guidelines for two mid output 2 × 76 cycle runs.

RNA-seq data processing

The Pipeliner RNA-seq workflow ( was used for quality assessment and differential expression. Raw NGS reads were trimmed to remove adapters and low-quality regions using Trimmomatic v0.33 and were aligned to the human hg38 reference genome and Gencode release 26 annotation using STAR v2.5.2b. Read and alignment quality were assessed using MultiQC v1.1 to aggregate quality control (QC) metrics from Picard ( Final read counts from mapped transcripts based on the combined replicates and normalized, and then differentials for each comparison were generated using DESeq2 (53) for each subset of cells. Averaged replicates were used to generate gene signatures for all B cell subsets (table S2). Heatmap figures were generated using pheatmap package (, and hierarchical clustering was plotted using the Euclidean distance method.

scRNA-seq sample processing and sequencing

B cells were isolated from frozen-thawed PBMCs using the EasySep Human B cell Isolation Kit (STEMCELL Technologies) on a RoboSep platform. Cells were encapsulated in droplets using the Chromium Controller (10x Genomics). Reverse transcription and cDNA amplification were performed using the Chromium Single Cell 3′ Kit (10x Genomics). All steps were done according to the manufacturer’s instructions. One hundred nanograms of cDNA (25% of generated cDNA) was used as input for library preparation. Libraries were sequenced using the Illumina HiSeq 3000 platform eight lanes of sequencing at 26 bp–8 bp–98 bp according to 10x Genomics sequencing recommendations.

scRNA-seq data processing

The sample replicates were split into eight HiSeq 2500 sequencing lanes, and for each replicate, the lanes were merged into a single-count matrix using Cell Ranger v3.0 (10x Genomics). Further, the resultant counts matrices for each replicate were processed using an in-house scRNA-seq pipeline that uses Seurat v3.0 (54) to initially integrate data from three Malian and three U.S. donors. The batch-corrected matrix was then filtered for cells expressing <200 genes or with >5% mitochondrial genes. After filtration, the data were normalized using quantile normalization followed by principal components analysis (PCA). Principal components (PCs) 1 to 12 were chosen for clustering analysis, as there was very little additional variance observed beyond PC 12. Cells were then clustered on the basis of PC scores using the Louvian-Jaccard method. tSNE was used to visualize single cells in a two-dimensional (2D) embedding. Gene signatures generated from the bulk RNA-seq were used for identification of clusters based on highest ranking of signature gene expression (next section). For reclustering of ABCs, cluster 5 from the data shown in Fig. 2C (IV) was isolated, followed by normalization and PCA. PCs 1 to 10 were chosen on the basis of the highest variance observed. Cells were clustered on the basis of PC scores, and tSNE was used to visualize cells in 2D space. Differential gene expression between the clusters was assessed by model-based analysis of single cell transcriptomics (MAST). Heatmaps and ridgeline plots were generated using the ggplot2 package in R.

Signature analysis of B cell activation states

Signature analysis was done as described previously (55). In addition to the bulk RNA-seq gene signatures generated in this study for malaria and HIV B cell subsets, orthogonal approaches were used to generate signatures from other studies. These included gene signatures corresponding to B cell activation through the BCR (29), SLE DN2 cells (14), RA, and CVID CD21−/lo cells (39), and signatures from cells treated with different cytokines (15, 5658).

scRNA-seq gene signature computation

Data were first scaled (z score across each gene) to remove bias toward highly expressed genes. Given a gene signature, a cell-specific signature score was computed by first sorting the normalized scaled gene expression values for each cell followed by summing up the indices (ranks) of the signature genes. For gene signatures with an up-regulated and down-regulated set of genes, two ranking scores were obtained separately, and the down-regulated signature scores were subtracted from the up-regulated signature scores. A contour plot was added on top of the tSNE space, which takes into account only those cells that have a signature score above the mean to further emphasize the region of highly scored cells.

Real-time qRT-PCR

Cells were sorted as described, and total RNA was isolated using a PicoPure RNA isolation kit (Thermo Fisher Scientific). cDNA was synthesized using SuperScript VILO (Thermo Fisher Scientific). Real-time PCR was carried out using indicated TaqMan probes (table S3) and TaqMan gene expression master mix (Thermo Fisher Scientific) on an ABI QuantStudio 6Flex qPCR machine. Relative expression was normalized using hypoxanthine-guanine phosphoribosyltransferase and 18S rRNA expression.

gDNA sequencing

Cells from Malian donors were sorted into naïve, classical MBCs, and ABCs on the basis of CD21 and CD27 expression, and ABCs were further sorted on the basis of surface expression IgD, IgM, and IgG into IgD+IgM+, IgD+IgMlo, and IgDIgG+ ABCs. gDNA was isolated using the QIAamp DNA Blood Mini Kit (Qiagen, Valencia, CA); Ig heavy chains were sequenced using the immunoSEQ platform (Adaptive Biotechnologies).

CDR3 measurements

gDNA sequencing data described above were used to analyze CDR3 sequences. VH, DH, and JH usage and frequency of shared sequences were used for initial comparisons of clonal relatedness between the various MBC subsets. To determine structural consequences at the level of antibody structure and potential antigen recognition, amino acid usage was measured and compared across subsets. KFs (which include measurements of the following: (i) α helix or bend structure, (ii) bulk, (iii) β structure preference, (iv) hydrophobicity, (v) normalized frequency of double bend, (vi) average value of partial specific volume, (vii) average relative fractional occurrence in Eo, (viii) normalized frequency of α region, (ix) pK-C, and (x) surrounding hydrophobicity in β structure) were calculated from CDR3 amino acids as an additional measure of BCR repertoire using the BRepertoire package (59) in R. Sequence overlap was calculated as the percentage of shared amino acid CDR3 sequences divided by the total number of sequences from each subset (A∩B/A∪B).

Affinity discrimination by B cell subsets

For testing the ability of various B cell subsets to discriminate between high- and low-affinity antigens, B cells were sorted by the expression of CD21 and CD27 into each subset: naïve B cells (CD21+CD27), classical MBC (CD21+CD27+), and ABC (CD21CD27) and stimulated on PLB containing antigens. BCR signaling was measured by quantifying the recruitment of BCR and downstream signaling molecules to the immune synapse. Anti-human Igκ and anti-rat Igκ were used as surrogate antigens to test the B cell response to high-affinity antigens and low-affinity antigens, respectively, as described earlier. PLB was prepared as previously described (22) in individual chambers formed by securing CultureWell reusable silicone gasket (Grace Bio Labs) to 24 mm by 50 mm, no. 1.5 borosilicate coverglass. Briefly, 110 μM small unilamellar vesicles consisting of 1,2-dioleoyl-sn-glycerol-3-phosphocholine and 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine-N-(cap biotinyl) (Avanti Polar Lipids) in a ratio of 100:1. To bind anti-human Igκ or anti-rat Igκ to the PLB, the wells containing PLB were incubated at room temperature with streptavidin (2.5 μg/ml) for 10 min, followed by biotinylated mouse anti-human Igκ or mouse anti-rat Igκ (1 μg/ml) (Thermo Fisher Scientific) for 20 min. The BCRs on sorted B cells were labeled with DyLight 594–conjugated goat anti-human IgM, Fc5μ fragment–specific Fab fragment (10 μg/ml; Jackson ImmunoResearch) and Alexa Fluor 488–conjugated goat anti-human IgD Fab fragment (10 μg/ml; Southern Biotech) or DyLight 594–conjugated goat anti-human IgG, Fcγ fragment–specific Fab fragment (10 μg/ml; Jackson ImmunoResearch). Around 2000 to 3000 B cells were dropped in chambers containing PLB–anti-human Igκ or PLB–anti-rat Igκ and incubated at 37°C for 10 min, followed by fixation with 4% paraformaldehyde. For intracellular phospho-Ig α (plg-α) and pPI3K staining, cells were permeabilized with 0.1% Triton X-100 for 10 min at room temperature, followed by overnight staining at 4°C with antibodies against plg-α (pY182) and pPI3K p85(Y458)/p55(Y199) (Cell Signaling Technology). Images were acquired using Nikon TIRF microscope system equipped with an CFI HP Apochromat TIRF 100×/1.49 oil objective lens (Nikon); a TIR controlling system; EMCCD camera; and 405-, 488-, 561-, and 640-nm laser lines controlled by NIS-Elements (Nikon). Mean fluorescence intensity values of BCR, plg-α, and pPI3K within the immune synapse were calculated from background-subtracted images using MATLAB (MathWorks) software.

Pseudotime and BCR trajectory analysis

Cells from one Malian donor were clustered using Seurat 3.0. Ten PCs were included, and nine clusters were obtained, of which cluster 7 expressed CD3E and was removed. Samples were analyzed using Monocle DDR-Tree, a method based on reverse graph embedding (60). Single-cell trajectory analysis allowed us to investigate the development rout of a certain type (or subtype) of cells within a single sample. Here, we analyze B cells at different differentiation states. By using trajectory analysis, we organize them along the developmental cell state, creating a differentiation tree. This algorithm gives each cell a pseudotime value, which represents the distance (relative time) of every cell in the dataset from a group we defined as the root of the tree. We decided that the root of the trajectory would be the state with the highest naïve B cell signature score. For displaying the ABC signature on the trajectory, we used the signature score function, as was previously described.

V and J gene usage from scRNA-seq data

For the BCR analysis, we integrated BCR data into the Seurat object based on the cell barcodes. V and J genes that were expressed on at least 10 cells were further analyzed. Percentage of cells expressing these genes was calculated for each in cluster.

Quantification and statistical analysis

Values are expressed as means ± SEM, as indicated in figure legends wherever applicable. Statistical analyses were performed by chi-square test, Mann-Whitney test, or one-way analysis of variance (ANOVA) test followed by Brown-Forsythe test and Kruskall-Wallis test, unless otherwise indicated. Differences were considered statistically significant at P < 0.05 and are reported for each figure part where they are applicable. Analyses were performed with Prism (GraphPad Prism version 7.0).

Data availability

The sequence data generated in this study have been deposited in the Gene Expression Omnibus as a superseries under the accession number GSE149729. The software used for individual and integrated analyses are described and referenced in the individual sections in Materials and Methods. The data needed to evaluate the conclusions in the paper are present in Supplementary Tables.


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank the residents of Kalifabougou, Mali for participating in this study. We also thank A. J. Athman for assistance with figures and artwork and K. Kanakabandi and S. Ricklefs for bulk RNA-seq sample processing and library preparation. Funding: This work was supported by the Intramural Research Program of the NIH, National Institute of Allergy and Infectious Diseases. A.M. was supported by the Alon Fellowship for Outstanding Young Scientists, Israel Council for Higher Education. Author contributions: Conceptualization: P.H., A.M., and S.K.P. Methodology: P.H., B.D., E.G., and A.M. Formal analysis: P.H., B.D., N.R., E.G., A.K.B., J.W.A., and A.M. Investigation: P.H., P.L., L.K., D.S., and A.A.A. Resources: S.M., L.Y., and S.K.P. Writing (original draft): P.H., N.R., A.M., and S.K.P. Software: A.M. Supervision: A.M., H.S., and S.K.P. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Raw data files are deposited at the Gene Expression Omnibus (GEO) as a superseries (Accession umber: GSE149729). Additional data related to this paper may be requested from the authors.

Stay Connected to Science Advances

Navigate This Article