Research Article | DISEASES AND DISORDERS

Necrotizing enterocolitis is preceded by increased gut bacterial replication, Klebsiella, and fimbriae-encoding bacteria

Science Advances  11 Dec 2019:
Vol. 5, no. 12, eaax5727
DOI: 10.1126/sciadv.aax5727
  • Fig. 1 Metagenomic characterization of 1163 samples from 160 premature infants.

    (A) Schematic of metagenomics versus genome-resolved metagenomics. Metagenomics involves DNA extraction from a microbiome sample, followed by library preparation and sequencing. In genome-resolved metagenomics, this is followed by sequence assembly and binning to generate draft-quality microbial genomes. (B) Metagenomes were characterized using database-free and database-reliant methods. The number of features in each category is listed in parentheses. See Materials and Methods for details. (C) Flow chart of the 160 premature infants recruited for inclusion in this study from the same neonatal intensive care unit over a 5-year period. Pre-NEC and control samples are a subset of the total fecal samples that are matched for DOL, gestational age, and recent antibiotic administration (Ab), and for NEC infants, samples are within 2 days before NEC diagnosis. The median and SD of matched metrics are reported.

  • Fig. 2 Comparison of microbes in premature infants who do and do not develop NEC.

    (A) The compositional profile of microbes colonizing infants who were and were not diagnosed with NEC. Bacteria were classified on the basis of their phyla, and other microbes were classified on the basis of their domain. Each color represents the percentage of reads mapping to all organisms belonging to a taxon, and the stacked boxes for each sample show the fraction of reads in that dataset accounted for by the genomes assembled from the sample. Proteobacteria were subdivided into the family Enterobacteriaceae and other. All relative abundance values were averaged over a 5-day sliding window. Boxplots show the DOL in which samples were collected (top) and in which infants were diagnosed with NEC (bottom). (B) Principal components analysis (PCA) based on weighted UniFrac distance for all samples from NEC infants (red) and control infants (black). (C and D) Percentage of NEC infants versus the percentage of non-NEC infants colonized by strains of (C) bacteria or (D) bacteriophage (gold) and plasmids (blue). The taxonomies of four strains with extreme values are provided, of which only K. pneumoniae strain 242_2 is significantly enriched in NEC samples (P < 0.05, Fisher’s exact test). Colonization by bacteria is defined as the presence of a strain at ≥0.1% relative abundance. Plasmid and bacteriophage detection required a read-based genome breadth of coverage of ≥50%. Each dot represents a strain, and dashed lines show a 1:1 colonization rate.
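
The colonization rule and enrichment test in panels (C) and (D) can be illustrated with a minimal sketch. The abundance values are invented for illustration. Pooling samples per infant (an infant counts as colonized if any sample reaches the threshold) and using a two-sided test are assumptions, since the caption specifies neither. Only the ≥0.1% threshold and the use of Fisher's exact test come from the caption.

```python
from math import comb

COLONIZATION_THRESHOLD = 0.1  # percent relative abundance, as stated in the caption


def is_colonized(abundances_percent):
    """True if any sample reaches the threshold (per-infant pooling is an assumption)."""
    return any(a >= COLONIZATION_THRESHOLD for a in abundances_percent)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x colonized infants in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


# Hypothetical relative abundances (%) of one strain, one list per infant.
nec_infants = [[0.0, 0.4], [1.2], [0.05, 0.3], [0.0]]
control_infants = [[0.0], [0.02], [0.0, 0.0], [0.5]]

nec_pos = sum(is_colonized(x) for x in nec_infants)
ctl_pos = sum(is_colonized(x) for x in control_infants)
p_value = fisher_exact_two_sided(
    nec_pos, len(nec_infants) - nec_pos, ctl_pos, len(control_infants) - ctl_pos
)
```

Each strain would contribute one point to the scatter plot (percentage of NEC infants colonized versus percentage of non-NEC infants colonized) and one P value from its own 2x2 table.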

  • Fig. 3 Bacterial replication rates are significantly higher before NEC development.

    (A) Replication rates for bacterial groups relative to day of NEC diagnosis. Dots represent the mean value for each group on each day, and error bars represent SEM. Only DOL for which growth rates were calculated from at least five infants are shown. (B) Growth rates in control (white) versus pre-NEC (gray) samples. P values are from Wilcoxon rank sum tests.

  • Fig. 4 ML identifies differences between pre-NEC and control samples.

    (A) Sum of all individual importances for each feature category. The number of features in each category is listed in parentheses. (B) Importance of all individual features associated with NEC with classifier importances over 1%. (C) Signed importances of all individual KEGG modules (top, red) and secondary metabolite clusters (bottom, blue). Negative values are negatively associated with pre-NEC samples, and positive values are positively associated with pre-NEC samples. (D and E) Relative abundance of genomes enriched in important KEGG modules (D) and important secondary metabolite-enriched genomes (E) in pre-NEC versus control samples. P values shown from Wilcoxon rank sum test. (F) Distribution of genomes enriched in important KEGG modules (red stars) and important secondary metabolite clusters (blue stars) around a phylogenetic tree of all recovered bacterial genomes. Genomes enriched in important KEGG modules are more clustered on the tree than those enriched in important secondary metabolite clusters.
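
As a rough illustration of panels (A) and (B), the sketch below sums individual feature importances by category and keeps the features whose importance exceeds 1%. The feature names, categories, and importance values are hypothetical and are not taken from table S5.

```python
from collections import defaultdict

# Hypothetical (feature name, category, importance) rows.
features = [
    ("replication_rate", "replication", 0.30),
    ("kegg_module_A", "KEGG modules", 0.20),
    ("kegg_module_B", "KEGG modules", 0.005),
    ("metabolite_cluster_A", "secondary metabolites", 0.15),
    ("metabolite_cluster_B", "secondary metabolites", 0.008),
]

# Panel (A): total importance per feature category.
category_totals = defaultdict(float)
for _name, category, importance in features:
    category_totals[category] += importance

# Panel (B): individual features with importance over 1%.
important = [name for name, _category, importance in features if importance > 0.01]
```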

  • Fig. 5 Genomes encoding fimbriae are associated with NEC development.

    (A and B) Association of protein clusters with pre-NEC samples (A) and organisms of interest (B). Each dot represents a protein cluster. Recall is (A) the number of pre-NEC samples the cluster is in/the total number of pre-NEC samples and (B) the number of organisms of interest the cluster is in/the total number of organisms of interest. Precision is (A) the number of pre-NEC samples the cluster is in/the total number of samples the cluster is in and (B) the number of organisms of interest the cluster is in/the total number of genomes the cluster is in. Clusters annotated as fimbriae are marked with red stars. Contour lines are drawn to indicate density. (C) The number of bacterial genomes encoding each fimbriae cluster, the species-level phylogenetic profile of genomes encoding each fimbriae cluster, and each cluster’s association with NEC. (D) Phylogenetic tree of CU proteins built using IQ-TREE. Three amino acid sequences from each de novo CU cluster and three reference amino acid sequences from each defined CU clade were included in the tree. Colors mark the phylogenetic breadth spanned by reference sequences, and stars represent de novo CU clades. For all de novo clusters, the three randomly chosen sequences fell extremely close to each other on the tree.
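
The recall and precision axes in panels (A) and (B) can be sketched with set arithmetic. The sample identifiers are made up, and the function name `recall_precision` is hypothetical. The sketch assumes the standard definition of precision, in which the denominator is every sample (or genome) the cluster is found in, mirroring the definition given for panel (B).

```python
def recall_precision(cluster_members, targets):
    """Return (recall, precision) of a protein cluster against a target set.

    cluster_members: samples (or genomes) in which the cluster is found.
    targets: pre-NEC samples (panel A) or organisms of interest (panel B).
    """
    hits = len(cluster_members & targets)
    recall = hits / len(targets) if targets else 0.0
    precision = hits / len(cluster_members) if cluster_members else 0.0
    return recall, precision


# Hypothetical example: the cluster is found in three samples, two of them pre-NEC.
pre_nec_samples = {"s1", "s2", "s3", "s4"}
cluster_samples = {"s1", "s2", "s7"}

recall, precision = recall_precision(cluster_samples, pre_nec_samples)
```

A cluster in the upper right of either panel would be present in most targets (high recall) and rarely found elsewhere (high precision).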

  • Fig. 6 Biomarkers of NEC are most informative closer to NEC diagnosis.

    The effect size for difference of each feature in pre-NEC versus control samples is shown based on a Wilcoxon rank sum test over a 2-day sliding window (e.g., −5 compares samples collected from −6 to −4 days relative to NEC diagnosis to control samples). Comparisons with P < 0.05 are marked with asterisks.
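
The windowing described in this caption can be sketched as follows. A window centered on day d takes pre-NEC samples collected from d − 1 to d + 1 and compares them with all control samples. The caption does not say which effect size measure was used, so the rank-biserial correlation derived from the Mann-Whitney U statistic is an assumption here. The data values are invented.

```python
def window_values(samples, center, half_width=1):
    """Values of samples whose day relative to diagnosis is within the window."""
    return [v for day, v in samples if center - half_width <= day <= center + half_width]


def rank_biserial(x, y):
    """Rank-biserial effect size from the Mann-Whitney U statistic (ties count 0.5)."""
    if not x or not y:
        return None  # no comparison possible for an empty window
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return 2 * u / (len(x) * len(y)) - 1


# Hypothetical (day relative to NEC diagnosis, feature value) pairs.
pre_nec = [(-6, 1.9), (-5, 2.1), (-4, 1.8), (-2, 3.0)]
controls = [1.2, 1.5, 2.0]

# One effect size per window center, e.g. -5 covers days -6 to -4.
effects = {d: rank_biserial(window_values(pre_nec, d), controls) for d in range(-6, 0)}
```

A P value for each window could be obtained from a rank sum test implementation such as `scipy.stats.mannwhitneyu`, which is not used here to keep the sketch dependency-free.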

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/12/eaax5727/DC1

    Table S1. Metagenomic sequencing depth and read quality information.

    Table S2. Patient metadata.

    Table S3. Information about dereplicated secondary metabolite clusters, de novo–assembled genomes, and genome-wide importances of genomes.

    Table S4. Accuracy of ML algorithms and protein clustering algorithms and mapping-based abundances of bacterial taxa.

    Table S5. Full feature table provided to the ML classifier and importances of all features resulting from the ML classifier.

    Table S6. Proteins enriched in genomes of interest and identified fimbrial genes.

    Fig. S1. Metagenomic characterization of 1163 samples from 160 premature infants.

    Fig. S2. Fecal samples taken before NEC diagnosis have a higher abundance of plasmids from specific bacterial taxa.

    Fig. S3. PCA is unable to separate pre-NEC and control samples.

    Fig. S4. ML feature importance values reveal organismal associations with NEC.

    The PDF file includes:

    • Fig. S1. Metagenomic characterization of 1163 samples from 160 premature infants.
    • Fig. S2. Fecal samples taken before NEC diagnosis have a higher abundance of plasmids from specific bacterial taxa.
    • Fig. S3. PCA is unable to separate pre-NEC and control samples.
    • Fig. S4. ML feature importance values reveal organismal associations with NEC.
    • Legends for tables S1 to S6

    Other Supplementary Material for this manuscript includes the following:

    • Table S1 (.csv format). Metagenomic sequencing depth and read quality information.
    • Table S2 (Microsoft Excel format). Patient metadata.
    • Table S3 (Microsoft Excel format). Information about dereplicated secondary metabolite clusters, de novo–assembled genomes, and genome-wide importances of genomes.
    • Table S4 (Microsoft Excel format). Accuracy of ML algorithms and protein clustering algorithms and mapping-based abundances of bacterial taxa.
    • Table S5 (Microsoft Excel format). Full feature table provided to the ML classifier and importances of all features resulting from the ML classifier.
    • Table S6 (Microsoft Excel format). Proteins enriched in genomes of interest and identified fimbrial genes.
