Research Article | GENETICS

Genomic determinants of speciation and spread of the Mycobacterium tuberculosis complex

Science Advances  12 Jun 2019:
Vol. 5, no. 6, eaaw3307
DOI: 10.1126/sciadv.aaw3307
  • Fig. 1 No ongoing recombination within the MTBC.

    (A) Number of homoplasies (gray) as a function of the total number of variants detected (orange) in the MTBC dataset (n = 1591). (B) Linkage disequilibrium (LD) as a function of genetic distance detected in a representative sample of MTBC strains (n = 1591). (C) Site frequency spectrum of MTBC strains using the core variant positions. bp, base pair.

  • Fig. 2 Genome-wide variant profiles vary between MCAN, M. tuberculosis, and the MTBC ancestor.

    (A) Schematic view of the phylogenetic relationships between the MCAN groups and the MTBC. A maximum likelihood phylogeny of the MCAN group, including the MTBC ancestor, is shown in fig. S1. (B) Number of homoplasies (gray) as a function of the total number of variants detected (orange) in the MCAN dataset and in the branch leading to the most recent common ancestor (tMRCA) of the MTBC. Black dots indicate recombination events detected in that branch.

  • Fig. 3 Past recombination between MCAN strains and the MTBC ancestor.

    (A) Histogram distribution of the recombination fragment ages using the 5-ka (thousand year) scenario (54). A more detailed view can be found in fig. S3, with the confidence intervals plotted. (B) Gene Ontology terms overrepresented in the coding regions contained in the recombinant fragments. Adj., adjusted; BH, Benjamini-Hochberg.

  • Fig. 4 Divergent positions between the MTBC ancestor and the MCAN clade.

    (A) Average of divSNPs per 10-kb positions (green) as compared to the average of homoplastic variants (gray). The blue arrowheads above the distribution indicate genes that accumulate significantly more divSNPs. (B) Accumulation of divSNPs per gene, corrected by gene length. A small number of genes accumulate a high number of divSNPs, while most genes have few variants or none. This pattern resembles the patterns of high habitat overlap derived from overlapping habitat models (2).

  • Fig. 5 Genes with differential selective pressures across the MTBC speciation stages.

    (A) Genes changing selective pressure in the branch of the MTBC ancestor as compared to extant MTBC strains. Red lines mark genes that are outliers of the dN/dS variation distribution. (B) phoR and phoP show different selective pressure dynamics. In both cases, the accumulation of nonsynonymous (dN) or synonymous (dS) mutations through time is measured as the distance to the most recent common ancestor of the MTBC. The dN and dS values have been corrected by the number of branches in the phylogeny at each time point.

  • Fig. 6 phoR is under positive selection in human-affecting strains.

    (A) Genome-based phylogeny calculated from a total of 4595 clinical samples obtained from different sources. The synonymous and nonsynonymous variants found in phoR are mapped to the corresponding branch. Variants in internal branches affect complete clades, which are colored in the phylogeny. Homoplasies are marked in the outer circle of the phylogeny. The asterisk marks the G71I phoR variant common to lineages 5 and 6 previously reported by Gonzalo-Asensio et al. (23). (B) Relative age distribution of the nonsynonymous phoR variants in the reference dataset from Coll et al. (16) (left plot) and the transmission dataset from Guerra-Assunção et al. (24) (middle plot) in comparison with the rest of the nonsynonymous genomic variants. In addition, the relative ages of the phoR nonsynonymous variants exclusive to each dataset are shown (right plot). (C) Schematic view of PhoR with the amino acid (AA) changes found across the 4595-sample dataset marked on it. Amino acid changes are significantly more abundant in the sensor domain (P < 0.01). ATP, adenosine 5′-triphosphate; HAMP, histidine kinase, adenylyl cyclase, methyl-accepting protein, and phosphatase domain; DHp, dimerization and histidine phosphotransfer.

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/6/eaaw3307/DC1

    Supplementary Text

    Fig. S1. Maximum likelihood phylogeny of the MCAN group, including the most likely inferred ancestor of MTBC.

    Fig. S2. Phylogenetic incongruence test.

    Fig. S3. Recombination fragments ages derived from BEAST.

    Table S1. Variants identified as homoplastic and phylogenetically convergent.

    Table S2. Potential recombination fragments detected between the MTBC ancestor and MCAN.

    Table S3. Results of the phylogenetic comparison of genes having a significant accumulation of divSNPs.

    Table S4. Analysis of dN/dS variation between the MTBC ancestor and the MTBC.

    Table S5. Codons with strong evidence of being under positive selection as detected by FUBAR.

    Table S6. Variants found in the phoR gene.

    Table S7. Accession numbers and description of the MTBC strains analyzed.

    Table S8. Accession numbers of the mycobacterial genomes used to construct the reference phylogeny.

    References (61–63)

  • Supplementary Materials

    The PDF file includes:

    • Supplementary Text
    • Fig. S1. Maximum likelihood phylogeny of the MCAN group, including the most likely inferred ancestor of MTBC.
    • Fig. S2. Phylogenetic incongruence test.
    • Fig. S3. Recombination fragments ages derived from BEAST.
    • References (61–63)

    Other Supplementary Material for this manuscript includes the following:

    • Table S1 (Microsoft Excel format). Variants identified as homoplastic and phylogenetically convergent.
    • Table S2 (Microsoft Excel format). Potential recombination fragments detected between the MTBC ancestor and MCAN.
    • Table S3 (Microsoft Excel format). Results of the phylogenetic comparison of genes having a significant accumulation of divSNPs.
    • Table S4 (Microsoft Excel format). Analysis of dN/dS variation between the MTBC ancestor and the MTBC.
    • Table S5 (Microsoft Excel format). Codons with strong evidence of being under positive selection as detected by FUBAR.
    • Table S6 (Microsoft Excel format). Variants found in the phoR gene.
    • Table S7 (Microsoft Excel format). Accession numbers and description of the MTBC strains analyzed.
    • Table S8 (Microsoft Excel format). Accession numbers of the mycobacterial genomes used to construct the reference phylogeny.
