Research Article | MICROBIOLOGY

Specificity and mutagenesis bias of the mycobacterial alternative mismatch repair analyzed by mutation accumulation studies

Science Advances  12 Feb 2020:
Vol. 6, no. 7, eaay4453
DOI: 10.1126/sciadv.aay4453


The postreplicative mismatch repair (MMR) is an almost ubiquitous DNA repair essential for maintaining genome stability. It has been suggested that Mycobacteria have an alternative MMR in which NucS, an endonuclease with no structural homology to the canonical MMR proteins (MutS/MutL), is the key factor. Here, we analyze the spontaneous mutations accumulated in a neutral manner over thousands of generations by Mycobacterium smegmatis and its MMR-deficient derivative (ΔnucS). The base pair substitution rates per genome per generation are 0.004 and 0.165 for wild type and ΔnucS, respectively. By comparing the activity of different bacterial MMR pathways, we demonstrate that both MutS/L- and NucS-based systems display similar specificity and mutagenesis bias, revealing a functional evolutionary convergence. However, NucS is not able to repair indels in vivo. Our results provide an unparalleled view of how this mycobacterial system works in vivo to maintain genome stability and how it may affect Mycobacterium evolution.


The maintenance of genome stability and low mutation rates is ensured in cells by different DNA surveillance and correction processes. These include base selection, proofreading, mismatch repair (MMR), base and nucleotide excision repair, recombination repair, and nonhomologous end joining. The MMR system is a sophisticated DNA repair pathway that detects and removes replication-generated mismatched bases avoiding the acquisition of spontaneous mutations. It also recognizes mismatches in recombination intermediates, inhibiting recombination between nonhomologous DNA sequences (homeologous), and plays important roles in DNA damage responses (1, 2). The canonical MMR pathway includes members of the MutS and MutL protein families, which perform key steps in mismatch correction, and it is highly conserved among the three domains of life with some important exceptions, such as the phylum Actinobacteria and several groups of Archaea (3, 4). Loss of this pathway leads to high rates of mutation (hypermutability) and increased recombination between nonhomologous DNA sequences, disrupting the genetic interspecies barrier (5).

Mycobacteria are a biologically diverse group of bacteria, which includes relevant human pathogens, such as M. tuberculosis, M. leprae, and M. avium, and were defined as devoid of MMR (6). However, we have recently shown that they have a putative noncanonical MMR pathway, whose main factor is the DNA repair protein NucS/EndoMS. NucS has no structural homology with the key canonical MMR factors MutS and MutL (4, 7). According to in silico analysis, the mycobacterial NucS is a putative mismatch-specific endonuclease, required for mutation avoidance and anti-recombination in vivo, both hallmarks of the canonical MMR (4). Archaeal NucS is able to recognize and bind mismatched base pairs in the DNA in vitro, activating its endonuclease activity to promote the mismatch removal (8, 9). In addition, this protein corrects replication errors, acting cooperatively with the sliding clamp of the replisome in Corynebacterium glutamicum (8, 10). Computational studies support an archaeal origin for NucS, built upon two distinct domains that underwent a complex evolutionary history of transfers and/or losses (4). Therefore, distinct pathways for MMR seem to have evolved at least twice in nature (4).

Inactivation of NucS in different actinobacterial species leads to the emergence of hypermutator strains with increased spontaneous mutation rates (4, 8, 10). Some bacterial isolates with impaired MMR capacity have been detected in nature, suggesting that modulation of MMR activity can be adaptive (11–13). Similarly, previous studies suggested the existence of naturally occurring M. tuberculosis mutator strains defective in MMR activity, which could be particularly well suited for adaptation to antibiotic regimes (4).

However, despite its major importance, the in vivo specificity and mutagenesis bias of this putative alternative MMR remained unknown. Because mutations are the source of variation upon which natural selection acts, analysis of mutations that accumulate in a neutral fashion is essential to understand how bacterial pathogens, such as those belonging to the Mycobacterium genus, evolve virulence and antibiotic resistance. Therefore, in this work, we have performed mutation accumulation (MA) experiments to understand how this putative noncanonical MMR system of mycobacteria works in vivo using both a wild-type strain of M. smegmatis and its noncanonical MMR-deficient isogenic derivative. The results provide a picture of the in vivo noncanonical MMR genome maintenance activity in mycobacteria and the dynamic of mutation across the whole genome over time. A set of reporter plasmids containing specific mutations was also used to reveal the activity of NucS on the correction of each type of mutation detected in the MA evolution. Together, these results allow us to establish definitively the biological role of the mismatch-specific nuclease NucS/EndoMS as guardian of the genome stability and its functional similarity to the canonical MMR.


MA experiments

In a recent study, we showed that a M. smegmatis nucS-null strain, impaired in the noncanonical MMR pathway, displayed a hypermutable phenotype, with increased rate of spontaneously emerging rifampicin and streptomycin resistances above the wild-type strain (150- and 86-fold, respectively) (4). These results were equivalent to those observed for an MMR-deficient Escherichia coli mutant (14, 15). It is now essential to understand and decipher how mutations emerge across the whole genome in a mycobacterial strain lacking this noncanonical MMR and whether this pathway parallels the activity of the canonical MMR.

Analysis of the mutational profile of M. smegmatis mc2 155 wild type and its ΔnucS derivative was carried out by MA experiments and whole-genome sequencing (see Materials and Methods). MA experiments were carried out for 50 weeks with 11 independent lineages for each parental strain, which accounted for 1372 and 1333 generations per line for wild type and ΔnucS, respectively (or a total of 15,095 and 14,662 generations, respectively) (Table 1). Every 7 days, a single colony from each line was restreaked onto a fresh plate, minimizing the effective population size. This type of single-colony MA experiments with a very stringent bottleneck has been demonstrated to minimize selection forces, allowing mutations to accumulate in a neutral unbiased manner (15, 16).

Table 1 Total numbers and rates of mutation identified in MA experiments.

Mutation rates

Sequence analysis of the evolved strains revealed the total number of mutations identified in the whole genome in each strain (Table 1 and Fig. 1A), including base pair substitutions (BPSs) plus insertions/deletions (indels) (Table 2 and Fig. 1, B and C). A total number of 80 mutations were detected (the median value number of mutations per genome is 7) in the wild-type M. smegmatis MA lines. The ΔnucS lines accumulated a total of 2444 mutations (the median value of mutations per genome is 215). The average of sequence coverage per line corresponds to >99.9% of the M. smegmatis reference genome (GenBank NC_018289.1).

Fig. 1 Whole-genome mutations of the evolved strains.

Total number of mutations (A), BPSs (B), and indels (C) of the evolved wild-type M. smegmatis mc2 155 (blue circles) and ∆nucS (red circles) strains after performing the MA experiment. The horizontal straight segment represents the median value of mutations in each case.

Table 2 Rates of mutation specified by BPSs and indels.

Considering the number of generations along the evolution experiment, global mutation rates can be measured in the whole genome for each division event. Our results indicated that the wild-type mc2 155 strain maintained a low mutation rate, 0.005 mutations per genome per generation (7.58 × 10⁻¹⁰ mutations per nucleotide per generation), while the ΔnucS strain reached 0.167 mutations per genome per generation (238.53 × 10⁻¹⁰ mutations per nucleotide per generation). Therefore, the absence of NucS/EndoMS led to a 31-fold increase in the total mutation rate across the whole genome (Table 1).
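The arithmetic behind these rates can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the mutation counts and generation totals are those reported in the text, while the genome size of about 6.99 Mb is an assumption introduced here, so the per-nucleotide values only approximate the published ones.

```python
# Mutation counts and total generations come from the text (Table 1).
# GENOME_SIZE_BP is an assumption: the approximate length of the
# M. smegmatis mc2 155 chromosome, not a value stated in the article.
GENOME_SIZE_BP = 6.99e6


def mutation_rates(n_mutations, total_generations, genome_size_bp=GENOME_SIZE_BP):
    """Return (per-genome, per-nucleotide) mutation rates per generation."""
    per_genome = n_mutations / total_generations
    per_nucleotide = per_genome / genome_size_bp
    return per_genome, per_nucleotide


wt_genome, wt_nt = mutation_rates(80, 15_095)       # wild-type MA lines
dn_genome, dn_nt = mutation_rates(2444, 14_662)     # nucS-deleted MA lines
fold_increase = dn_nt / wt_nt

print(f"wild type: {wt_genome:.3f} per genome, {wt_nt:.2e} per nucleotide")
print(f"dnucS:     {dn_genome:.3f} per genome, {dn_nt:.2e} per nucleotide")
print(f"fold increase: {fold_increase:.1f}")
```

Because the fold increase is a ratio of two rates over the same genome, it does not depend on the assumed genome size.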

To evaluate the effect of selective pressure on the mutational profile generated in our MA experiments, we studied the proportion of mutations in coding versus noncoding DNA, assuming that the intergenic regions are exposed to weaker selective pressure. When mutations accumulate in a neutral manner, the proportion of mutations in coding versus noncoding DNA fits the proportion between the sizes of coding (genes) versus noncoding sequences (intergenic regions). Coding regions cover 90% of the M. smegmatis genome (16). Our MA experiments revealed that 86.9% (wild type) and 92.6% (ΔnucS) of all BPSs are located in the coding regions. Neither value is significantly different from the expected 90% (χ² = 16.67, P = 0.95 for the wild type and χ² = 0.97, P = 0.95 for ΔnucS) (critical χ² = 18.307, DF = 10) (table S1). Moreover, in the absence of selection, the ratio of nonsynonymous to synonymous mutations should not be significantly different from the random expectation, considering that synonymous mutations are relatively neutral. Our MA data show ratios of 2.78 (39/14) and 2.97 (1681/565) for wild-type and ΔnucS lines, respectively (table S1). These values are very close to the theoretical value previously defined to be 2.60 for M. smegmatis (16), indicating no bias against nonsynonymous mutations. Therefore, we conclude that selection does not significantly affect the distribution of mutations in our MA experiments.
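The two neutrality indicators used above reduce to simple proportions. The sketch below recomputes them from the counts given in the text; the dictionary layout and function name are illustrative choices, and the chi-square test itself is not reproduced because the per-line counts it relies on are in table S1 rather than in the text.

```python
# BPS counts from the text; coding BPSs = nonsynonymous + synonymous.
ma_lines = {
    "wild type": {"nonsyn": 39, "syn": 14, "total_bps": 61},
    "dnucS": {"nonsyn": 1681, "syn": 565, "total_bps": 2426},
}
EXPECTED_CODING_FRACTION = 0.90    # coding share of the genome (from the text)
EXPECTED_NONSYN_SYN_RATIO = 2.60   # theoretical ratio cited in the text


def neutrality_summary(counts):
    """Return the coding fraction and nonsynonymous/synonymous ratio."""
    coding = counts["nonsyn"] + counts["syn"]
    return {
        "coding_fraction": coding / counts["total_bps"],
        "nonsyn_syn_ratio": counts["nonsyn"] / counts["syn"],
    }


summary = {name: neutrality_summary(c) for name, c in ma_lines.items()}
for name, s in summary.items():
    print(
        f"{name}: coding {s['coding_fraction']:.1%} "
        f"(expected {EXPECTED_CODING_FRACTION:.0%}), "
        f"nonsyn/syn {s['nonsyn_syn_ratio']:.2f} "
        f"(expected {EXPECTED_NONSYN_SYN_RATIO:.2f})"
    )
```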

BPSs and indels

Across the sequenced wild-type M. smegmatis MA lines, a total of 61 BPSs were identified (Fig. 1B and Table 2). The overall BPS rate per nucleotide per generation was 5.78 × 10⁻¹⁰, equivalent to 0.004 per genome per generation. The MA analysis also revealed 19 short insertions and deletions of 1 to 30 base pairs (bp) in length (most of them ±1 bp) (Fig. 1C and Table 2). The indel rate per site per generation was 1.80 × 10⁻¹⁰, and the insertion/deletion ratio was 2.17 (13 insertions and 6 deletions) (Table 2). The rates of total mutations, BPSs, and indels in our MA evolution experiments are similar to the ones reported for M. smegmatis wild-type strain in previous studies (16, 17).

ΔnucS lines accumulated a total of 2426 BPSs (Fig. 1B and Table 2), with an overall mutation rate of 2.36 × 10⁻⁸ per nucleotide per generation or 0.17 per genome per generation. Therefore, loss of nucS led to a ~41-fold increase in BPS rate with respect to the wild-type lines. The number of short insertions and deletions of 1 to 66 bp (most of them ±1 bp), a total of 18 (Fig. 1C and Table 2), was almost identical to the wild-type lines, with an indel rate for the ΔnucS strain of 1.76 × 10⁻¹⁰ per site per generation and an insertion/deletion ratio of 3.5 (14 insertions versus 4 deletions) (Table 2).

The functional context of each BPS (table S1) was identified using the annotated M. smegmatis mc2 155 genome [National Center for Biotechnology Information (NCBI) accession: NC_018289.1]. Although values of BPSs in coding regions are close to the expected results for both wild-type and ΔnucS lines (see above), the coding/noncoding ratios were statistically different between wild-type and ΔnucS lines [6.62 (53/8) and 12.5 (2246/180), respectively; Fisher’s exact test, P = 0.04]. Therefore, these results suggest that the noncanonical MMR preferentially prevents errors in the coding DNA. This fact has also been described for the canonical MMR in E. coli (15).

Mutational spectrum and bias

The mutational spectrum of M. smegmatis wild type revealed by MA experiments indicates a high proportion of transitions, comprising 62.3% (38 transitions) of all BPSs (Table 2). The most common type of substitution is the G:C>A:T mutation, representing 45.9% of total changes (28 of 61 total) and 73.7% of all transitions (28 of 38) (Table 2). However, the proportion of transversions reached more than one-third of all the identified BPSs, with 37.7% (23 transversions). Both the spectrum and the transition/transversion ratio of 1.65 are very similar to those previously described for wild-type M. smegmatis (16).

The absence of the noncanonical MMR leads to a substantial change in the spectrum of BPSs. Loss of the noncanonical MMR considerably increased the number of both types of transitions. ΔnucS lines accumulated 2388 transitions (98.4%) and 38 transversions (1.6%), with a transition/transversion ratio of 62.8, 38-fold higher than that of wild-type lines. Moreover, if we focus only on transitions, the preferred NucS target, the mutation rate per genome per generation increased ~65-fold in the ΔnucS strain (162.86 versus 2.52; Tables 2 and 3). This indicated that this repair system preferentially recognizes and repairs transition errors. In the ∆nucS strain, A:T>G:C and G:C>A:T events rose markedly when compared to those detected in the wild type (1429 versus 10 A:T>G:C and 959 versus 28 G:C>A:T, respectively). However, only minor or no changes can be seen in the four types of transversions when wild-type and ∆nucS strains were compared (Tables 2 and 3).
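The transition/transversion bookkeeping above can be illustrated with a short sketch. The classification rule is the standard definition (purine to purine or pyrimidine to pyrimidine is a transition); the function names are illustrative and the counts are those quoted in the text.

```python
PURINES = {"A", "G"}


def is_transition(ref, alt):
    """True for A<->G or C<->T changes; any other substitution is a transversion."""
    return ref != alt and ((ref in PURINES) == (alt in PURINES))


def ts_tv_ratio(transitions, transversions):
    """Transition/transversion ratio from raw counts."""
    return transitions / transversions


wt_ratio = ts_tv_ratio(38, 23)       # wild-type lines
dn_ratio = ts_tv_ratio(2388, 38)     # nucS-deleted lines
ratio_shift = dn_ratio / wt_ratio    # how much the ratio changes without NucS

print(f"wild type Ts/Tv: {wt_ratio:.2f}")
print(f"dnucS Ts/Tv:     {dn_ratio:.1f}")
print(f"shift:           {ratio_shift:.0f}-fold")
```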

Table 3 Mutation rates in wild-type and MMR-deficient strains from bacterial species as obtained from MA datasets.

A shift in the BPS bias occurred in the absence of the noncanonical MMR. While the wild-type strain presented a mutational bias for G:C>A:T, it changed to A:T>G:C in the ∆nucS strain. In the NucS-deficient strain, A:T>G:C BPSs represented 58.9% of all changes (1429 of 2426) and 59.8% of the observed transitions (1429 of 2388). This suggests that one of the activities of the noncanonical MMR is to limit the increase in genomic G:C content (see Discussion).

DNA strand bias

The M. smegmatis genome is divided into two replicores (right and left) that replicate in opposite directions, starting at the origin of replication and ending at the terminus in the Ter region, which is located at dif sites (18). The leading and lagging strands are switched in the two replicores relative to the conventional 5′-3′ numbering system. Assuming that most mutations are the result of replication errors, the A:T>G:C base substitutions should result from C mispaired with a template A or G mispaired with a template T.

Table S2 shows that the most prominent BPSs in ∆nucS, A:T>G:C transitions, were almost twice as likely to occur with A rather than with T (444 As versus 235 Ts) templating the lagging strand (right replicore). However, A:T>G:C transitions were almost twice as likely to occur with T rather than with A (494 Ts versus 256 As) templating the leading strand (left replicore). Regarding G:C>A:T transitions, they were almost twice as likely to occur with C rather than with G (289 Cs versus 171 Gs) templating the lagging strand; the opposite happened in the leading strand (309 Gs versus 190 Cs). Therefore, NucS seems to balance the asymmetry of mutations detected in each DNA strand, as seen in canonical MMR (15).
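The "almost twice as likely" statements can be checked directly from the counts quoted above. The following sketch simply tabulates those counts (the strand labels follow the wording of the text) and reports the ratio of the favored to the disfavored template base; the data layout is an illustrative choice.

```python
# Template-base counts for dnucS transitions, as quoted in the text (table S2).
strand_counts = {
    ("A:T>G:C", "lagging"): {"A": 444, "T": 235},
    ("A:T>G:C", "leading"): {"T": 494, "A": 256},
    ("G:C>A:T", "lagging"): {"C": 289, "G": 171},
    ("G:C>A:T", "leading"): {"G": 309, "C": 190},
}


def template_bias(counts):
    """Return (favored base, favored/disfavored count ratio)."""
    favored = max(counts, key=counts.get)
    disfavored = min(counts, key=counts.get)
    return favored, counts[favored] / counts[disfavored]


biases = {key: template_bias(c) for key, c in strand_counts.items()}
for (bps, strand), (base, ratio) in biases.items():
    print(f"{bps} on {strand} strand: template {base} favored {ratio:.2f}-fold")

# Totals per transition type, for comparison with the genome-wide counts.
totals = {}
for (bps, _), c in strand_counts.items():
    totals[bps] = totals.get(bps, 0) + sum(c.values())
print(totals)
```

The per-type totals recover the 1429 A:T>G:C and 959 G:C>A:T transitions reported for the ∆nucS lines.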

Analysis of the mutation rate of BPSs and indels with integrative reporter plasmids

Sequence analysis of the evolved strains in MA experiments indicated changes in the profile of mutations between the wild-type and ΔnucS strains. The ability of NucS to recognize and repair specific errors in vivo can be studied in detail through the design of genetic reporters containing each type of point mutation. These reporters allow us to measure the mutation rate of BPSs and indels selectively to establish the repair specificity of NucS inside the cells.

A set of integrative reporter plasmids with a modified aph gene marker (kanamycin phosphotransferase gene) was generated by site-directed mutagenesis to measure the rate of each specific mutation. While the intact aph marker conferred kanamycin resistance (KanR), the mutated aph versions were all kanamycin susceptible (KanS) (see Materials and Methods and the Supplementary Materials). Six reporter plasmids (named pMV-aph numbers 1 to 6) containing all six types of BPSs were constructed by introducing specific point mutations at the CAT codon (encoding the key residue histidine His132) in pMV361. Two plasmids, numbers 1 and 2, contained two different transitions, while the other four (numbers 3 to 6) have all four possible transversions. A scheme of the mutated plasmids with inactive aph and the resulting sequences with the amino acid changes are shown in Fig. 2.

Fig. 2 Assembly of reporter plasmids.

Integrative plasmid constructs used to measure KanR reversion events are shown. Modified bases are shown in green, and indels are shown in orange.

By kanamycin resistance reversion assays, we measured the rates of each base substitution mutation in the wild-type and nucS-defective strains (see Materials and Methods). All base substitutions occurred at a very low rate in the wild-type strain, most of them around 10⁻⁹ to 10⁻¹⁰ mutations per cell per generation, although, as expected, both transitions (A:T>G:C and G:C>A:T) were the most frequent mutations, while some others were especially rare or undetectable (e.g., A:T>T:A transversion) (Fig. 3 and table S3).

Fig. 3 Mutation rates for KanR reversion events.

Rates of spontaneous mutations per cell per generation conferring kanamycin resistance for M. smegmatis mc2 155 (wild type) (blue) and its ΔnucS derivative (red) are shown. Ninety-five percent confidence intervals are shown. The limit of detection was around 10⁻¹⁰, as inferred by observed mutation rates of A:T>T:A transversion.

Elimination of the nuclease NucS led to a significant increase in both types of transitions (16.94- and 12.28-fold for A:T>G:C and G:C>A:T, respectively) when compared to the wild-type mutation rates (Fig. 3 and table S3). A much lesser effect of NucS was detected in the mutation rates of transversions. All transversions showed no increase or a slight increase in mutation rates (less than two to three times) in the ∆nucS strain, keeping similar rates to the ones detected in a wild-type strain (Fig. 3). These results indicate that NucS is able to target and repair specifically the transitions that emerge in the genome, displaying a much lesser role on the repair of transversions. As a result, a high number of these two base substitutions arise and accumulate in a nucS-deficient strain. NucS removes a high proportion of transitions, according to the Kan-dependent BPS reporters (table S3).

In addition to the BPS reporters, a set of frameshift mutation reporters at the aph gene was designed to analyze the activity of NucS on different insertions/deletions. Three plasmid reporters were created with +1-, +2-, and +3-bp frameshifts in the CAT codon (His132). With these reporters, we were unable to detect any revertant, in agreement with previous reports that revealed that indels mainly arise at hotspots with runs of the same nucleotide or repeated sequences (19, 20).

We identified a potential frameshift hotspot in a run of 4 Gs in the aph gene (position 624–627). Two reporter plasmids (pMV-aph numbers 7 and 8) were made to analyze indels through the insertion or deletion of a G base, respectively, in this stretch of 4 Gs to generate a frameshift in the aph gene (Fig. 2). Likewise, only deletion or insertion of the proper base pair could reconstruct a functional gene in these frameshift-carrying reporters. When integrated in the M. smegmatis chromosome, each indel construction allowed us to measure the rate of a single base pair deletion or insertion (see Materials and Methods). Our data showed that both strains, wild type and ∆nucS, exhibited similar rates of 1-bp insertion/deletion (Fig. 3 and table S3). These results indicated that the noncanonical MMR pathway is unable to repair single base pair deletion or insertion in M. smegmatis.

Comparison of NucS versus MutSL repair activities

We have performed a comparative analysis of the in vivo repair capacities of eight different bacterial MMRs (two harboring NucS and six with MutSL), including M. smegmatis, as calculated from MA experiments (Table 3 and table S4). The higher activity on transitions than transversions is a general feature shared by canonical and noncanonical MMR systems, being especially relevant in M. smegmatis. M. smegmatis NucS has very poor activity on all types of transversions, particularly on A:T>C:G (only 2.87% are repaired) and G:C>T:A (35.25% are repaired), leading to a weak 1.7-fold increase when NucS was eliminated (Table 3 and table S4). C. glutamicum, another NucS-harboring bacterium, also exhibited very poor activity on all types of transversions (8, 21). In contrast, most MutSL-harboring species retained some activity, in most cases higher than 70% on transversions, with increases of 7 to 12 times when MutSL activity was eliminated. The case of Deinococcus radiodurans seems to be special, as its ∆mutS derivative showed an extremely low increase in mutation rate (only fourfold), suggesting that the low MMR activity may be compensated by other highly efficient alternative DNA repair pathways (21). In summary, M. smegmatis MMR function is similar to that of other bacterial MutSL-based MMR on transitions, although it displays clear unique features such as its inability to correct small indels and its exceptionally low activity on transversions.
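The text does not state how the "percent repaired" values were derived. One common convention, assumed here purely for illustration, treats the MMR-deficient rate as the number of errors produced and the wild-type rate as the errors that escape repair, so the fraction repaired is 1 minus their ratio. Under that assumption, percentages and fold increases convert into each other as sketched below; the function names are hypothetical.

```python
def fraction_repaired(rate_proficient, rate_deficient):
    """Assumed convention: share of errors removed by MMR = 1 - wt/deficient."""
    return 1.0 - rate_proficient / rate_deficient


def fold_increase_from_fraction(fraction):
    """Fold increase in mutation rate implied by a given repaired fraction."""
    return 1.0 / (1.0 - fraction)


# Transition rates per genome per generation quoted in the text
# (162.86 versus 2.52, in the units of Tables 2 and 3).
transitions_repaired = fraction_repaired(2.52, 162.86)
print(f"transitions repaired: {transitions_repaired:.1%}")

# Repaired percentages quoted for two transversion classes.
for label, pct in (("A:T>C:G", 2.87), ("G:C>T:A", 35.25)):
    fold = fold_increase_from_fraction(pct / 100)
    print(f"{label}: {pct}% repaired implies a {fold:.2f}-fold increase")
```

Read this way, a repaired fraction above 70% corresponds to more than a 3.3-fold increase, which helps put the 7- to 12-fold values reported for MutSL-harboring species in context.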


Mutation is a major evolutionary force able to generate genetic variation, and it is the substrate for the complex process of evolution. However, most organisms keep very low mutation rates, probably as a result of selective pressure against the cost of deleterious mutations, reaching a finely tuned balance (22–24). Our MA experimental data estimate a spontaneous mutation rate of 0.005 mutations per genome per generation for the wild-type M. smegmatis strain (7.58 × 10⁻¹⁰ per nucleotide per generation), similar to previous reports in mycobacteria (16) and other free-living bacteria (15, 21, 25–27). The mutation rate detected here is close to the value of ~0.003 mutations per genome per generation proposed for all microbes by Drake (24). Our results thus indicate that a fine-tuned mutation rate could be an evolved trait also in mycobacteria, like in other prokaryotes (24).

Several DNA repair mechanisms remove mutations to reduce the mutation rate, especially highlighting the essential role of the MutS-MutL–based MMR pathway. Mycobacteria, including M. smegmatis, lack canonical MMR enzymes (MutS and MutL) (4, 6). Nevertheless, we and others have recently suggested that they have a noncanonical MMR pathway, whose key factor is the DNA repair protein NucS, with a different evolutionary origin and no structural homology to the canonical MMR enzymes MutS and MutL (4). In our MA experiment, the nucS-deficient strain accumulated a large number of mutations, displaying a mutation rate of ~0.16 per genome per generation or 2.38 × 10⁻⁸ per site per generation (>30-fold higher than that of the wild-type strain).

NucS mainly targets BPSs, but according to our results, it is unable to recognize and repair indels. Along the MA evolution, BPS rate reached 0.16 mutations per genome per generation in the ∆nucS strain, more than 40-fold higher than the one detected in the wild-type strain. The rate of indels was similar in both strains. Moreover, indel reporter plasmids confirmed that +1-bp insertion or −1-bp deletion occurs at very low rates with similar levels in both strains, reflecting the lack of repair activity of NucS on indels. This seems to be a unique characteristic of the mycobacterial MMR system, as most bacterial canonical MMRs are able to efficiently repair short indels (21).

When the type of BPSs accumulated along MA experiment was studied, a clear mutational bias for transitions was revealed in both wild-type and ∆nucS strains (A:T>G:C and G:C>A:T). The loss of NucS led to a notable ~65-fold increase in the transition rate, while the transversion rate remained almost unchanged (increasing less than twofold). The transition/transversion ratio changed from 1.65 in the wild-type to 62.8 in the nucS-deficient strain (38-fold higher). While NucS corrects transitions very efficiently, its activity on transversions is extremely poor. In this sense, canonical MMR-bearing bacteria tend to have lower activity on transversions than on transitions, as seen in previous MA experiments (Table 3 and table S4). However, a remarkable feature of the M. smegmatis NucS system, shared by NucS-bearing bacteria (see data from C. glutamicum in Table 3), is the extremely low activity on transversions and also the lack of activity on short indels.

The set of six BPSs reporter plasmids revealed which type of point mutations can be repaired by NucS. The study of the mutation rate of each specific mutation confirmed that both transitions (A:T>G:C and G:C>A:T) are the preferred mutations corrected by NucS, strongly increasing their rate when this repair protein is removed. No or weak increase was detected for all the four types of transversions in the nucS-deficient strain.

When the mutational spectrum was analyzed, a change in the mutational bias was detected along the MA evolution. While the wild-type strain accumulated G:C>A:T transitions, the mutational profile of the ∆nucS variant was dominated by A:T>G:C transitions. This suggests that, although NucS could repair very efficiently both types of mismatches, A:T>G:C is the predominant substrate corrected by the noncanonical MMR, as it becomes the most abundant mutation when NucS is absent. The mutational spectrum in the absence of the noncanonical MMR suggests that this system prevents further increases in the already G:C-rich mycobacterial genome, in agreement with previous reports that proposed a bias that drives genomes toward greater A:T content (15, 28–35). Together, our data strongly support the role of NucS as a repair protein able to remove mutations and protect the integrity of the genome, confirming that the NucS-based noncanonical MMR is a specific DNA repair mechanism that corrects transitions in the mycobacterial genome.

A model for DNA repair mechanisms that cooperate to avoid and remove mutations can be proposed in mycobacteria. The mycobacterial DNA polymerase DnaE1 has a very accurate and faithful activity during replication, with a very low error rate (17). DNA polymerase DnaE1 contains a PHP domain with 3′-5′-exonuclease activity, able to correct very efficiently both base pair mutations and indels during DNA replication. The absence of PHP exonuclease activity led to 7 to 11 mutations per genome per generation, a 2300- to 3700-fold increase over the wild type in a M. smegmatis mutation accumulation assay (17). While short indels can be almost suppressed by the DNA polymerase exonuclease activity, base pair mutations require an additional correction mechanism: the NucS-based noncanonical MMR. NucS is able to remove the remaining base pair mutations (mainly transitions) to reach wild-type basal mutation rates. Moreover, it has been shown that mismatches inhibit DNA polymerase activity in mycobacteria (17, 36). Hence, NucS activity could also be required for the replisome to facilitate DNA polymerization. Further studies are needed to shed light on the interplay between the DNA polymerase and the mismatch-specific nuclease NucS at the molecular level.

Our results provide an exceptional view of how the noncanonical MMR pathway works in vivo to maintain the low mutation rate and GC content of mycobacterial genome. Notably, despite the absence of structural homology and the different evolutionary origin (4), this mycobacterial noncanonical MMR system displays function and bias similar to the canonical MMR, in terms of BPSs (transitions), in other organisms. This reveals a clear example of functional evolutionary convergence of two nonrelated MMR pathways toward similar repair functions and highlights the importance of MMR for the stability of genomes. However, the similarities and differences (such as the inability to correct small indels and its very poor activity on transversions) between NucS and canonical MMR systems reveal that this alternative DNA repair pathway has its own identity and unique properties.

Last, hypermutable bacterial pathogens, very often associated with defects in MMR components, are frequently isolated and pose a serious risk in many clinical infections (37, 11). In addition, the existence of hypermutable M. tuberculosis strains deficient in NucS activity has been suggested (4). Because M. tuberculosis acquires antibiotic resistance exclusively by mutations, the knowledge of mechanisms that influence the generation of mutations may allow the development of new strategies to predict and combat antibiotic resistance in this deadly pathogen.


Bacterial strains and media

The M. smegmatis wild-type reference strain mc2 155 (American Type Culture Collection, 700084) and a noncanonical MMR strain (∆nucS) were used in this study. The noncanonical MMR strain is a derivative of M. smegmatis mc2 155 with an in-frame deletion of the target gene nucS (MSMEG_4923) (4). M. smegmatis strains were grown in Middlebrook 7H9 or 7H10 with 0.5% glycerol, 0.05% Tween 80, and 10% OADC (Oleic Albumin Dextrose Catalase) enrichment.

MA experiments

M. smegmatis MA independent lines were evolved in parallel starting from the parental strains, wild-type mc2 155, and its noncanonical MMR-deficient (∆nucS) derivative. Each strain generated a total number of 11 MA lines able to evolve independently for 350 days (50 weeks). Evolution of the MA lines allowed the accumulation of spontaneous mutations at random in their bacterial genomes. The evolution of each line was performed on Middlebrook 7H10 solid agar to generate isolated colonies. Every week, a well-isolated single colony (the farthest from the inoculum site) from each MA line was transferred by streaking to a new plate. Each line passed through a total number of ~1300 to 1400 cell divisions (with an average number of generations per passage estimated to be ~26 to 27 cell divisions). The number of generations per passage was estimated by measuring the number of viable cells in wild-type and ∆nucS colonies of different sizes, assuming that each colony was generated by a single cell. Colonies were excised from the agar plates, resuspended in saline solution (0.9% NaCl supplemented with 0.05% Tween 80), and plated on 7H10 petri dishes by serial dilutions. In this way, by measuring the diameter of colonies from MA lines, we could determine the number of cells per colony. The number of generations (n) was then calculated by n = log2(N), with N being the number of cells per colony. MA lines were incubated at 37°C under aerobic conditions. When all the evolved MA lineages reached 50 weeks of growth in solid medium without selective pressure, all lineages were saved and stored at −80°C. The procedure used for this experiment ensures that each line had a single-cell bottleneck in every passage to fresh medium. Mutations accumulated in an effectively neutral fashion, minimizing selective pressure and bias.
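The generation estimate described above follows from n = log2(N) for a colony founded by a single cell. The sketch below applies that formula; the colony size of 10^8 cells is a hypothetical value chosen only to illustrate the arithmetic and is not a measurement from this study.

```python
import math

PASSAGES = 50  # weekly transfers over the 50-week experiment (from the text)


def generations_per_colony(cells_per_colony):
    """n = log2(N), assuming the colony grew from a single founder cell."""
    return math.log2(cells_per_colony)


def total_generations(cells_per_colony, passages=PASSAGES):
    """Generations accumulated by one MA line over all passages."""
    return generations_per_colony(cells_per_colony) * passages


# Hypothetical colony size, used only as a worked example.
example_cells = 1e8
per_passage = generations_per_colony(example_cells)
per_line = total_generations(example_cells)

print(f"generations per passage: {per_passage:.1f}")
print(f"generations per line:    {per_line:.0f}")
```

With this illustrative colony size, the result falls within the ~26 to 27 generations per passage and ~1300 to 1400 divisions per line reported in the text.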

Genomic DNA preparation and whole-genome resequencing analysis

The parental strains and the evolved lines (11 wild-type–derived strains and 11 ΔnucS-derived strains) were analyzed by whole-genome sequencing. DNA was extracted using the standard protocol for a high-quality preparation of mycobacterial genomic DNA (38). DNA concentration and purity were measured using a Qubit 3.0 fluorometer (Life Technologies) and a NanoDrop-2000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA). Whole-genome sequencing libraries were constructed with the Nextera XT DNA Library Preparation Kit (Illumina, San Diego, CA) following the manufacturer’s instructions. Sequencing was performed on the Illumina MiSeq instrument using a MiSeq V3 sequencing kit to obtain 300-bp paired-end reads. The Illumina reads were filtered with the fastp tool to remove low-quality bases. The filtered reads were then aligned to the M. smegmatis reference genome (GenBank NC_018289.1) with the short-read alignment tool BWA. Potential duplicates were removed with Picard tools. Variant calling was performed with VarScan. A single-nucleotide polymorphism (SNP) was called if (i) at least 20 reads supported the genomic position, (ii) it was found at a frequency of 0.9 or higher, and (iii) it was not found near an indel region (10 bp) or in regions of high SNP density (>3 SNPs in a 10-bp window). Indels were called and filtered (Fisher Strand > 200, Strand Odds Ratio > 3, RMSMappingQuality < 40, ReadPosRankSum < −20, and Depth Coverage < 20) with Genome Analysis Toolkit (GATK). Single variants and indels were annotated using SnpEFF. The average sequence coverage per line corresponded to >99.9% of the M. smegmatis reference genome. Mutations were identified as changes in the DNA sequence between each evolved strain and its parental strain. BPSs were annotated with the type of BPS and the position in the reference genome and were classified as affecting coding DNA (synonymous or nonsynonymous BPSs) or intergenic noncoding DNA.
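The SNP-calling criteria above can be read as a simple filter. The following Python sketch is illustrative only and is not the pipeline used in the study. The record layout (`pos`, `depth`, `freq`) and the function name `filter_snps` are hypothetical. Two interpretations are also assumptions: "near an indel" is taken to mean within 10 bp on either side, and SNP density is counted over candidates that already pass the depth and frequency thresholds.

```python
from bisect import bisect_left, bisect_right

MIN_DEPTH = 20            # (i) at least 20 supporting reads
MIN_FREQ = 0.9            # (ii) variant frequency of 0.9 or higher
INDEL_FLANK = 10          # (iii) assumed: within 10 bp on either side of an indel
WINDOW = 10               # (iii) 10-bp window for SNP density
MAX_SNPS_IN_WINDOW = 3    # more than 3 SNPs in a window counts as high density


def filter_snps(candidates, indel_positions):
    """Return the candidate SNPs that satisfy criteria (i) to (iii).

    candidates: list of dicts with hypothetical keys 'pos', 'depth', 'freq'.
    indel_positions: iterable of reference positions where indels were called.
    """
    indels = sorted(indel_positions)
    # Criteria (i) and (ii): read depth and variant frequency.
    kept = [c for c in candidates
            if c["depth"] >= MIN_DEPTH and c["freq"] >= MIN_FREQ]
    positions = sorted(c["pos"] for c in kept)

    def near_indel(pos):
        # First indel at or after pos - flank; near if it is also <= pos + flank.
        i = bisect_left(indels, pos - INDEL_FLANK)
        return i < len(indels) and indels[i] <= pos + INDEL_FLANK

    def in_dense_window(pos):
        # Try every 10-bp window that contains pos and count the SNPs inside it.
        for start in range(pos - WINDOW + 1, pos + 1):
            n = (bisect_right(positions, start + WINDOW - 1)
                 - bisect_left(positions, start))
            if n > MAX_SNPS_IN_WINDOW:
                return True
        return False

    return [c for c in kept
            if not near_indel(c["pos"]) and not in_dense_window(c["pos"])]
```

Counting density before or after the depth and frequency filters can change which clustered SNPs are discarded; the source does not specify the order, so the choice here is arbitrary.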

Reporter plasmid design and construction

A set of six BPS reporter plasmids was constructed by introducing base pair changes: two transitions (A:T>G:C and G:C>A:T) and four transversions (G:C>T:A, G:C>C:G, A:T>C:G, and A:T>T:A). All point mutations were introduced at positions 394 to 396 of the aph gene, which confers kanamycin resistance (KanR), in the pMV361 integrative plasmid (39), so as to abolish the KanR phenotype. In addition, two frameshift reporter plasmids were made by the insertion or deletion of 1 bp (a G nucleotide) at positions 624 to 627, within a GGGG stretch of the aph gene. The kanamycin susceptibility phenotype conferred by the resulting plasmids was analyzed by determining the minimal inhibitory concentration (MIC) of kanamycin in broth. A hygromycin resistance (HygR) gene was cloned in the reporter vectors to provide a functional antibiotic marker to select for plasmid transformation and integration. All sequence changes were introduced by site-directed mutagenesis using the primers listed in table S5.
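The six BPS classes named above arise because each of the 12 possible single-base changes is equivalent to one change on the complementary strand. A minimal Python sketch of that collapse follows; the function names `bps_class` and `is_transition` are hypothetical and are not taken from the study's analysis code.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}


def bps_class(ref, alt):
    """Collapse a single-base change into one of the six strand-independent classes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    # Report every change from the purine (A or G) of the original base pair.
    if ref not in PURINES:
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]}>{alt}:{COMPLEMENT[alt]}"


def is_transition(ref, alt):
    """Transitions keep the base type (purine to purine, pyrimidine to pyrimidine)."""
    return (ref.upper() in PURINES) == (alt.upper() in PURINES)
```

For example, a T>C change read on one strand is the same event as A>G on the other, so both are reported as A:T>G:C.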

Principles of the design and evaluation of reporter plasmids

When a pMV361 plasmid carrying an intact aph gene is introduced into mycobacteria, cells produce an active enzyme that modifies and inactivates kanamycin by phosphorylation, conferring kanamycin resistance. However, when the essential histidine at position 132 (His132) is replaced by any other amino acid, the resulting enzyme loses its activity. Accordingly, the mutated aph gene was inactive and unable to confer KanR when inserted in the M. smegmatis chromosome.
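The reporter site at nucleotide positions 394 to 396 of aph and the essential His132 residue refer to the same codon, assuming 1-based nucleotide numbering that starts at the first base of the start codon (an assumption; the numbering convention is not stated here). The arithmetic can be checked with a short Python sketch; the helper names are hypothetical.

```python
def codon_number(nt_pos):
    """1-based codon index containing a 1-based nucleotide position."""
    return (nt_pos - 1) // 3 + 1


def codon_span(codon):
    """First and last 1-based nucleotide positions of a 1-based codon index."""
    first = 3 * (codon - 1) + 1
    return first, first + 2


# Under the stated numbering assumption, codon 132 spans nucleotides 394 to 396.
span_his132 = codon_span(132)
```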

The set of plasmids with specific BPSs or indels was integrated in the M. smegmatis chromosome at the unique att site in both wild-type and ∆nucS strains and selected using hygromycin. The MIC of kanamycin for M. smegmatis strains that carried the integrated mutant reporter plasmids was very low (MIC, 2 μg/ml). By contrast, the intact aph gene conferred a high level of kanamycin resistance (MIC, 128 μg/ml) in both the wild type and its ∆nucS derivative. All the mutated aph alleles are therefore inactive in mycobacteria as the result of a single specific BPS or indel in each reporter plasmid.

The KanR phenotype can be recovered exclusively by a specific BPS that restores a functional wild-type aph gene affecting only the previously modified base. When integrated in the mycobacterial chromosome, each base substitution and indel construction allowed us to measure the rate of each spontaneous point mutation at this specific site. The aph reversion system is capable of detecting very low rates of each type of mutation.

We verified by sequencing that the KanR colonies were generated by reversion at the specific point previously modified by site-specific mutagenesis and not by additional chromosomal mutations. In mycobacteria, high-level KanR can arise through mutations in the rrn genes that modify the 16S ribosomal RNA (rRNA), a structural component of the ribosome. However, M. smegmatis harbors two copies of the rrn gene (rrnA and rrnB), and it is not able to acquire high-level KanR by a single-allele rrn mutation. As a result, high-level KanR (50 μg/ml) cannot be reached by spontaneous mutation at a detectable level (mutation rate, ≤1 × 10⁻¹⁰).

Reversion assay and estimation of mutation rates

Reporter plasmids containing BPSs or indels were transformed into the mc2 155 wild-type and ∆nucS strains by electroporation. All the constructs were integrated at the unique att site in the M. smegmatis genome and selected using hygromycin (50 μg/ml). The mutation rate of each specific mutation was analyzed by a reversion assay, which measures the acquisition of KanR through the emergence of the specific type of mutation able to restore a functional wild-type aph gene. The reversion of each mutation was verified by polymerase chain reaction amplification and sequencing of the aph gene from five independent colonies from each test. We previously verified that no KanR colonies were generated by mutations in the absence of the aph gene by plating cultures of the strains mc2 155 wild type and ∆nucS harboring plasmid pMV361-Hyg-R on plates with kanamycin (see also the Supplementary Materials).

Mutation rates were estimated experimentally using Luria-Delbrück fluctuation assays. For each specific mutation, 16 to 24 independent cultures derived from mc2 155 and ∆nucS strains (carrying the appropriate reporter plasmid) were grown in 10 ml of 7H9 broth and plated on 7H10 solid medium with kanamycin (50 μg/ml) to determine the number of mutants. Viable cells were estimated by plating serial dilutions on 7H10 solid medium without any antibiotic, and plates were incubated for 3 to 5 days at 37°C. Once colonies of viable cells and reversion mutants arose, they were counted to obtain the number of viable cells and mutants for each type of mutation. The distribution of mutant counts between parallel cultures was used to calculate mutation rates. The estimated number of mutations per culture (m) and 95% confidence intervals were calculated using the maximum likelihood estimator, applying the newton.LD.plating, newton.LD, confint.LD.plating, and confint.LD functions (plating functions account for differences in plating efficiency) implemented in the package rSalvador (Q. Zheng, Rsalvador: An assay. R package version 1.7). Last, mutation rates were calculated by dividing m by the total number of generations, assumed to be roughly equal to the average final number of cells. After calculating mutation rates and 95% confidence intervals, a comparative analysis was carried out between wild-type (mc2 155) and NucS-lacking (ΔnucS) strains.
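The calculation described above (a maximum likelihood estimate of m from the mutant counts of parallel cultures, then division by the final cell number) can be sketched in Python. This is not rSalvador and does not reproduce its Newton-type solver, its plating-efficiency correction, or its confidence intervals. It uses the standard recursion for the Luria-Delbrück (Lea-Coulson) mutant distribution and a plain ternary search, which assumes that the log-likelihood has a single peak in m. All function names are hypothetical, and the search range is only adequate for modest mutant counts.

```python
import math


def ld_pmf(m, n_max):
    """Probabilities of 0..n_max mutants per culture for an expected m mutations."""
    p = [math.exp(-m)]
    for r in range(1, n_max + 1):
        # Recursion: p_r = (m / r) * sum_{i<r} p_i / (r - i + 1)
        p.append(m / r * sum(p[i] / (r - i + 1) for i in range(r)))
    return p


def log_likelihood(m, counts):
    """Log-likelihood of the observed mutant counts given m."""
    p = ld_pmf(m, max(counts))
    return sum(math.log(p[c]) for c in counts)


def mle_m(counts, tol=1e-8):
    """Ternary search for the m that maximizes the log-likelihood."""
    lo, hi = 1e-6, max(counts) + 1.0  # assumed search range
    while hi - lo > tol:
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if log_likelihood(a, counts) < log_likelihood(b, counts):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


def mutation_rate(m, final_cells):
    """Mutation rate per cell per generation: m divided by the final cell number."""
    return m / final_cells
```

As a usage example with made-up numbers, `mutation_rate(mle_m([0, 1, 0, 3, 2, 0, 1, 12]), 2e9)` would give the rate for eight hypothetical cultures that each reached about 2 × 10⁹ cells.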


Supplementary material for this article is available at

Table S1. Genomic context of BPSs.

Table S2. Distribution of BPSs in the replicores.

Table S3. Mutation rates and 95% confidence limits for KanR reversion events.

Table S4. Efficiency of BPS and indel repair of different bacterial MMRs obtained from MA experiments.

Table S5. Oligonucleotides used to construct BPSs and indel reporter plasmids.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank J. Rodríguez-Beltrán and J. Ramón Valverde for helping us with statistics and J. L. Martínez for useful comments on the manuscript. Funding: This work was supported by Plan Nacional de I + D + i 2013–2016 and the Instituto de Salud Carlos III, Subdireccion General de Redes y Centros de Investigacion Cooperativa, Ministerio de Economia, Industria y Competitividad, Spanish Network for Research in Infectious Diseases (REIPI RD16/0016/0009), cofinanced by the European Development Regional Fund “A Way to Achieve Europe,” by Operative Program Intelligent Growth 2014-2020 and grants FIS PI17/00159 from the Instituto de Salud Carlos III and SAF2015-72793-EXP from the Spanish Ministry of Science and Competitiveness (MINECO)-FEDER. E.C.-S. was the recipient of a PFIS predoctoral research fellowship (FI18/00036) cofinanced by the Instituto de Salud Carlos III and the European Social Fund. I.C., A.C.-O., and M.T.-P. are funded by projects of the European Research Council (ERC) (638553-TB-ACCELERATE) and Ministerio de Economía y Competitividad research grant SAF2016-77346-R. Author contributions: A.C.-G. designed the project, designed and performed experiments, made M. smegmatis strains, performed MA sequence analysis, and wrote the manuscript. J.B. designed the project, directed the experimental work, obtained funds, and wrote the manuscript. I.M.-B. constructed reporter plasmids, measured mutation rates, and performed statistical analysis. E.C.-S. performed part of the MA experiments and statistical analysis. I.C., M.T.-P., and A.C.-O. performed sequencing and MA sequence analysis. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.
