Saturation of recognition elements blocks evolution of new tRNA identities

See allHide authors and affiliations

Science Advances  29 Apr 2016:
Vol. 2, no. 4, e1501860
DOI: 10.1126/sciadv.1501860


Understanding the principles that led to the current complexity of the genetic code is a central question in evolution. Expansion of the genetic code required the selection of new transfer RNAs (tRNAs) with specific recognition signals that allowed them to be matured, modified, aminoacylated, and processed by the ribosome without compromising the fidelity or efficiency of protein synthesis. We show that saturation of recognition signals blocks the emergence of new tRNA identities and that the rate of nucleotide substitutions in tRNAs is higher in species with fewer tRNA genes. We propose that the growth of the genetic code stalled because a limit was reached in the number of identity elements that can be effectively used in the tRNA structure.

  • tRNA identities
  • genetic code
  • evolution
  • ADAT
  • tRNA Gly


The expansion of the genetic code requires the evolution of new transfer RNA (tRNA) identities cognate to previously nonproteogenic amino acids. A major challenge in synthetic biology is the development of orthologous aminoacylation systems that efficiently integrate new tRNAs into established translation machinery. Similarly, the expansion of the genetic code during early steps in the origin of life required the emergence of new tRNA isoacceptors that allowed the system to evolve from a simple error-prone protein synthesis to a 20–amino acid code that translates genes with an error rate of approximately 1 × 10−4 (13). An important question in this regard is why the universal genetic code did not grow beyond 20 amino acids.

We have investigated the emergence of a family of eukaryote-specific tRNA identity elements required for the modification of tRNAs by a tRNA modification enzyme. Heterodimeric adenosine deaminases acting on tRNAs (ADAT) are eukaryotic-specific enzymes that catalyze the deamination of adenosine 34 (A34) to inosine 34 (I34) at the first position of tRNA anticodons. The presence of I34 enables a single tRNA anticodon to recognize codons ending with C, U, and A. In bacteria, the activity of the ortholog of ADAT [tRNA-specific adenosine deaminase (TadA)] is limited to tRNAArg. In eukaryotes, TadA underwent a process of gene duplication and divergence to give rise to ADAT, whose substrate range is increased to include all A34-containing tRNAs acting in three-, four-, and six-box codon sets, with the sole exception of tRNAGly (4).


We have previously shown that the duplication and subsequent divergence that turned the homodimeric bacterial TadA into the heterodimeric eukaryotic ADAT contributed to the evolution of eukaryotic genomes in terms of both tRNA gene content and codon usage bias (5). In particular, the emergence of ADAT in eukaryotes correlates with a marked enrichment of previously inexistent A34-containing tRNAs and the loss of their synonymous G34-containing tRNA isoacceptors (Fig. 1) (4, 5). Surprisingly, tRNAGly did not follow this evolutionary trend. In contrast to ADAT substrates, tRNAGlyACC is absent in eukaryotic genomes, and tRNAGlyGCC remains the most abundant tRNAGly isoacceptor (Fig. 1).

Fig. 1 tRNAGlyACC remains absent in eukaryotes.

Schematic representation of the frequency of tRNA genes in three-, four-, and six-codon boxes of the genetic code, depending on their N34 sequence and for the three different domains of life. In archaea, all isoacceptors are equally represented, except for A34-containing tRNAs, which are mostly absent. In bacteria, A34-containing tRNAArg, where TadA is present, is overrepresented. In eukarya, where ADAT is present, A34-containing tRNAs are overrepresented, except for tRNAGlyACC, which remains completely absent. Each residue at position 34 of the anticodon is colored according to its tRNA gene abundance.

We investigated the reasons for the intriguing lack of tRNAGlyACC in eukaryotic genomes. First, we asked whether a fundamental functional defect could prevent tRNAGlyACC from being used in eukaryotes. Borén et al. (6) had previously shown that a mutated Escherichia coli tRNAGlyACC is active in elongation. To extend this analysis to eukaryotes, we modified the anticodon of the genomically predominant human tRNAGly isoacceptor (tRNAGlyGCC) to ACC and found that it can be effectively charged by glycyl-tRNA synthetase (GlyRS) (Fig. 2A). Further, we established that the I34 modification does not impair the aminoacylation of tRNAGly (Fig. 2A). These results are in agreement with the observation that GlyRS interactions at position 34 are not sequence-specific (7). Thus, the absence of tRNAGlyACC in eukaryotes is not due to a fundamental structural or functional defect hampering its function in translation.

Fig. 2 tRNAGlyACC can be aminoacylated by GlyRS but not deaminated by ADAT.

(A) A34- and I34-containing tRNAGly are efficiently aminoacylated by human GlyRS. tRNAGlyICC + tRNAGlyACC (dashed line) are the tRNAs remaining from the partial deamination of tRNAGlyACC by ADAT. The proportion of tRNAGlyICC and tRNAGlyACC was quantified: 42.5% of tRNAGlyICC and 57.5% of tRNAGlyACC. (B) tRNAAlaAGC is completely deaminated by ADAT after 15 min of incubation. tRNAGlyACC is only partially modified after 60 min of incubation. (C) Confirmation by sequencing of A to I deamination. Deaminated tRNAAlaAGC displays a G34 instead of an A34, indicating full I34 modification. Partial A to I deamination can be observed for tRNAGlyACC. A tRNAAla with a GlyACC anticodon loop displays an A34, indicating no I34 deamination. (D) Comparison of human anticodon loop sequences for each ADAT substrate and tRNAGly.

We then focused our attention on the potential deamination by ADAT of a tRNAGlyACC, an activity that is dependent on tRNA structure (8). Through electrophoretic mobility shift assays (EMSAs) (fig. S1A), we first showed that human tRNAGly can be bound by ADAT independently of the sequence at position 34. We then used inosine-specific cleavage (9, 10) (fig. S2A) and tRNA sequencing to detect the deamination of tRNAGly by ADAT (11, 12). Human tRNAAla, a natural substrate of ADAT, was chosen as a control because of its high sequence similarity to tRNAGly.

A control tRNA is completely modified by ADAT in 15 min, but tRNAGlyACC is only partially deaminated after 60 min under the same conditions (Fig. 2, B and C). Similar results were obtained with tRNAGly derived from the other human tRNAGly isoacceptors or with in vivo–transcribed tRNAGly (fig. S2, B and C). Thus, tRNAGly is an extremely poor ADAT substrate.

We investigated the reasons for the low activity of ADAT with tRNAGly. Chimeric tRNAs based on combinations of human tRNAAla and tRNAGly were tested for deamination by ADAT and unequivocally showed that the anticodon loop of tRNAGly prevents deamination, but not binding, by the enzyme (Fig. 2C and figs. S1B and S2D). Sequence alignments of tRNA substrates of ADAT with tRNAGly revealed that the anticodon loop of tRNAGlyACC differs from tRNAAlaAGC only at positions 32, 35, and 38 (Fig. 2D). We tested the ability of these positions to block ADAT activity by individually mutating each base in the anticodon of tRNAAla to the sequence of the tRNAGly anticodon loop. Single point mutations did not significantly affect ADAT activity in tRNAAla (fig. S2D).

We then searched among human tRNAs for the ADAT substrate with the closest anticodon loop sequence to tRNAGly. We find that the tRNAValAAC anticodon loop differs from tRNAGlyACC only at position 35 (C35 in tRNAGlyACC and A35 in tRNAValAAC) (Fig. 2D). When an A35C mutation was introduced in the context of a tRNAValAAC, this significantly reduced ADAT activity for this tRNA (fig. S2E). Thus, a cytidine in position 35 of tRNAAla does not significantly affect the activity of ADAT, but the same base prevents ADAT activity in the context of tRNAVal. Similarly, the presence of A38 favors deamination of tRNAGly by ADAT, even though the sequence at position 38 is variable among ADAT substrates (Fig. 2D).

Our results indicate that the recognition of tRNA anticodons by ADAT is structure-dependent and that the inability of the enzyme to modify tRNAGly is not due to a specific sequence but rather to the structural context of this anticodon. To check this hypothesis, we performed state-of-the-art molecular dynamics (MD) simulations to analyze the relationship between sequence, structure, and motion in the anticodons of ADAT substrates and tRNAGly.

We first analyzed through 1 μs (109 integrations of Newton equations of motion) a control tRNA for which a crystallographic structure is available (tRNAPheCUC) and found that the force field can reproduce the known structure. We then studied tRNAGlyGCC and tRNAGlyCCC (built from tRNAPhe), which displayed full conservation of the structural signature of the anticodon loop obtained for tRNAPhe (Fig. 3 and figs. S3 and S4).

Fig. 3 A34 destabilizes tRNAGly anticodon.

(A) Initial structure used to build all the models analyzed by MD. The figure corresponds to the anticodon stem-loop of tRNAPhe [Protein Data Bank (PDB) ID 2TRA] (residues 27 to 43). (B) Final conformations of the studied anticodons after 1 μs of unbiased MD. The two top figures correspond to the final structures obtained for the tRNAGly isoacceptors (with G34 and C34). The two middle structures illustrate the effect of introducing A at position 34 in tRNAGly anticodons. Red arrows point to the more evident distortions. The two bottom structures show that presence of A at position 34 does not affect the stability of tRNAArg and tRNAAla anticodons. (C) Molecular superposition of the initial and final structures, obtained by aligning the phosphate groups. The superimposition above shows the similarity between the initial and final conformations of the simulation for the stable structures. Note the correct orientation of the anticodon loop triplet (residues 34 to 36). The superposition at the bottom highlights the magnitude of the structural distortion in the region of the loop, induced by the presence of A34 in the context of tRNAGly isoacceptors.

The addition of A at position 34 in the two tRNAGly scaffolds previously shown to be stable caused a strong structural distortion of the anticodon (Fig. 3 and figs. S5 and S6), indicating that, in the context of the anticodon stem-loop of tRNAGly, the addition of A34 induces an important structural instability.

We then used the same approach to analyze the structural influence of A34 in the context of C35 or C36 (the ACC glycine anticodon) to determine whether the destabilizing effect of A34 seen in tRNAGly also appeared in other tRNAs. We in silico–constructed tRNAArgACG and showed that an A34-C35–containing anticodon remains stable during the entire microsecond of unbiased simulation (Fig. 3 and fig. S7). Similarly, simulations with tRNAAlaAGC showed that a fully stable structure was obtained with an A34-C36–containing anticodon (Fig. 3 and fig. S8).

Altogether, these results point to a sequence-independent role of positions 32, 35, and 38 in the rejection of tRNAGly by ADAT. Previous reports demonstrated the importance of positions 32, 35, 36, and 38 of tRNAGly for aminoacylation, modification, and accommodation in the ribosomal A-site (1316). On the other hand, Auxilien et al. (8) demonstrated the importance of the anticodon conformation for ADAT activity. We find that the structural features of the tRNAGly anticodon loop are incompatible with the presence of an adenosine at position 34, explaining why an A34-containing tRNA could not evolve and become enriched in eukaryotes after the emergence of ADAT in these species.

As shown, this distortion is not an impediment for aminoacylation. Additionally, the tRNA modification patterns in the anticodon stem-loop of ADAT substrates and of tRNAGly reveal that the former incorporate more modifications than the latter (fig. S9), again pointing to important structural differences between the structure of the tRNAGly anticodon and the same region in ADAT substrates.

We wondered whether the effect of signal saturation on tRNA evolution could be detected at the genomic level. We searched the genomic databases for pairs of related phylogenetic clades with different tRNA gene populations to test whether the rate of nucleotide substitution in tRNA sequences was higher in species with fewer tRNA genes. Indeed, we observed that bacterial species of the Ehrlichia genus (which contain 36 tRNA genes in their genome) have significantly more nucleotide differences between their orthologous tRNA pairs than species of the Escherichia and Salmonella genera, which have a larger number of tRNA genes (89 and 86 tRNA genes, respectively) than Ehrlichia (Fig. 4), a result fully consistent with our hypothesis.

Fig. 4 tRNA sequence divergence is higher in the genome with reduced number of tRNA genes.

Ehrlichia ruminantium str. Welgevonden and Ehrlichia canis str. Jake tRNA genes have significantly more nucleotide differences between their orthologous pairs than the orthologous tRNA pairs between Salmonella enterica subspecies I, serovar Typhimurium strain LT2 (Salmonella typhimurium str. LT2) versus E. coli CFT073. The divergence time of the two pairs of species was predicted to be similar based on 16S rRNA divergence (40- and 37-nt differences for each pair, respectively). See also table S1.

Signal saturation may also limit the evolution of genomic tRNA populations in extant species. If preexisting recognition signals block the evolution of new tRNA identities, then one would expect that tRNA sequences would evolve faster in genomes with smaller numbers of tRNA genes. An example of this situation can be found in mitochondria, whose genomes have low numbers of tRNA genes (17). This reduction in tRNA genes correlates well with the frequent variations in the genetic code in these organelles and with the variability in identity elements and aberrant structures of tRNAs. For example, universally conserved identity elements in cytosolic tRNAAsp (G73), and tRNAAla (G3/U70) are lost in animal mitochondrial tRNAs (18, 19).


Our proposal predicts that systems with saturated tRNA signals would need to evolve different incorporation machineries to add new amino acids to their genetic code. The case of pyrrolysine would seem to contradict this prediction, but pyrrolysine displays a very limited distribution in the phylogenetic tree (20) in (almost exclusively) species that contain very simplified tRNA populations (5). Selenocysteine (Sec), on the other hand, is a proteogenic amino acid that is almost universally incorporated to proteins but not through direct aminoacylation of a cognate tRNA. Instead, Sec incorporation requires an alternative mechanism that depends on the use of a serylated tRNA and a complex set of dedicated components (21). Determining the exact functional constraints that limit the growth of the genetic code would be useful for the design of orthologous tRNA–aminoacyl-tRNA synthetase (ARS) pairs to expand the proteome composition in engineered organisms.

The nucleotide sequence of a canonical tRNA is composed of 76 bases, of which at least 23 positions present completely invariant or semi-invariant nucleotides (2224). Many of these invariant nucleotides are kept fixed to maintain the tRNA shape required by elongation factors and the ribosome. Of the remaining 53 positions, several are required as idiosyncratic identity elements for recognition by ARSs (13), and an undetermined number of positions are important as negative identity elements against noncognate synthetases and as recognition elements for processing, modification, transport, and other functions. On the basis of the biochemical, structural, and genomic data presented here, we propose that the accumulation of structural and functional constraints that operate on tRNAs acts as an operational limit that impedes the evolution of new tRNA identities and, as a result, the incorporation of new amino acids into the genetic code.


Experimental design

To investigate the reasons for the absence of tRNAGlyACC in eukaryotic genomes, we first tested the hypothesis that functional defects could prevent this tRNA from being used. tRNAGlyACC and the other tRNAs mentioned in this paper were transcribed, in vivo or in vitro, and tested for aminoacylation by GlyRS and deamination by ADAT. We then conducted MD simulations to analyze the relationship between sequence, structure, and motion in the anticodons of ADAT substrates and tRNAGly. Finally, we studied the rate of nucleotide substitution in tRNA sequences from phylogenetic clades with different tRNA gene populations (Escherichia, Salmonella, and Ehrlichia genera) to test whether this rate was higher in species with fewer tRNA genes.

Enzyme cloning, expression, and purification

Human cytoplasmic alanyl-tRNA synthetase (AlaRS) and GlyRS complementary DNAs (cDNAs) (NM_001605 and NM_002047, respectively) were cloned into a pQE-70 vector (Qiagen) for bacterial expression of a C-terminal His6-tagged protein. The correct sequence of the cDNA was confirmed by sequencing. BL21 (DE3) cells (Life Technologies), transformed with pQE70-AlaRS or pQE70-GlyRS, were grown at 37°C up to an OD600 (optical density at 600 nm) of 0.6. Protein expression was then induced with 1 mM isopropyl-β-d-thiogalactopyranoside (IPTG) for 3 hours. Recombinant ARS activity in crude cell extracts was assayed by aminoacylation assay. Human GlyRS enzyme was purified on nickel affinity columns according to the manufacturer’s protocol (GE Healthcare Life Sciences). Protein concentration was determined by NanoDrop ND-1000 [A280; molar extinction coefficient (M−1 cm−1) and molecular weight for their respective protein].

Human ADAT2 and ADAT3 cDNAs (NM_182503 and NM_138422, respectively), synthesized by GenScript, were cloned, expressed, and purified by the Institute for Research in Biomedicine (IRB) Protein Expression Facility (N. Berrow, IRB Barcelona). Human ADAT2 and ADAT3 cDNAs were cloned into pOPINFS expression vectors (Oxford Protein Production Facility) as a bicistronic product, resulting in ADAT2 N-terminal His6 tag and ADAT3 C-terminal Strep II tag fusion proteins separated by a ribosome binding site. The construct was expressed in E. coli Rosetta strain grown in Overnight Express Instant TB Medium (Novagen). The heterodimer (ADAT2 and ADAT3) was purified by affinity chromatography with a StrepTrap HP column (GE Healthcare). ADAT2 and ADAT3 were purified at an equimolar ratio. The purified heterodimer ADAT was stored in 50 mM Hepes (pH 8), 100 mM KCl, 1 mM MgCl2, 0.1 mM EDTA, and 2 mM dithiothreitol (DTT) with 20% glycerol.

tRNA sequences

The original tRNA sequences were from the Genomic tRNA Database (25) (

tRNAGlyGCC (Homo_sapiens_Gly-GCC-2-1):


tRNAGlyACC (GCC) (G34A mutation):


tRNAGlyCCC (Homo_sapiens_Gly-CCC-2-1):


tRNAGlyACC (CCC) (C34A mutation):


tRNAGlyUCC (Homo_sapiens_Gly-TCC-2-1):


tRNAGlyACC (UCC) (U34A mutation):


tRNAAlaAGC (Homo_sapiens_Ala-AGC-2-1):


tRNAAlaACC (G35C mutation):


tRNAAlaAGC (U32C; U38C mutation):


tRNAAla anticodon stem-loop Gly ACC:


tRNAAla anticodon loop Gly ACC:


tRNAAla anticodon stem Gly:


tRNAValAAC (Homo_sapiens_Val-AAC-1-1):


tRNAValACC (A35C mutation):


In vitro tRNA preparation

Wild-type or mutated tRNAs were assembled using six DNA oligonucleotides that were first annealed, and then ligated, between Hind III and Bam HI restriction sites of plasmid pUC19. Bst NI–linearized plasmids were used to perform the in vitro transcription with T7 RNA polymerase, according to standard protocols (26). Transcripts were separated by 8 M urea/10% polyacrylamide gel electrophoresis. The tRNA was identified by ultraviolet (UV) shadowing, electroeluted, and ethanol-precipitated. The tRNA pellet was resuspended in H2O and stored at −20°C.

In vivo tRNA preparation

E. coli BL21(DE3) cells expressing the plasmid pUC19-tRNAGly or pUC19-tRNAVal, coding for human tRNAGlyACC or tRNAValAAC, were cultured in LB broth supplemented with ampicillin (100 mg/ml) at 37°C to a cell density of OD600 = 0.6. Expression of tRNA was induced with 1 mM IPTG for 4 hours. Bacterial pellets were resuspended in 0.3 M NaOAc (pH 4.3) and 10 mM EDTA and extracted with water-saturated phenol (pH 4.8). The aqueous fractions were ethanol-precipitated, resuspended in FDM loading buffer (95% formamide, 20 mM EDTA, 0.05% bromophenol blue, and 0.05% xylene cyanol), and resolved using 8 M urea/10% polyacrylamide gel electrophoresis. The tRNA was identified by UV shadowing, electroeluted, and ethanol-precipitated. The tRNA pellet was resuspended in H2O and stored at −20°C.

Aminoacylation assays and tRNA binding experiment

Aminoacylation of tRNA was performed at 37°C in 100 mM Hepes (pH 7.2), 25 μM amino acid, 30 mM MgCl2, 30 mM KCl, 0.5 mM DTT, 5 mM ATP, bovine serum albumin (0.1 mg/ml), [2-3H]-glycine (500 Ci/mol) or l-[3-3H] alanine (500 Ci/mol) (PerkinElmer), and 5 μM tRNA transcripts (refolded at 85°C followed by gradual reduction of temperature). Reactions were initiated by the addition of a crude cell extract or 1 μM pure human GlyRS. Samples (22 μl) were spotted onto Whatman 3MM discs at varying time intervals. Radioactivity was measured by liquid scintillation.

For EMSA, dephosphorylated tRNAs were labeled with [γ-32P]ATP (PerkinElmer) using T4 polynucleotide kinase, refolded at 70°C followed by gradual reduction of temperature, and incubated for 15 min at 4°C with 50 mM Hepes (pH 7.5), 1 mM MgCl2, 50 mM KCl, 0.5 mM DTT, 2% (w/v) glycerol, 2.5 μM oligo(dT)25, and 0, 0.5, 1.5, or 3 μM ADAT. Reactions were separated by electrophoresis onto a 10% polyacrylamide gel (19:1) in 0.5× tris-borate EDTA. The signal was digitalized using a PhosphorImager from a gel-exposed storage phosphor screen.

In vitro deamination assays

For the deamination assays, dephosphorylated tRNAs were labeled with [γ-32P]ATP (PerkinElmer) using T4 polynucleotide kinase, refolded at 70°C followed by gradual reduction of temperature in the presence of 2 mM MgCl2, and incubated for 1 hour at 37°C with 4.2 μM purified ADAT in 50 mM tris-HCl (pH 8.0), 25 mM KCl, 2.5 mM MgSO4, 0.1 mM EDTA (pH 8.0), and 2 mM DTT. Reactions were purified with the Viogene miTotal RNA mini kit (catalog number VTR1002), and tRNA was eluted from the purification column with diethyl pyrocarbonate (DEPC)–treated water.

The nonradioactive tRNAs used for the aminoacylation assay or for the reverse transcription polymerase chain reaction (RT-PCR) were prepared as follows: 20 μg of in vitro–transcribed tRNA was refolded at 70°C followed by gradual reduction of temperature in the presence of 2 mM MgCl2 and incubated at 37°C with 4.2 μM purified ADAT in 50 mM tris-HCl (pH 8.0), 25 mM KCl, 2.5 mM MgSO4, 0.1 mM EDTA (pH 8.0), and 2 mM DTT. After 90 min, 30 μl of purified ADAT was readded to the sample to reach a concentration of 5.6 μM. This last step was repeated to reach a final concentration of 6.3 μM ADAT and re-incubated for 90 min. Reactions were purified with the Viogene miTotal RNA mini kit (catalog number VTR1002), and tRNA was eluted from the purification column with DEPC-treated water. The efficiency of the deamination reaction was measured by the inosine-specific cleavage method after dephosphorylation and the radiolabeling of a small fraction of the treated tRNAGlyACC.

Inosine-specific cleavage

The inosine-specific cleavage treatment of synthetic tRNA was adapted from the treatment of synthetic RNA described previously (9). Glyoxal reactions (100 μl) contained 5 μg of tRNA E. coli (5 μg/μl; Sigma) as carrier, 0.2 pmol of 32P-labeled tRNA previously incubated with ADAT, 10 mM sodium phosphate (pH 7), 50% dimethyl sulfoxide, and 0.6% deionized glyoxal. After 45 min at 37°C, 100 μl of 1 M sodium borate (pH 7.5) was added, and the reaction products were precipitated with 500 μl of ethanol and dissolved in 15 μl of 10 mM tris/1 M sodium borate (pH 7.5) (at 37°C for 20 min). A ribonuclease T1 (0.5 μl; Fermentas EN0541) solution (1000 U/μl) was then added, and the mixture was incubated at 37°C for 30 min. Reactions were stopped by adding proteinase K to 300 ng/μl and incubating at 37°C for an additional 20 min. The reaction was purified with the Viogene miTotal RNA mini kit (catalog number VTR1002), and tRNA was eluted from the purification column with DEPC-treated water. Cleavage of synthetic tRNA was monitored by electrophoresis on a 10% polyacrylamide gel. The inosine-specific cleavage reaction for the in vivo–transcribed tRNAs was performed as above except that the tRNAs were not 32P-labeled but were detected by Northern blot with a 32P-labeled specific probe. The gel images were analyzed and quantified with ImageJ software.

RT-PCR and sequencing

Two micrograms of tRNA (previously incubated with or without ADAT) were retrotranscribed using the High-Capacity cDNA Reverse Transcription Kit (part number 4368814) following the manufacturer’s protocol. The primer used for the RT is oligonucleotide number 6, which is used for the in vitro tRNA preparation of the corresponding tRNA. The thermal cycler conditions were as follows: 10 min at 25°C, 60 min at 37°C, and 5 min at 85°C. Following the retrotranscribed reaction, a standard PCR was performed using 2 μl of cDNA. The primers used were oligonucleotides numbers 1 and 6 of the corresponding tRNA. The PCR cycle was as follows: 94°C, 30 s at 55°C, and 90 s at 72°C (35 cycles); 10 min at 72°C (1 cycle). The PCRs were purified using the NucleoSpin Gel and PCR Clean-up kit (reference number 740609.250; Macherey-Nagel). Samples were sent for sequencing using the same oligos used for the PCR as sequencing primers.

Selection of bacterial species for analysis (few versus many tRNAs)

To evaluate the hypothesis that the constraints in tRNA variation depend on the total number of tRNA genes a species has, we selected bacterial species with either few or many tRNA genes. To evaluate the number of tRNA genes of all the species of bacteria, we made a distribution of their relative numbers using the Genomic tRNA Database (

16S ribosomal RNA species evolutionary divergence

The 16S ribosomal RNA (rRNA) gene for all candidate species of bacteria (with either few or many tRNA genes) analyzed was downloaded from the National Center for Biotechnology Information (NCBI). We estimated the evolutionary divergence of the species by pairwise sequence alignment of the 16S rRNA gene using T-Coffee multiple sequence alignment package version 11.00.8cbe486 and selected those for which the predicted time of divergence was similar (40- and 37-nt differences for each respective pair of species).

We selected E. canis str. Jake and E. ruminantium str. Welgevonden as the representative pair of species with few tRNA genes and E. coli CFT073 and S. enterica subspecies I, serovar Typhimurium strain LT2 (S. typhimurium LT2) as the species with many tRNA genes.

tRNA orthologs pairwise comparison

The orthology of the tRNA genes for each pair of species was preliminarily assessed with the best reciprocal hit (BRH) approach by using BLAST (Basic Local Alignment Search Tool) package version 2.2.28. In addition, to fully resolve the orthology between the pairs of tRNA genes, we constructed a synteny map between each pair of species using their complete genomes.

The complete genome of all species analyzed and their features were downloaded from NCBI. We performed a full-genome alignment between each pair of species (E. canis str. Jake versus E. ruminantium str. Welgevonden and E. coli CFT073 versus S. typhimurium LT2) using the progressive Mauve alignment of the Mauve genome alignment package version 2.3.1 with default parameters and inferred the positional orthologs of the tRNA genes with a minimum coverage of 90% aligned sequence and 90% identity. The flanking genes of each tRNA gene pair in the full-genome alignment of each pair of species had to be the same to be considered syntenic.

Once the orthology was resolved by BRH and synteny, we carried out pairwise alignments of each tRNA ortholog gene from each selected species group using a combination of T-Coffee multiple sequence aligner and its in-built R-Coffee secondary structural alignment package with additional manual curation and calculated the number of differences between the two (table S1).

MD simulations: Limitations and system setup

Predicting the structure and folding of RNA has been particularly challenging using all-atom MD simulations for a number of reasons. (i) Instead of the few hydrogen bond (hbond) pairings observed in DNA, 38 different types of canonical and noncanonical pairing were observed in RNA molecules (27). (ii) This wide variety of distinct possible pairings between RNA bases often involves long-range interactions (even thousands of nucleotides away) (28). (iii) Structural transitions leading to the final folded tertiary structures of RNA are slow (seconds to hours), whereas base pair melting, reshuffling, and tertiary pairing occur in the microsecond to second time scale (29). (iv) Only in tRNA molecules, 107 naturally occurring chemical modifications were identified (30), providing new functional and structural characteristics of canonical nucleobases. For these reasons, we focused our simulations only on the anticodon loop that is composed of a single-stranded RNA of 17 bp, which folds into a stem-like A-RNA, formed by 5 bp stabilized with Watson-Crick (WC) hbonds. The remaining seven nucleobases form a loop stabilized by stacking and hbond interactions, some specific to tRNA anticodon loops (31). Using this reduced system, only a few modified nucleobases are needed, for which the specific force field parameters are known. Moreover, small structural rearrangements occur in the nanosecond time scale (32) where modern force fields have already proven to correctly predict the folded state of small RNA, including loops (33, 34).

All systems were built using, as starting geometry, the high-resolution x-ray structure of tRNAPhe with PDB code 2TRA (35). Each specific sequence studied was obtained by mutating residues 27 to 43 in tRNAPhe (corresponding to the anticodon loop) using ModeRNA (36). The systems were then solvated by adding TIP3P waters in a truncated octahedral box (10 Å around the solute) and neutralized by adding 16 K+. Smith-Dang (37) ion models were considered, and extra salt (K+Cl) was added up to an ionic strength of 0.15 M. Counterions were initially placed randomly, at a minimum distance of 5 Å from the solute and 3.5 Å from one another. All the systems were minimized, thermalized, and preequilibrated using our standard multistep protocol (38) followed by 150 ns of equilibration, putting restraints to keep the U-turn hbond formed (the hbond was restrained by defining a flat-welled parabola, which becomes linear beyond 3.9 Å with a force constant of 5.0 kcal/mol Å2). All the systems were then simulated (production runs) for 1 μs, during which the first 50 ns was used to smoothly remove the restraints applied to the U-turn.

MD simulation details

All systems were simulated in the isothermal-isobaric ensemble (P = 1 atm, T = 298 K) using the Berendsen algorithm (39) to control the temperature and the pressure, with a coupling constant of 5 ps. Center-of-mass motion was removed every picosecond to limit the translational kinetic energy of the solute. SHAKE (40) was used to keep all bonds involving hydrogen at their equilibrium values, which allowed us to use a 2-fs step for integration of Newton equations of motion. Long-range effects were accounted for by using the particle mesh Ewald method (41) with standard defaults and a real-space cutoff of 10 Å. The parmbsc0-χOL3 force field (42, 43) was used to represent RNA interactions. All simulations were done using the pmemd CUDA code (44) of AMBER 14.

Structural signatures of “active” tRNA loops

The crystal structure of the fully unmodified yeast tRNAPhe revealed the structural signatures of what are now considered to be common RNA structural motifs of functional tRNA anticodon loops. In this sense, our structural and dynamic analysis was centered on three different structural signatures thought to be present in functional anticodon loops. (i) A marked backbone turn on the loop region, stabilized by one or more intramolecular hbonds (45). Some of these hbonds have been well characterized (U-turn and A-turn) and are also present in other types of RNA molecules (46). This is particularly true for the U-turn motif observed in the crystal forms of several functional tRNAs, such as tRNAPheGAA (47), tRNACysGCA (48), and tRNALysUUU (49). (ii) Y32 with the χ angle in the anti conformation and correctly stacked with the next base (5′-3′ direction). (iii) A correct relative of the anticodon triplet, pointing toward the wide groove (bases 34 to 36) (35). Apart from these specific features, the correct pairing of the bases in the stem (mostly WC and wobble hbonds) and the fluctuation of the stem around the canonical A-form were also considered as structural signatures required for active anticodon loops.

Tools for the analysis of MD simulations

During production runs, data were collected every 1 ps (leading to a collection of 1 million structures per sequence), which allowed us to study infrequent but fast movements. All the trajectories were postprocessed with the CPPTRAJ module of the AMBERTOOLS 15 package (50), the NAFlex server (51), and local tools developed in the group ( The substates of the torsional angles of the backbone (α and ζ) were categorized following the standard definition: gauche positive (g+) = 60° ± 40°, trans (t) = 180° ± 40°, and gauche negative (g) = 300° ± 40°. All the hbond distances (calculated with CPPTRAJ) were measured from the donor and acceptor heavy atoms and can be considered to be formed below 3.5 Å. The backbone turn, the stacking interactions at position 32, and the relative orientation of the anticodon trimer were addressed by visual inspection of the trajectories and by creating molecular graphics using VMD (52). All the trajectories were clustered using the average linkage clustering algorithm (53) based on the root mean square deviation of the backbone-heavy atoms to quantify the time “spent” in each conformational substrate. The conformation of the stem was analyzed in the helical space along simulations using Curves+ (54).


Supplementary material for this article is available at

fig. S1. Electrophoretic mobility shift assays.

fig. S2. Deamination assays.

fig. S3. Dynamics of the tRNAGlyGCC anticodon loop.

fig. S4. Dynamics of the tRNAGlyCCC anticodon loop.

fig. S5. Dynamics of the tRNAGlyACC (GCC) anticodon loop.

fig. S6. Dynamics of the tRNAGlyACC (CCC) anticodon loop.

fig. S7. Dynamics of the tRNAArgACG anticodon loop.

fig. S8. Dynamics of the tRNAAlaAGC anticodon loop.

fig. S9. Analysis of the posttranscriptional modification patterns of anticodon stem-loops of ADAT substrates and tRNAGly isoacceptors in eukaryotes.

table S1. Number of nucleotide differences between pairs of orthologous tRNA genes for the selected species.

Reference (55)

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank D. Söll, H. Grosjean, and L. Filonava for comments and suggestions. M.O. and P.D.D. thank the Barcelona Supercomputing Center for CPU/GPU time on MareNostrum/MinoTauro. P.D.D. is a PEDECIBA (Programa de Desarrollo de las Ciencias Básicas) and an SNI (Sistema Nacional de Investigadores) (ANII, Uruguay) researcher. Funding: This work was supported in part by the Spanish Ministry of Economy and Competitiveness (grants BIO2012-32200, Sev-2012-0208, and BIO2012-32868 to L.R.d.P., F.A.K., and M.O., respectively) and by the Catalan Government (grants 2014-SGR-0771, 2014-SGR-0974, and 2014-SGR-0134 to L.R.d.P., F.A.K., and M.O., respectively). This work was also supported by the Howard Hughes Medical Institute International Early Career Scientist Program (55007424), by a European Research Council (ERC) Starting Grant (335980_EinME to F.K.), and by a grant from the ERC (ERC_SimDNA to M.O). A.G.T. and C.B. are funded by the Spanish Ministry of Economy and Competitiveness (FPDI-2013-17742 and BES-2013-064004, respectively). Author contributions: A.S.-L., E.M.N., and N.C. cloned the constructs, purified the tRNAs and the recombinant GlyRS, and performed all biochemical assays. E.M.N. helped in the analysis of the tRNA gene frequencies. A.S.-L. collected and analyzed data with L.R.d.P. C.B. collected, processed, and (with F.A.K.) analyzed orthologous tRNA pairs. P.D.D. performed and (with M.O.) analyzed the MD simulations. A.G.T. analyzed the posttranscriptional modification patterns. L.R.d.P. designed the study. L.R.d.P. and A.S.-L. wrote the paper with contribution from the rest of the authors. All authors discussed the results and commented on the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All the data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.
View Abstract

Stay Connected to Science Advances

Navigate This Article