Research Article: MICROBIOLOGY

Significantly enhancing production of trans-4-hydroxy-l-proline by integrated system engineering in Escherichia coli

Science Advances  22 May 2020:
Vol. 6, no. 21, eaba2383
DOI: 10.1126/sciadv.aba2383


Trans-4-hydroxy-l-proline is produced by trans-proline-4-hydroxylase with l-proline through glucose fermentation. Here, we designed a thorough “from A to Z” strategy to significantly improve trans-4-hydroxy-l-proline production. Through rare codon selected evolution, Escherichia coli M1 produced 18.2 g L−1 l-proline. Metabolically engineered M6 with the deletion of putA, proP, putP, and aceA, and proB mutation focused carbon flux to l-proline and released its feedback inhibition. It produced 15.7 g L−1 trans-4-hydroxy-l-proline with 10 g L−1 l-proline retained. Furthermore, a tunable circuit based on quorum sensing attenuated l-proline hydroxylation flux, resulting in 43.2 g L−1 trans-4-hydroxy-l-proline with 4.3 g L−1 l-proline retained. Finally, rationally designed l-proline hydroxylase gave 54.8 g L−1 trans-4-hydroxy-l-proline in 60 hours almost without l-proline remaining—the highest production to date. The de novo engineering carbon flux through rare codon selected evolution, dynamic precursor modulation, and metabolic engineering provides a good technological platform for efficient hydroxyl amino acid synthesis.


Hydroxyprolines (Hyps) are l-proline (l-Pro) derivatives hydroxylated at C3 or C4 with different enantioselectivities (1). Hyps have been widely used in fields such as biochemistry, food production, and cosmetics development (1). Hyps are also heavily involved in many secondary metabolic pathways for the biosynthesis of various drugs, such as chemotherapeutic actinomycin, antifungal echinocandin, antibiotic etamycin, and anti-inflammatory oxaceprol (2). Trans-4-hydroxy-l-Pro (4-Hyp) is among the most widely used Hyp compounds.

Traditionally, 4-Hyp has been extracted from the acid hydrolysis of animal collagen using a polluting and energy-intensive process. The alternative chemical synthesis of 4-Hyp from imidazole compounds is costly and inefficient (3). In recent years, researchers have explored the microbiological manufacture of 4-Hyp. A specialized class of trans-prolyl-4-hydroxylases (P4Hs) has been reported to convert l-Pro into 4-Hyp, with P4H from Dactylosporangium sp. RH1 showing particularly high activity (4). Shibasaki et al. (5) constructed a recombinant Escherichia coli W1485 strain harboring Datp4h gene encoding P4H, which produced 4-Hyp at a concentration of 41 g liter−1 in 100 hours when fed with l-Pro.

However, using l-Pro as the feed for 4-Hyp production is uneconomical owing to its high cost (US $20/kg). Many l-Pro analogs, such as 3,4-d,l-dehydroproline and l-azetidine-2-carboxylate, have been used to screen for l-Pro overproducers in which the feedback sensitivity of γ-glutamyl kinase in the pathway is disrupted (6). Unfortunately, some of these analogs are degraded by l-Pro dehydrogenase and yield false positives in the form of strains with l-Pro dehydrogenase mutations (7). Furthermore, mutagenesis and screening studies have shown that mutations in permease genes are 100-fold more frequent than the targeted mutations, as most l-Pro analogs can act as substrates for l-Pro transport systems (8). Recently, Zheng et al. reported that genes containing rare codons had lower transcription levels and could suffer from undertranslation owing to the scarcity of appropriate transfer RNAs (tRNAs). These “rare” tRNAs might be used up for the translation of more “common” genes with higher transcription levels when the cell has insufficient levels of amino acids. This can lead to a decline and even stagnation in the translation of genes containing these rare codons (9). Increasing the intracellular concentration of amino acids could ensure most tRNAs remain charged with amino acids, reducing competition and ensuring the normal translation of rare codons. A rare codon selection system has recently been applied to the screening of l-leucine, l-arginine, and l-serine overproducers from mutant libraries (9).

Similar to l-isoleucine dioxygenases (IDOs) that convert l-isoleucine (l-Ile) into 4-hydroxyisoleucine (4-HIL), P4Hs are α-ketoglutarate (α-KG)–dependent mononuclear nonheme iron-containing enzymes. They use α-KG and O2 as substrates when catalyzing the hydroxylation of l-Pro, which is coupled with the oxidation of α-KG to succinate. Therefore, the α-KG concentration has a significant effect on l-Pro hydroxylation. Zhang et al. overexpressed α-KG dehydrogenase (ODHC) gene and knocked out isocitrate lyase gene aceA in E. coli, where exogenous IDO shunted the flux of α-KG to the pathway of 4-HIL production (10). The other approach to redirecting the flux of α-KG to 4-HIL was ODHC inhibition. Man et al. (11) used a variant ribosome binding site to replace the natural site of the odhA gene in Corynebacterium crenatum to reduce the ODHC activity, transferring carbon flux to the l-arginine pathway. Novel synthetic small RNAs (sRNAs), consisting of the RNA-chaperone and binding sequences of targeted genes, have recently been used to degrade target mRNA (12) to enhance putrescine production in E. coli (13). sRNA has also been used to regulate the transcriptional level of odhA to reduce the activity of ODHC in Corynebacterium glutamicum, resulting in an increased l-glutamate yield (14). However, direct inhibition of ODHC to reduce α-KG flux significantly affected cell growth at the early stage, resulting in a longer fermentation process.

Furthermore, dynamic regulation has better flexibility and adjustability between growth and production, switching flux redirection only at certain key moments (15). Zhang et al. used biosensor Lrp to dynamically regulate ODHC activity in C. glutamicum, which resulted in an excellent 4-HIL yield of 34.21 g liter−1 (16). These results proved that 4-HIL production was enhanced without affecting cell growth at the early stage. However, the application of similar biosensors to other biosynthesis pathways is currently limited.

In this study, we performed rare codon–selected evolution (17) to improve l-Pro production from glucose in both E. coli and Serratia marcescens JNB5-1. Metabolic engineering strategies were then conducted to further enhance l-Pro biosynthesis. Furthermore, a tunable circuit based on quorum sensing was used for dynamic attenuation of ODHC activity (18) to increase the supply of α-KG by switching the flux of the tricarboxylic acid (TCA) cycle to 4-Hyp biosynthesis. Last, a double mutant derived from esnapd2 with higher catalytic conversion efficiency through genome mining and rational design was used to increase 4-Hyp production and l-Pro consumption. This study reports a novel strategy for the modification of 4-Hyp producer, combining amino acid screening with metabolic engineering and dynamic control of key enzymes in the TCA cycle. This strategy will be applicable as a general modification method for the high-yielding production of other hydroxyl amino acids.


Rare codon selection of l-Pro in E. coli and S. marcescens

l-Pro is encoded by four synonymous codons, namely, CCA/CCC/CCG/CCT (Fig. 1A) (19), among which CCC has the lowest usage of 0.55% in E. coli. To verify whether the rare codon–based selection system was effective for l-Pro screening, 7 and 15 codons for Pro in the native kanamycin resistance gene (kanR) were substituted with CCC, giving kanR-H and kanR-A, respectively (Fig. 1B). S. marcescens JNB5-1, which endogenously converts l-Pro into the colored secondary metabolite prodigiosin, was also selected to verify the validity of rare codon selection for l-Pro. Adding l-Pro deepened the color of the culture, which roughly reflects, optically, an increasing amount of prodigiosin (fig. S1A). S. marcescens JNB5-1 is also resistant to many antibiotics. We finally selected the apramycin resistance gene (apmR), which S. marcescens JNB5-1 does not possess, and substituted 11 and 16 l-Pro codons of the native apmR gene (apmR-W) with CCA (0.27% usage in S. marcescens), as shown in Fig. 1A, giving apmR-H and apmR-A, respectively (Fig. 1B). All rare codon–substituted genes were cloned into pET28a with kanR and apmR genes and then transformed into E. coli BL21. Others with only apmR were transformed into S. marcescens JNB5-1. Strains cultured in M9 medium showed no significant growth difference, even in 1× Luria-Bertani (LB) medium.
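The codon substitution described above can be sketched in a few lines of Python. This is only an illustration, not the authors' design procedure: the function name `substitute_pro_codons`, the toy sequence, and the choice to replace the first N proline codons in reading order are all assumptions, since the text does not say which positions in kanR or apmR were changed. Because all four codons encode l-Pro, the protein sequence is unchanged.

```python
# Hypothetical helper: swap proline codons for a chosen rare proline codon.
PRO_CODONS = {"CCA", "CCC", "CCG", "CCT"}


def substitute_pro_codons(cds, rare_codon="CCC", max_subs=None):
    """Return (new_cds, n_replaced) for an in-frame coding sequence.

    Proline codons other than `rare_codon` are replaced in reading order
    until `max_subs` replacements have been made (all of them if None).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if rare_codon not in PRO_CODONS:
        raise ValueError("rare_codon must be one of the four proline codons")

    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    replaced = 0
    for idx, codon in enumerate(codons):
        if max_subs is not None and replaced >= max_subs:
            break
        if codon in PRO_CODONS and codon != rare_codon:
            codons[idx] = rare_codon
            replaced += 1
    return "".join(codons), replaced


if __name__ == "__main__":
    # Toy open reading frame (not kanR): Met-Pro-Lys-Pro-Pro-Pro-stop.
    toy = "ATG" + "CCG" + "AAA" + "CCT" + "CCC" + "CCA" + "TAA"
    partial, n_partial = substitute_pro_codons(toy, "CCC", max_subs=2)
    full, n_full = substitute_pro_codons(toy, "CCC")
    print(partial, n_partial)
    print(full, n_full)
```

Calling the function with `max_subs` mimics a partially substituted variant (such as kanR-H), and calling it without `max_subs` mimics a fully substituted one. The analogy is loose, because the real variants were defined by 7 and 15 specific substitutions.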

Fig. 1 Rare codon–based selection for E. coli and S. marcescens JNB5-1.

(A) l-Pro codon usages in E. coli (red ●) and S. marcescens (blue ●). (B) On the basis of different l-Pro codon preferences in both strains, 7 and 15 Pro codons of kanR were substituted with CCC for E. coli, giving kanR-H and kanR-A, respectively. For S. marcescens JNB5-1, apmR was selected, and 11 and 16 of its l-Pro codons were substituted into CCA in the same manner as kanR, giving apmR-H and apmR-A, respectively. Native kanR and apmR were named as kanR-W and apmR-W, respectively. (C) All genes in (B) were cloned into pET28a, and the resultant plasmids were transformed in E. coli BL21 (gray background) and S. marcescens JNB5-1 (white background), which were then inoculated into 0.2× LB liquid medium for 24 hours. In particular, kanamycin was used to select strains harboring pET28a-kanRs, while apramycin was used to select strains harboring pET28a-apmRs. (D) All strains in (C) were cultured in 0.2× LB with l-Pro (1 g liter−1; red lines) and 0.2× LB without feeding with l-Pro (blue lines) for 24 hours, respectively. *P < 0.05.

Fig. 2 Workflow of rare codon–based selection assisted by in vivo evolution in E. coli.

MP6ts was first transformed into E. coli BL21, giving E. coli/MP6ts, which was then cultured in 1× LB containing 0.1 mM arabinose to induce mutation for 24 hours at 30°C. The resulting strains were then cultured at 42°C to guarantee elimination of MP6ts, which showed growth inhibition in the LB medium with a corresponding antibiotic. The libraries of E. coli BL21 were made into competent cells and transformed with pET28a-A for l-Pro overproducer selection. The cells exhibiting superior growth in 0.2× LB (with the corresponding antibiotic) were selected and inoculated into TY medium (50 ml) for 24 hours. The l-Pro concentration in these cultures was measured by HPLC.

These strains were then inoculated in nutrient-restricted 0.2× LB medium, where obvious growth disparities in both E. coli BL21/pET28a-kanRs and S. marcescens JNB5-1/pET28a-apmRs were observed (Fig. 1C). The difference in OD600 (optical density at 600 nm) was up to 2.15-fold between the E. coli BL21/pET28a (EKW) and E. coli BL21/pET28a-A (EKA) strains and up to 4.71-fold between the S. marcescens/pET28a-Wa (SW) and S. marcescens/pET28a-Aa (SA) strains. E. coli BL21 harboring the pET28a-apmRs plasmids was used to test how the CCA substitutions affected antibiotic resistance in a host where CCA is not rare. The differences between these strains were undetectable, as the CCA codon had a fairly high usage in E. coli (Fig. 1C). These findings indicated that the effect of rare codon screening depends on the codon usage of the host and varies between strains.

Amino acid exposure experiments were also performed to verify cell growth recovery by adding l-Pro (1 g liter−1) to the 0.2× LB medium (Fig. 1D). The results indicated that the growth of E. coli BL21 harboring pET28a-apmRs remained unchanged. However, the OD600 was restored by 84.2% with the addition of l-Pro for EKA and was recovered by 150.0% in SA after 18 hours. In recent studies, efficient l-Pro biosynthesis has been successfully achieved in both E. coli (13) and C. glutamicum (20, 21) through rational reconstruction using innovative synthetic biology tools. Other techniques, such as multiplex automated genome engineering (22), have shown advantages in creating large-scale genomic diversity for high-throughput metabolic engineering, although an efficient method for the screening of l-Pro overproducers remains to be developed. Our results suggested that screening for strains exhibiting l-Pro overproduction using the rare codon selection method was feasible.

Effect of E. coli and S. marcescens JNB5-1 mutants on l-Pro production from glucose

The results above showed that the rare l-Pro codon had an inhibitory effect on cell growth by retarding the translation of antibiotic resistance proteins, and that supplying l-Pro effectively alleviated this inhibition. This provided a new method for screening l-Pro overproducers. To apply this system to the screening of bacteria producing l-Pro, two rare codon plasmids, pET28a-A and pET28a-Aa, were used. The mutant library of E. coli BL21 was prepared as shown in Fig. 2. Notably, the original replicon in plasmid MP6 had been replaced with the temperature-sensitive replicon pSC101 ori (MP6ts), such that the plasmid could be eliminated before further work. MP6ts was transformed into E. coli BL21 before induction for mutation. After 24 hours of incubation at 30°C with arabinose, the culture was incubated at 42°C to cure the plasmid for construction of a mutant library. Atmospheric and room temperature plasma (ARTP) was used to build a mutation library of S. marcescens JNB5-1. Both libraries of E. coli BL21 and S. marcescens JNB5-1 were made into competent cells and transformed with pET28a-A and pET28a-Aa, respectively, for l-Pro overproducer selection. The resulting mutants were cultured on kanamycin or apramycin in 0.2× LB and then isolated for further culture.

After 12 hours of cultivation in 0.2× LB containing kanamycin or apramycin, the average OD600 values of the E. coli and S. marcescens JNB5-1 mutants were 0.183 and 0.165, respectively. Twenty strains of E. coli mutants with the highest growth were selected, and their l-Pro contents were determined after 24 hours in a shake flask containing TY medium. Twelve of the strains showed increased l-Pro productivity, among which strain PM-14 reached a concentration of 0.816 g liter−1 (Fig. 3A), outperforming the wild type more than twofold. Among the S. marcescens mutants, 31 strains with the highest growth were selected to be cultured in TY medium for prodigiosin yield verification in a shake flask. Sixteen of them showed an enhanced prodigiosin yield (fig. S1B), with LK-18 presenting a prodigiosin yield 3.3 times higher than that of S. marcescens JNB5-1 (Fig. 3B) and an l-Pro yield increased by 30.2% (Fig. 3C). To further verify whether the enhanced prodigiosin production was a result of the increase in endogenous l-Pro, pigI (first step converting l-Pro toward prodigiosin) was knocked out in both the wild-type and the mutant strain LK-18. The results were consistent with expectations, with the l-Pro yield found to be 2.6 times that of S. marcescens JNB5-1∆pigI. After eliminating plasmids for the selection of mutant strains, no significant growth difference was observed between the mutant and wild types in both E. coli and S. marcescens JNB5-1, indicating that the mutants maintained some level of genetic stability (Fig. 3D). Although possible changes to the mRNA secondary structure have been found in both the kanRs and apmRs genes (fig. S2A), most of the selected mutants showed no obvious changes to the transcription of l-Pro tRNA, except for PM-10, PM-11, and PM-17, which showed slight up-regulation of tRNA (GGG) (fig. S2B).

The key to rare codon selection was that increasing the cytoplasmic concentration of a targeted amino acid can increase the concentration of charged tRNA for efficient expression of resistance genes, making it possible to screen for amino acid–overproducing strains (9). S. marcescens JNB5-1 has previously been used to screen l-Pro overproducers as an indirect support of rare codon selection validity, measuring the synthesis of prodigiosin from l-Pro (23). Furthermore, the highest reported yield of l-Pro (100 g liter−1) was produced by S. marcescens SP511 (24). LK-18 was outstanding among the mutant strains from rare codon selection, affording a higher yield of prodigiosin than that from the wild-type strain fed with 1 g liter−1 of l-Pro (fig. S1A). The accumulated l-Pro titer was increased by 2.5-fold in the LK-18 strain in which pigI, which encodes an enzyme involved in the prodigiosin pathway, was deleted. Unlike the S. marcescens JNB5-1 library, which was mutated by ARTP, our E. coli mutant library was generated in vivo using MP6ts, which had the advantages of simplifying library construction and facilitating adaptive evolution, thereby enabling rare codon selection–assisted evolution.

Fig. 3 Rare codon–based selection for E. coli and S. marcescens.

(A) l-Pro concentration of various E. coli mutants screened by the rare codon selection system after cultivation in fermentation medium at 37°C for 24 hours. (B) Color comparison between S. marcescens JNB5-1 wild-type (WT) and mutant LK-18 cultured in fermentation medium at 28°C for 8, 16, and 32 hours, respectively. (C) l-Pro and prodigiosin production of S. marcescens JNB5-1 wild-type and mutant LK-18, and those of pigI deleted in corresponding strains, all cultivated in fermentation medium at 28°C for 12 hours. (D) Growth of S. marcescens JNB5-1 and mutant LK-18 (solid line and dashed line combined with quadrate node, respectively), and of E. coli BL21 and its mutant PM-14 (solid line and dashed line combined with trigonal node, respectively) in M9 medium with corresponding antibiotics.

Promoting l-Pro biosynthesis using metabolic engineering to concentrate carbon flux to l-Pro and release its feedback inhibition

To figure out which step in the l-Pro biosynthesis pathway was responsible for increased l-Pro production, RNAs from both the wild-type and PM-14 strains were extracted at the logarithmic phase in 0.2× LB medium for comparative transcriptome analysis in GENEWIZ. Transcriptomics of the l-Pro pathway between PM-14 and the wild-type are shown in Fig. 4A, with positive and negative values representing up-regulated and down-regulated expression of pathway genes, respectively.

Fig. 4 Transcriptional profiles of l-Pro biosynthesis pathway and further metabolic strategies for overproducing l-Pro in PM-14.

(A) After harvesting strains at the exponential phase in 0.2× LB medium, transcriptome profiles of the Embden–Meyerhof–Parnas (EMP) pathway, the TCA cycle, and l-Pro–related biosynthesis and metabolism pathways in PM-14 were compared with those of the wild type (WT). Red and blue circles represent positive (up-regulated genes) and negative (down-regulated genes) values, respectively, calculated as the log2 ratio of reads per kilobase of transcript per million mapped reads of PM-14 to that of the wild type. (B) qRT-PCR verification of key genes in the l-Pro biosynthesis pathway of both PM-14 and LK-18, followed by a conceptual map of l-Pro overproduction. G6P, 6-phosphoglucose; F6P, 6-phosphofructose; FBP, 1,6-bisphosphofructose; GAP, 3-phosphoglyceraldehyde; GBP, 1,3-diphosphoglycerate; 3GP, 3-phosphoglycerate; 2PG, 2-phosphoglycerate; PEP, phosphoenolpyruvate; PYR, pyruvate; AcCoA, acetyl-CoA; OAA, oxaloacetate; CIT, citrate; IsoCit, isocitrate; AKG, α-ketoglutarate; SucCoA, succinyl-CoA; Succ, succinate; Fum, fumarate; Mal, malate; Glu, l-glutamate; Pro, proline; P5C, Δ1-pyrroline-5-carboxylate.
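The values plotted in Fig. 4A are log2 ratios of reads per kilobase of transcript per million mapped reads (RPKM) between PM-14 and the wild type. A minimal sketch of that calculation follows, using the standard RPKM definition. The function names and the optional pseudocount are assumptions made for illustration, and the numbers in the usage lines are invented rather than taken from the study.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_ratio(rpkm_mutant, rpkm_wild_type, pseudocount=0.0):
    """log2(mutant / wild type); positive means up-regulated in the mutant.

    A small pseudocount (an assumption, not stated in the text) avoids
    division by zero for genes with no reads in one sample.
    """
    return math.log2((rpkm_mutant + pseudocount) / (rpkm_wild_type + pseudocount))


if __name__ == "__main__":
    # Invented counts for one gene in two libraries of equal depth.
    mutant = rpkm(read_count=4000, gene_length_bp=1000, total_mapped_reads=10_000_000)
    wild_type = rpkm(read_count=1000, gene_length_bp=1000, total_mapped_reads=10_000_000)
    print(log2_ratio(mutant, wild_type))
```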

Metabolic pathways related to l-Pro synthesis include glycolysis, the TCA cycle, the l-Pro synthesis pathway, and the l-Pro transport system. In the analysis of PM-14, genes encoding glycolytic enzymes were found to have higher expression levels compared with the wild-type strain, which was consistent with the higher glucose consumption of PM-14. In the TCA cycle, icd and acn, encoding isocitrate dehydrogenase and aconitase, showed significantly increased expression, which enhanced the flux to α-KG. Conversely, the aceA and glnA genes, whose products are disadvantageous for l-Pro accumulation, showed lower expression in the mutant strain. The proB gene, encoding the rate-limiting enzyme in the pathway, had a much higher expression level in the mutant strain. Quantitative reverse transcription polymerase chain reaction (RT-qPCR) was used to verify the expression of four key genes (pyk, acn, proB, and glnA) in both PM-14 and LK-18 strains. The results showed that pyk, acn, and proB had increased expression, while expression of glnA, involved in l-Pro catabolism, had decreased. The expression of proP and putP, which encode components of the l-Pro transport systems, increased 2.3-fold and 3.1-fold in PM-14, respectively. PutP and ProP have previously been reported to assist l-Pro uptake during l-Pro starvation (25), so this up-regulation might enhance l-Pro uptake under such conditions. These two genes are involved in transporting extracellular l-Pro into the cell, and their transcription was promoted, especially when the amount of l-Pro in the medium was insufficient to efficiently express the resistance protein, resulting in the restoration of growth. However, in the subsequent construction of l-Pro chassis cells, l-Pro is supposed to accumulate outside the cell rather than be continually transported back in. After knocking out the two genes, the l-Pro yield in the medium was further improved.

Starting with mutant M1, derived from PM-14 with pET28a-A eliminated, the proP, putP, putA, and aceA genes were removed to achieve maximum preservation of l-Pro flux in the downstream metabolic engineering (Fig. 4B). Furthermore, we introduced D107N and E143A mutations into the proB gene to eliminate feedback inhibition (table S1) (26). The l-Pro titer of M1 reached 18.2 g liter−1 after batch fermentation in a 7.5-liter fermenter fed with fermentation medium, while that of M6 was, to our surprise, 2.06 times higher, reaching 37.5 g liter−1 (fig. S3A). The highest production concentration of l-Pro, produced by strain M6, was comparable to the recently reported titer in E. coli (13), underlining the potential for industrial scale-up. Furthermore, M6 showed the lowest amount of by-products (fig. S3B).

Effect of dynamic modulation of ODHC activity on α-KG flux to 4-Hyp

To efficiently produce 4-Hyp, Datp4h, driven by constitutive promoter Ptac-M, was cloned into pDXW-10 vector and transformed into strain M6R to construct strain HYPA. Recombinant strain HYPA yielded 4-Hyp at a concentration of 15.7 g liter−1, with 18.1 g liter−1 of l-Pro remaining (table S2). To drive α-KG toward efficient hydroxylation, ODHC should be inhibited. However, directly blocking flux in the TCA cycle by knocking out ODHC might affect cell growth (27). Metabolic branch points for α-KG were therefore targeted on the basis of the hypothesis that a controllable switch in the circuit would be advantageous for 4-Hyp production through dynamic tuning of sucA expression. The native promoter of sucAB was substituted with the PeaS promoter, whose transcription was triggered upon binding with transcriptional regulator EsaRI70V (encoded by esaR) when no 3-oxohexanoylhomoserine lactone (AHL) was present. In contrast, the active transcription of the PeaS promoter was inhibited by gradual accumulation of intracellular AHL produced by esaI (encoding AHL synthase), which disrupted the binding of EsaRI70V to the PeaS promoter. Different AHL accumulation rates could therefore down-regulate the PeaS promoter at different times to obtain suitable α-KG moderation (Fig. 5A). Therefore, gene esaI was integrated to construct M6R, in which artificial promoters Pbss (28) with different intensities (strength, Pbs1 < Pbs2 < Pbs3 < Pbs) were inserted in front of esaI to drive AHL synthesis with various timings, generating strains M7-1, M7-2, M7-3, and M7-H. To test the performance of this quorum sensing–based switch, pDXW-10 with the gfpdt gene (carrying a C-terminal degradation tag for rapid degradation) driven by the PeaS promoter was transformed into the M7 series, generating new strains M7-1gfp, M7-2gfp, M7-3gfp, and M7-Hgfp, respectively. M7-Hgfp displayed the weakest fluorescence intensity as the PeaS promoter was inhibited by generated AHL (Fig. 5B), which indicated dynamic control of green fluorescent protein (GFP) expression. To regulate ODHC activity, the original promoter of sucA was replaced with PeaS in the M7 series, giving M8-1, M8-2, M8-3, and M8-H, into which pDXW-10-Datp4h was subsequently transformed, generating strains HYP1, HYP2, HYP3, and HYPH, respectively, with M6R/pDXW-10-Datp4h serving as a control, denoted as HYPA. Both the enzymatic activity of ODHC and the qRT-PCR of sucA showed no decrease in HYPA, while ODHC in other strains declined to different extents in a time-dependent manner (Fig. 5C). HYPH suffered from growth arrest due to higher esaI expression. Although the ODHC activity of HYP2 was decreased by 31.2% after 24 hours, its biomass remained similar to that of HYPA. Succinic acid resulting from l-Pro hydroxylation might contribute to supplying the TCA cycle to sustain growth. In M8-1, where the native promoter of sucA was replaced with PeaS but with no pDXW-10-PeaS-Datp4h plasmid inside, l-Pro production increased by 14.7% compared with M6R (fig. S3C). Zhang et al. (16) improved 4-HIL biosynthesis by switching flux from α-KG to hydroxyl amino acids using Lrp-PbrnFE as an l-isoleucine biosensor to dynamically control ODHC activity. Similarly, a pathway-independent quorum sensing circuit was used to dynamically regulate ODHC activity (18), which had advantages of substantial regulation, an economical inducer, and industrial scalability. In our system, esaI was driven by Pbss promoters of varying strengths to control AHL generation, which could be transplanted into Bacillus subtilis or Saccharomyces cerevisiae owing to the high compatibility of Pbss (28).
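The timing logic of this circuit, in which stronger esaI expression leads to faster AHL accumulation and earlier shutdown of the PeaS promoter, can be illustrated with a deliberately simple toy model. Everything in the sketch is an assumption made for illustration: logistic growth, AHL production proportional to cell density and promoter strength, a Hill-type repression curve, and all parameter values. It is not fitted to the data in this study and ignores AHL degradation and dilution.

```python
def switch_time(esaI_strength, t_end=48.0, dt=0.01, mu=0.5, x0=0.01,
                x_max=1.0, k_ahl=1.0, K=1.0, n=2.0, threshold=0.5):
    """Return the first time (h) at which modeled promoter activity drops
    below `threshold`, or None if it never does within `t_end` hours.

    All parameters are hypothetical and in arbitrary units.
    """
    x, ahl, t = x0, 0.0, 0.0
    while t < t_end:
        # Hill-type repression: activity falls from 1 toward 0 as AHL rises.
        activity = 1.0 / (1.0 + (ahl / K) ** n)
        if activity < threshold:
            return t
        # Forward Euler step for logistic growth and AHL accumulation.
        dx = mu * x * (1.0 - x / x_max)
        dahl = k_ahl * esaI_strength * x
        x += dx * dt
        ahl += dahl * dt
        t += dt
    return None


if __name__ == "__main__":
    # Invented relative strengths standing in for Pbs1 < Pbs2 < Pbs3 < Pbs.
    for name, strength in [("Pbs1", 0.1), ("Pbs2", 0.3), ("Pbs3", 1.0), ("Pbs", 3.0)]:
        print(name, switch_time(strength))
```

In this toy model a stronger promoter always switches earlier, which mirrors the qualitative behavior reported for the M7 and M8 series. The actual switching times depend on parameters that the text does not provide.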

Fig. 5 Dynamic regulation of ODHC activity to improve 4-Hyp production.

(A) Schematic diagram of dynamic regulation of ODHC based on quorum sensing circuit. Native promoter of sucAB was substituted with the PeaS promoter, and its transcription was triggered upon binding with transcriptional regulator EsaRI70V, with no AHL present. Active transcription of PeaS promoter would be inhibited by gradual accumulation of intracellular AHL produced by esaI, disrupting the binding of EsaRI70V to the PeaS promoter. Various esaI expression levels driven by four different Pbs promoters led to dynamic accumulation of AHL, which down-regulated the PesaS promoter at variable times to obtain suitable α-KG moderation. This promoted 4-Hyp production by Datp4h exogenously cloned in pDXW-10. (B) pDXW-10 with the gfpdt gene driven by the PeaS promoter was transformed into the M7 series. Five fluorescence profiles of strains (M7s with pDXW-10-PeaS-gfpdt) were measured to test the performance of this quorum sensing–based switch. a.u., arbitrary units. (C) Transcriptional level of sucA and ODHC activity in HYPA, HYP1, HYP2, HYP3, and HYPH at 0, 6, 12, 18, and 24 hours, respectively. The strains were previously cultured in TY medium at 37°C to reach the logarithmic phase. Data are the mean values derived from triplicate measurements, and error bars show the SD (n = 3).

Efficient production of 4-Hyp by genome mining and rational design of engineered strain P4H

The superior HYP2 harboring Datp4h reached 43.2 g liter−1 of 4-Hyp, which was 2.75 times that achieved by HYPA (fig. S3D), but 4.3 g liter−1 of l-Pro still remained. Further improvement of P4H activity was required to raise the conversion rate of l-Pro. Genome mining has been used to effectively mine P4Hs, because few P4Hs have been found and characterized in microorganisms (29). In this study, the P4H from an uncultured bacterium encoded by esnapd2 (E2), derived from the evolutionary tree based on a homology-based search by Sun et al. (29), was selected for further experiments and showed distinct activity for l-Pro hydroxylation (fig. S4A). The pocket alignment of E2 and previously reported P4Hs showed that their catalytic centers were almost identical except for L170 and P172 (fig. S4B). E2P172D and E2P172N showed 16.1 and 26.6% increases in enzyme activity, respectively, while the L170A substitution in E2L170A might contribute to the 1.38-fold higher 4-Hyp yield compared with that of E2WT (fig. S4C). Double mutant E2L170A/P172N increased the enzyme activity by 27.2% and the 4-Hyp yield by 42.3%. Docking results showed that the L170A and P172N substitutions could introduce extra hydrogen bonds with substrates (Fig. 6, A and B), which might help stabilize the intermediate state. Furthermore, L170 was changed into the smaller residue Ala, which optimized substrate channeling (fig. S4, D and E). To further validate our hypothesis, molecular dynamics (MD) simulations of the protein complexes were performed using GROMACS 2018.4 software. The root-mean-square fluctuation (RMSF) of the loop spanning residues 170 to 175 of E2L170A/P172N was lower than that of E2WT (Fig. 6E), and the “mouth” toward the substrate binding pocket seemed to be more outstretched from conformational snapshots of both E2WT and E2L170A/P172N (Fig. 6, C and D), corresponding to their mode of motion (fig. S4, F and G). The Rosetta tool Cartesian_ddg was also used to calculate ΔΔG between enzyme and l-Pro, which indicated that E2L170A/P172N (ΔΔG = −3.27 kJ mol−1) promoted the stability of the enzyme-ligand complex (Fig. 6F).

Fig. 6 Structural and molecular dynamic analysis of E2WT and E2L170A/P172N.

(A and B) Docking results of (A) E2WT and (B) E2L170A/P172N. Plots were prepared using Protein Ligand Interaction Profiler (PLIP) (40) and Pymol tools. Catalytic triplets were presented as red sticks and mutational residues as green sticks, while hydrogen bonds, hydrophobic interactions, and salt bridges are shown as blue, gray, and yellow dashed lines, respectively. (C and D) Superposed snapshots of the MD trajectories of (C) E2WT and (D) E2L170A/P172N mutants after 30-ns MD simulations of the protein complexes by GROMACS. (E) Differences in RMSF values calculated from MD simulations between E2WT and E2L170A/P172N in all residues. The loop containing the selected residues is highlighted in orange. (F) Rosetta was used to calculate the binding energy of point mutations. ΔΔG was measured by Cartesian_ddg of a Rosetta tool between mutants and substrate l-Pro.

To further enhance 4-Hyp production, esnapd2L170A/P172N was cloned into pDXW-10 and further transformed into M8-2, giving HYP2M. Fed-batch fermentation of HYP2M was performed in a 7.5-liter bioreactor (fig. S5). Generally, 0.8 g liter−1 of 4-Hyp was produced in 6 hours. The final titer of 4-Hyp reached 54.8 g liter−1, which was 1.27 times that of HYP2. Furthermore, HYP2M showed the lowest glucose consumption, leading to the highest yield of 0.236 g g−1. The l-Pro concentration in HYP2M eventually decreased to 0.2 g liter−1 in 60 hours. Previous studies (5, 30) reported that recombinant E. coli harboring Datp4h produced 4-Hyp from fed l-Pro at 25 g liter−1 with glycerol and at 41 g liter−1 with glucose as the carbon source. Compared with these previous results, our 4-Hyp production method gave the highest titer.
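The fold changes quoted for the HYP series, and the volumetric productivities they imply, follow directly from the titers and fermentation times reported in the text. The short sketch below reproduces that arithmetic; the function names are illustrative only, and the productivity figures are simple averages over the whole run rather than values reported by the authors.

```python
def fold_change(new_titer, reference_titer):
    """Ratio of two titers given in the same units (g per liter)."""
    return new_titer / reference_titer


def volumetric_productivity(titer_g_per_l, hours):
    """Average productivity over the whole run, in g per liter per hour."""
    return titer_g_per_l / hours


if __name__ == "__main__":
    # 4-Hyp titers (g per liter) quoted in the text.
    hypa, hyp2, hyp2m = 15.7, 43.2, 54.8

    print(round(fold_change(hyp2, hypa), 2))    # HYP2 relative to HYPA
    print(round(fold_change(hyp2m, hyp2), 2))   # HYP2M relative to HYP2

    # HYP2M reached its titer in 60 hours; the earlier l-Pro-fed process
    # cited in the introduction reached 41 g per liter in 100 hours.
    print(round(volumetric_productivity(hyp2m, 60), 3))
    print(round(volumetric_productivity(41.0, 100), 3))
```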


In summary, carbon flux was redirected through rare codon selection evolution, dynamic modulation of precursors, and metabolic engineering to improve 4-Hyp production from glucose. First, using rare codon selection–based in vivo evolution, high producers of l-Pro from glucose as substrate were screened. Metabolic engineering was then conducted to capture l-Pro by deleting putA, proP, putP, and aceA and mutating proB to concentrate carbon flow to l-Pro and release its feedback inhibition. The dynamic regulation of ODHC activity using a quorum sensing circuit on sucA shunted α-KG to 4-Hyp. Last, genome mining and rational design of the P4H used in the engineered strain were performed to raise 4-Hyp production to 54.8 g liter−1 in 60 hours with barely any l-Pro remaining. To our knowledge, this is the highest 4-Hyp titer reported to date. This work provides a good technology platform for the preparation of hydroxyl amino acids from glucose. Further work will be conducted to demonstrate the general applicability of hydroxylation to other proteinogenic amino acids using α-KG–dependent hydroxylases, such as l-Ile/l-Leu/l-Met by IDO, l-Asn/l-Asp by AsnO/Ask, and l-arginine by VioC, for secondary metabolite biosynthesis (31).



All chemicals mentioned were purchased from Sinopharm Chemical Reagent Co. Ltd. (Beijing, China) if not otherwise specified. l-Pro and 4-Hyp were purchased from Sigma-Aldrich (Shanghai, China). Restriction endonucleases were purchased from Takara Bio Inc. (Dalian, China). ClonExpress II One Step Cloning Kit and Phanta Max (p515) DNA polymerases were purchased from Vazyme Biotech Co. Ltd. (Nanjing, China). Oligonucleotides were synthesized by GENEWIZ Bio Inc. (Suzhou, China).

Strains, plasmids, and media

The strains and plasmids involved in this study are listed in table S3. The MP6 and CRISPR systems for gene editing were purchased from Addgene. kanR-H, kanR-A, apmR-H, apmR-A, esaI, esaR, and Datp4h were synthesized by GENEWIZ Bio Inc. (Suzhou, China). E. coli BL21 and Serratia marcescens JNB5-1 (S. marcescens JNB5-1) were cultured in Lysogeny Broth (LB) for vector construction and in 0.2× LB for selection. Apramycin (50 μg ml−1), kanamycin (25 μg ml−1), or streptomycin (50 μg ml−1) was added to LB when needed.

Construction of plasmids and recombinant strains

The primers used in this study are listed in table S4. Vectors were constructed through homologous recombination using the ClonExpress II One Step Cloning Kit and transformed into E. coli by heat shock or electroporation. The Datp4h gene (accession number ANH21194.1) and esnapd2 (accession number AGS49339.1) were amplified from pUC19-Datp4h and pUC19-esnapd2 using Tp4h-F and Tp4h-R as well as E2-F and E2-R, respectively, and were subsequently cloned into pDXW-10, generating pDXW-10-Datp4h and pDXW-10-E2, respectively. Point mutations in esnapd2 were introduced with the corresponding primers. kanR-H and kanR-A were amplified by kan-1 and kan-3 from pUC19-kanR-H and pUC19-kanR-A, respectively; similarly, apmR, apmR-H, and apmR-A were amplified from pUC19-apmR, pUC19-apmR-H, and pUC19-apmR-A, respectively, using apm-1 and apm-3. Then, kanR-H and kanR-A were subcloned into vector fragments amplified by kan-2 and kan-4 from pET28a, while apmR, apmR-H, and apmR-A were subcloned into vector fragments amplified by apm-2 and apm-4, giving pET28a-H, pET28a-A, pET28a-Wa, pET28a-Ha, and pET28a-Aa, respectively. pET28a, pET28a-H, pET28a-A, pET28a-Wa, pET28a-Ha, and pET28a-Aa were transformed into E. coli BL21 to generate EKW, EKH, EKA, EAW, EAH, and EAA, respectively. In addition, pET28a-Wa, pET28a-Ha, and pET28a-Aa were transformed into S. marcescens JNB5-1, generating SW, SH, and SA, respectively. pTargetF and pCas were used for gene editing on the E. coli genome (32), while single guide RNA (sgRNA) was designed by sgRNAcas9 (33). pTargetF-putA, pTargetF-putP, pTargetF-proP, pTargetF-aceA, pTargetF-proB, and pTargetF-PeaS-sucA were amplified from pTargetF by PutAsg-F and PutAsg-R, PutPsg-F and PutPsg-R, ProPsg-F and ProPsg-R, AceAsg-F and AceAsg-R, ProBsg-F and ProBsg-R, and SucAsg-F and SucAsg-R, respectively, and were then constructed through homologous recombination. The gfp gene with a degradation tag, driven by PeaS, was amplified using primers GFP-F and GFP-R before being recombined into pDXW-10, giving pDXW-10-PeaS-gfpdt. The pSC101 origin of replication (ori) was then amplified from pCas by MP6ts-1 and MP6ts-3 and recombined into MP6, giving the temperature-sensitive MP6ts.

The E. coli mutant PM-14, isolated by rare codon selection, was identified as having the best performance with regard to l-Pro biosynthesis. After pET28a-A was eliminated through continuous passage, the strain was named M1. M1/pCas was then generated by transforming M1 with the pCas vector. Homologous arms upstream and downstream of putA were amplified by primers putA-1 and putA-2 as well as putA-3 and putA-4, respectively. They were joined by overlap PCR to generate the homologous arm putALR, which was then transformed into M1/pCas along with pTargetF-putA. After 12 hours of cultivation at 30°C, positive transformants surviving on an LB plate containing both kanamycin and streptomycin were verified by genomic PCR for putA deletion. pTargetF-putA was eliminated by adding 0.1 mM isopropyl-β-d-thiogalactopyranoside (IPTG) to the culture, while pCas was eliminated by cultivation at 42°C. The resultant stable strain after three subsequent generations was named M2. Similar methods were used to build M3 (M2∆putP), M4 (M3∆proP), M5 (M4∆aceA), and M6 (M5proB*) with the corresponding pTargetF vectors and primers. esaRI70V was cloned into the putP locus in M6, generating M6R. esaI driven by Pbs promoters was inserted at the putA locus, giving M7-1, M7-2, M7-3, and M7-H, respectively. The sucA promoter was then replaced by PeaS in these strains, giving M8-1, M8-2, M8-3, and M8-H, respectively, into which pDXW-10-Datp4h was finally transformed to generate HYP1, HYP2, HYP3, and HYPH, respectively; HYPA (M6R harboring pDXW-10-Datp4h) served as a control in this group. Sequences upstream and downstream of pigI (pigIL and pigIR) were amplified by PigI-Delete-Up-F and PigI-Delete-Up-R as well as PigI-Delete-Down-F and PigI-Delete-Down-R, respectively, before they were combined with apmR through overlap PCR. The resultant pigIL-apmR-pigIR was subsequently subcloned into pUTmini Tn5-Km, giving pXW1805. pXW1805 was transformed into E. coli S17-1, giving E. coli S17-1/pXW1805, which was mated with S. marcescens JNB5-1 for pigI deletion. The same method was used for LK-18∆pigI construction.

Screening l-Pro overproducers from mutation libraries

The ARTP mutation system (Beijing Bestsqy Biotechnology Co. Ltd.) was used to generate mutation libraries of S. marcescens JNB5-1, as it leads to greater genome damage than traditional ultraviolet mutagenesis (34). High-purity helium was discharged in a high-frequency electric field to generate a plasma that causes diverse damage to DNA structure and activates SOS repair in cells, generating a variety of mismatch sites in the genome. Wild-type S. marcescens JNB5-1 grown to an OD600 of 0.2 to 0.4 in 1× LB was selected for ARTP treatment at 10 standard liters per minute and 100 W. In preliminary tests, 10 μl of the culture was dropped onto slides and then exposed to an ARTP jet for 10, 20, and 30 s, giving lethality rates of 86.5, 96.1, and 99.3%, respectively. To prepare mutant libraries of S. marcescens JNB5-1, wild-type cells treated with ARTP for at least 10 s were washed in 1.5-ml sterilized Eppendorf tubes containing 800 μl of LB medium and incubated at 37°C for 1 to 2 hours. To prepare libraries of E. coli mutants, MP6ts was first transformed into E. coli BL21 to generate E. coli/MP6ts. E. coli/MP6ts was then cultured in LB at 30°C with 0.1 mM arabinose for 24 hours to induce mutation. The strain was then cultured at 42°C to eliminate MP6ts. Loss of MP6ts from the resultant culture was confirmed by growth inhibition in LB medium containing the corresponding antibiotic. The libraries of S. marcescens JNB5-1 and E. coli BL21 were made into competent cells and transformed with vectors pET28a-Aa and pET28a-A, respectively, for l-Pro overproducer selection. Cells exhibiting superior growth in 0.2× LB (with the corresponding antibiotic) were selected and inoculated into 50 ml of TY medium for 24 hours. The l-Pro concentration in these cultures was measured by high-performance liquid chromatography (HPLC).


For flask cultivations, cells stored at −80°C were activated in LB at 37°C and 210 rpm overnight with 25 μg ml−1 kanamycin. The culture (0.5 ml) was then transferred to a 500-ml flask containing 50 ml of TY medium. For further fed-batch fermentation, the culture (OD600, 3 to 8) was inoculated into a 7.5-liter bioreactor that contained 3 liters of the fermentation medium: yeast extract (8 g liter−1), peptone (12 g liter−1), NaCl (3 g liter−1), citric acid (2.1 g liter−1), (NH4)2SO4 (2.5 g liter−1), and glucose (20 g liter−1), supplemented with 15 ml of trace metal solution: FeSO4·7H2O (10 g liter−1), ZnSO4·7H2O (1 g liter−1), CoCl2·6H2O (2.5 g liter−1), MnCl2·4H2O (15 g liter−1), CuCl2·2H2O (1.5 g liter−1), and H3BO3 (3 g liter−1). The culture, with an initial OD600 of less than 0.3, was fermented at pH 7.0, 37°C, and 2.0 liter min−1 of airflow. The dissolved oxygen level was maintained at 40% through automatic control of the agitation speed between 600 and 1000 rpm. The pH was regulated by the addition of 30% (w v−1) NH3·H2O. Glucose [80% (w v−1)] was added to maintain a low residual concentration of 10 g liter−1.
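
As a convenience for preparing the 3-liter batch, the component masses follow directly from the listed concentrations. The sketch below is illustrative only; the dictionary and function names are hypothetical, and it assumes each concentration refers to the final 3-liter volume.

```python
# Fermentation medium concentrations from the text, in g/liter.
FERMENTATION_MEDIUM_G_PER_L = {
    "yeast extract": 8.0,
    "peptone": 12.0,
    "NaCl": 3.0,
    "citric acid": 2.1,
    "(NH4)2SO4": 2.5,
    "glucose": 20.0,
}


def batch_masses(recipe_g_per_l: dict, volume_l: float) -> dict:
    """Scale a g/liter recipe to the mass (g) needed for a given volume."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc * volume_l for name, conc in recipe_g_per_l.items()}


# Masses for the 3-liter working volume stated in the text.
masses = batch_masses(FERMENTATION_MEDIUM_G_PER_L, 3.0)
for component, grams in masses.items():
    print(f"{component}: {grams:.1f} g")
```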

Transcriptome analysis

E. coli BL21 and PM-14 were grown to an OD600 of 0.8, harvested as cell pellets, frozen in liquid nitrogen, and delivered on dry ice to Vazyme Biotech Co. Ltd. (Nanjing, China) for transcriptome analysis. RNA was extracted from each strain, and rRNA was removed using a Ribo-Zero Kit. The purified rRNA-depleted RNA was fragmented and used as a template to synthesize complementary DNA (cDNA), which was subsequently subjected to end repair, adenine addition, and adaptor ligation for PCR amplification. The resultant cDNA libraries were sequenced using an Illumina HiSeq 2000 sequencer, with E. coli BL21 (GenBank ID: AM946981.2) as the reference genome. A total of 10,146,936 reads were matched to the reference genome for the wild type, and 9,330,264 reads for PM-14. Differentially expressed genes between the wild type and PM-14 were identified using a false discovery rate of ≤0.001 and a fold change of |log2ratio| ≥1.
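
The two thresholds for differential expression can be expressed as a simple filter. The following is a minimal sketch, not the pipeline used by the sequencing provider; the gene names and values are made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2_ratio: float  # log2(PM-14 / wild type)
    fdr: float         # false discovery rate


def is_differentially_expressed(result: GeneResult,
                                fdr_cutoff: float = 0.001,
                                log2_cutoff: float = 1.0) -> bool:
    """Apply both criteria from the text: FDR <= 0.001 and |log2 ratio| >= 1."""
    return result.fdr <= fdr_cutoff and abs(result.log2_ratio) >= log2_cutoff


# Hypothetical example data (not from the paper).
results = [
    GeneResult("geneA", 2.3, 0.0001),    # passes both criteria
    GeneResult("geneB", -1.0, 0.001),    # passes: both thresholds are inclusive
    GeneResult("geneC", 0.4, 0.00001),   # fails: fold change too small
    GeneResult("geneD", 3.0, 0.02),      # fails: FDR too high
]

degs = [r.gene for r in results if is_differentially_expressed(r)]
print(degs)
```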

Real-time quantitative polymerase chain reaction

Both mutants, PM-14 and LK-18, as well as their wild types, were harvested in logarithmic phase. Total RNA was extracted using a bacterial RNA extraction kit. After reverse transcription using HiScript III RT SuperMix for qPCR (+gDNA wiper), selected transcripts were amplified using the AceQ Universal SYBR qPCR Master Mix with the indicated primers (table S4) in the StepOnePlus system with SYBR Green I detection, using 16S rRNA as the internal control. qPCR was performed with an initial denaturation at 95°C for 5 min, followed by 40 cycles of 10 s at 95°C and 30 s at 60°C. Melting curve analyses were performed for 15 s at 95°C, 60 s at 60°C, and 15 s at 95°C. PCR products of the targeted gene samples were measured in triplicate. Data were analyzed according to the 2^−ΔΔCT method (35). A similar method was used for qRT-PCR analysis of the l-Pro tRNAs, which were first deacylated as reported (36), with 5S rRNA as the internal control.
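
For readers unfamiliar with the 2^−ΔΔCT calculation, a minimal sketch follows. It assumes the standard form of the method (target normalized to the internal control, then mutant normalized to wild type); the function name and the CT values in the example are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the 2^-ddCT method.

    dCT  = CT(target) - CT(internal control), computed per condition
    ddCT = dCT(sample) - dCT(control condition)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical CT values: the target crosses threshold two cycles earlier
# in the mutant, with identical internal-control CTs.
fold = relative_expression(
    ct_target_sample=20.0, ct_ref_sample=10.0,    # mutant
    ct_target_control=22.0, ct_ref_control=10.0,  # wild type
)
print(fold)
```

A difference of two cycles corresponds to a fourfold change because each cycle is assumed to double the product, which holds only when amplification efficiency is close to 100%.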

Enzyme assay

Crude cell extracts were used for enzyme activity measurements. Cells cultured for 24 hours were harvested by centrifugation at 4°C and 10,000 rpm for 10 min. They were then washed twice with tris-HCl buffer (0.1 M, pH 7.0) before sonication for 15 min on ice, after which the sonicated mixture was centrifuged at 10,000 rpm for 30 min at 4°C to remove cell debris. The supernatant was filtered and used for the enzyme activity assays. To measure P4H activity, 0.1 ml of the crude extract was added to a reaction mixture composed of 20 mM l-Pro, 20 mM α-KG, 1 mM FeSO4·7H2O, and 5 mM ascorbic acid in Hepes buffer (50 mM, pH 7.0) and incubated at 30°C for 30 min. HPLC was used to detect 4-Hyp after the reaction. To evaluate the activity of ODHC, cells of HYPA, HYP1, HYP2, HYP3, and HYPH were sampled at 0, 6, 12, 18, and 24 hours during fermentation. The activity of ODHC in 0.1 ml of these extracts was assessed by a previously reported method (37), measuring the initial rate of change in NADH (reduced form of nicotinamide adenine dinucleotide) absorbance at 365 nm using a microplate reader (Biotek Instruments Inc., USA). The protein concentration of the crude extract was determined by the Bradford method with bovine serum albumin as the standard.

Analytical procedure

Target strains were cultured in TY and/or fermentation medium until the end of fermentation. The culture was centrifuged at 10,000 rpm for 10 min, and the supernatant was appropriately diluted before measurement of the l-Pro and other amino acids secreted into the culture medium. The fermentation supernatant was analyzed after precolumn derivatization. Briefly, 200 μl of diluted supernatant was mixed with 100 μl of phenylisothiocyanate (1.25% v v−1) and 100 μl of triethylamine (14% v v−1) dissolved in acetonitrile and left to react for 1 to 2 hours. After derivatization, 400 μl of n-hexane was added to the mixture, and the lower aqueous phase was collected and loaded onto a 250 × 4.6 mm Platisil 5-μm octadecylsilane column. Samples were eluted with a gradient of mobile phase A [46.3 mM sodium acetate and 7% acetonitrile (pH 6.5)] and mobile phase B (80% acetonitrile v v−1) at 1 ml min−1 as follows: 0 to 2 min, A 100%; 2 to 15 min, A 100 to 88%; 15 to 25 min, A 88 to 70%; 25 to 32 min, A 70 to 50%; 32 to 33 min, A 50 to 0%; 33 to 38 min, A 0 to 100%. Amino acids were detected by ultraviolet absorption at 254 nm.
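
The gradient program can be written as a table of breakpoints from which the mobile phase composition at any time is obtained. The sketch below assumes linear ramps between the listed breakpoints, which the text does not state explicitly; the names are hypothetical.

```python
# (time in min, % mobile phase A) breakpoints taken from the gradient in the text.
GRADIENT = [(0.0, 100.0), (2.0, 100.0), (15.0, 88.0), (25.0, 70.0),
            (32.0, 50.0), (33.0, 0.0), (38.0, 100.0)]


def percent_a(t_min: float) -> float:
    """Percent mobile phase A at time t, ASSUMING linear ramps between breakpoints."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0 to 38 min program")
    for (t0, a0), (t1, a1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole program")


def percent_b(t_min: float) -> float:
    """Percent mobile phase B is the remainder of the binary mixture."""
    return 100.0 - percent_a(t_min)


for t in (0, 10, 20, 33, 38):
    print(f"{t:>2} min: A = {percent_a(t):.1f}%, B = {percent_b(t):.1f}%")
```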

Structure modeling and analysis

The Protein Fold Recognition Server (Phyre2) was used for prediction of the three-dimensional structure model (38). All MD simulations were performed using GROMACS version 2018.1 with the AMBER force field. Enzymes were placed in periodic dodecahedral boxes with a padding of 12 Å filled with transferable intermolecular potential with 3 points (TIP3P) water. After steepest-descent energy minimization, 2-ns canonical ensemble (NVT) simulations were run at 298 K in 2-fs steps. Finally, 30-ns simulations were run at 298 K in 2-fs steps. Cartesian_ddg from Rosetta was used to calculate ΔΔG between enzyme and product (39).
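
Because MD run lengths are usually specified as a number of integration steps, the stated durations and time step translate as follows. This is only an arithmetic sketch with a hypothetical helper name; it does not reproduce the authors' input files.

```python
FS_PER_NS = 1_000_000  # 1 ns = 10^6 fs


def n_steps(duration_ns: float, timestep_fs: float) -> int:
    """Number of integration steps needed to cover a duration at a given time step."""
    if timestep_fs <= 0:
        raise ValueError("time step must be positive")
    return round(duration_ns * FS_PER_NS / timestep_fs)


nvt_steps = n_steps(2, 2)    # 2-ns NVT stage at 2 fs per step
main_steps = n_steps(30, 2)  # 30-ns simulation at 2 fs per step
print(nvt_steps, main_steps)
```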

Statistical analysis

Each experiment was conducted at least twice, with a minimum of three independent biological replicates. Error bars represent SEM calculated using the function STDEV in Microsoft Excel. Statistical analysis was performed using GraphPad Prism 5.0, and statistical significance was set at P < 0.05.
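
Excel's STDEV returns the sample standard deviation, and the SEM is conventionally obtained by dividing it by the square root of the number of replicates. The sketch below shows that relationship; whether this division was applied in the original analysis is not stated in the text, and the replicate values are hypothetical.

```python
import math
import statistics


def sem(values: list) -> float:
    """Standard error of the mean: sample SD (n - 1 denominator) divided by sqrt(n)."""
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical triplicate measurements (not from the paper).
replicates = [10.0, 12.0, 14.0]
sd = statistics.stdev(replicates)  # same quantity as Excel's STDEV
print(f"mean = {statistics.mean(replicates):.2f}, SD = {sd:.2f}, SEM = {sem(replicates):.4f}")
```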


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: Funding: This work was supported by National Key Research and Development Program of China (2018YFA0900300) and the National Natural Science Foundation of China (31870066), Project funded by the China Postdoctoral Science Foundation (2016T90421), the Fundamental Research Funds for the Central Universities (JUSRP51708A), Project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions, the 111 Project (111-2-06), National First-Class Discipline Program of Light Industry Technology and Engineering (LITE2018-06), Key Research and Development Program of Ningxia Hui Autonomous Region (2019BCH01002), and Jiangsu province “Collaborative Innovation Center for Modern Industrial Fermentation” industry development program and the science and technology innovation team foundation of Ningxia Hui Autonomous Region (KJT2017001). Author contributions: M.L., M.X., and Z.R. conceived and designed the experiments. M.L., Z.M., and X.P. performed the construction of strains. J.Y., M.H., and Y.S. performed fermentation of l-Pro and 4-Hyp. M.L. performed MD simulations as well as Rosetta computation. T.Y. and X.Z. analyzed the data. M.L. wrote the paper. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.