Oxidative Folding of Proteins in Escherichia coli

May 1, 2012
Mehmet Berkmen
Volume 25, Issue 5

The author highlights novel strains and methods that have recently been shown to express multidisulfide bonded proteins.

Expressing active recombinant proteins that require post-translational modifications, such as disulfide bonds, is difficult in E. coli. Disulfide bond formation in E. coli is compartmentalized to the periplasm, and the periplasm lacks the capacity to produce complex multidisulfide bonded eukaryotic proteins. Novel expression strains and procedures are in demand to serve the growing pharmaceutical and biotechnological fields.

Successful over-expression of a heterologous protein in the conventional prokaryotic host Escherichia coli is a highly unpredictable process and remains a major bottleneck for the biotechnological industry (1). After the polypeptide has been translated from mRNA, many proteins require additional post-translational modifications such as phosphorylation, glycosylation, ubiquitination, S-nitrosylation, methylation, N-acetylation, lipidation, and proteolysis, to name a few. These modifications are catalyzed by dedicated enzymes that must be regulated both in space (different compartments of the cell) and in time. Further chaperoning by a dedicated set of proteins may also be required for a protein to achieve its correctly folded, active state (2).


One major form of post-translational modification is the formation of covalent disulfide bonds. Disulfide bonds are more common than generally appreciated: after the peptide bond, they are the second most common covalent bond found within proteins (3). An estimated one-third of human proteins reside in the endoplasmic reticulum, and at least half of those are predicted to have disulfide bonds (3). When expressing an open reading frame (ORF) from an uncharacterized organism or an unknown source (as in the case of environmental DNA libraries), a researcher is challenged to find the correct expression host and conditions to produce soluble, active protein at a satisfactory yield. When attempting to express a protein that requires disulfide bonds for its folding, a sufficient understanding of the mechanism of disulfide bond formation and its biological roles is essential. This review summarizes the knowledge needed to help the researcher find the correct conditions to express a disulfide bonded protein in the model prokaryotic host E. coli.


Disulfide bonds are formed by the oxidation of thiol groups (SH) found within the side chains of cysteines. Disulfide bonds play multiple critical roles in protein stability and function, and these roles can be grouped into three major biological categories: structural, signaling, and catalytic.

Redox state of cysteines

Cysteines involved in the formation of structural disulfide bonds decrease the entropy of a protein by restricting its conformational possibilities, which increases the protein's thermostability (4). It is therefore possible to create more stable versions of a protein by engineering disulfide bonds into the protein's sequence (5). Not surprisingly, this stabilizing property is most likely the reason why secreted proteins, which are outside the chaperone-rich environment of the cytoplasm, are rich in disulfide bonds. However, cysteines are uniquely sensitive to their environment and can be readily reduced or oxidized depending on the redox state of their surroundings (6). Many proteins exploit this feature to sense, signal, and regulate the redox state of their environment.

For example, the transcription factor OxyR has two redox-sensitive cysteines which, upon oxidation, promote a conformational change that activates OxyR as a transcription factor (7). Other signaling disulfide bonds can be found in the two-component signal transduction system ArcAB and in the oxidative stress response regulator RsrA (8, 9). Catalytic cysteines are crucial to the activity of oxidoreductases and are found within the CxxC active site motif, where x is any amino acid. The active site cysteines of reductases such as the cytoplasmic thioredoxin are maintained reduced, whereas those of oxidases such as the periplasmic DsbA are maintained oxidized (10, 11).

Predicting disulfide bonds

When expressing a protein, it is crucial to know whether the protein requires disulfide bonds in order to achieve its final correctly folded state. However, predicting the presence of disulfide bonds is not easy. Although there have been attempts at building computer algorithms which claim to predict the presence of disulfide bonds, currently no such service is in popular use (12–14). Instead, there are two important biological clues which can assist the researcher in predicting whether their protein of interest should be expressed in an oxidative environment which promotes the formation of disulfide bonds.

Numbers and conservation of cysteines

Due to their chemical reactivity, cysteines are among the rarest amino acids used within polypeptides. A cysteine within a protein is less likely to be the result of random mutation and more likely to have been selected for its function. This is usually the case for cysteines that are highly conserved, especially if the conservation spans proteins from diverse organisms. Thus, highly conserved cysteines are indicative of an important function and may require the protein to be expressed in an oxidative environment. Furthermore, structural disulfide bonds form between pairs of cysteines, resulting in a strong tendency for cysteines to occur in even rather than odd numbers. This phenomenon was used to analyze the cysteine content of various prokaryotic and archaeal genomes, which allowed for predictions on the redox state of their proteomes (15, 16). Thus, an even number of cysteines is a signature of a disulfide bonded protein.
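The even-number clue described above can be checked in a few lines of code. The following is a minimal sketch, not a validated predictor: the function name `cysteine_profile` and the toy sequence are hypothetical, and an even cysteine count is only a hint, not proof, that a protein is disulfide bonded.

```python
def cysteine_profile(sequence: str) -> dict:
    """Count cysteines in a one-letter protein sequence and report parity.

    Illustrative only: an even count is a heuristic signature, not evidence
    that the cysteines are actually paired in disulfide bonds.
    """
    seq = sequence.upper()
    n_cys = seq.count("C")
    return {
        "length": len(seq),
        "cysteines": n_cys,
        # Fraction of residues that are cysteine (0.0 for an empty sequence)
        "fraction": n_cys / len(seq) if seq else 0.0,
        # True only for a non-zero, even number of cysteines
        "even": n_cys > 0 and n_cys % 2 == 0,
    }


# Hypothetical toy sequence, not a real protein
toy = "MKCAAGCLLPCTTWC"
profile = cysteine_profile(toy)
print(profile["cysteines"], profile["even"])  # prints: 4 True
```

Running the same function over every ORF in a genome gives the kind of cysteine-content survey mentioned in the text, although the published analyses (15, 16) used their own, more elaborate methods.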

Extra-cytoplasmic location of protein

Currently, all known prokaryotic and eukaryotic disulfide bond forming machinery is localized to compartments outside the cytoplasm. A protein which requires disulfide bonds for its folding will therefore require a secretion signal. There are several web-based algorithms which can predict the presence of a signal peptide to high levels of confidence (17, 18). It is therefore possible to assess whether the destination of the protein is the ER or the periplasm, where disulfide bond formation can occur. However, care must be taken as not all periplasms are oxidative.

For example, the periplasm of the anaerobe Bacteroides is predicted to be a reducing compartment (19, 20). Taken together, a protein predicted to have a signal peptide and an even number of conserved cysteines is a likely candidate to require disulfide bonds in order to achieve its final folded state.
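The two clues can be combined into a simple screen. The sketch below assumes the researcher already has a multiple sequence alignment of homologs (equal-length strings with `-` for gaps) and a yes/no signal peptide call obtained from a separate prediction tool. The function names and the toy alignment are hypothetical, and the rule "at least two conserved cysteines, in an even number" is one possible reading of the text rather than an established threshold.

```python
def conserved_cysteine_columns(aligned: list[str]) -> list[int]:
    """Return 0-based alignment columns where every sequence has a cysteine."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(s) != width for s in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i for i in range(width)
        if all(s[i].upper() == "C" for s in aligned)
    ]


def likely_needs_disulfides(aligned: list[str], has_signal_peptide: bool) -> bool:
    """Heuristic screen: signal peptide plus an even number (>= 2) of
    fully conserved cysteines. The signal peptide call is an input here;
    it must come from an external predictor."""
    n_conserved = len(conserved_cysteine_columns(aligned))
    return has_signal_peptide and n_conserved >= 2 and n_conserved % 2 == 0


# Hypothetical toy alignment of three homologs
alignment = [
    "MKCA-GCLLP",
    "MRCAAGCIMP",
    "MKCS-GCLCP",
]
print(conserved_cysteine_columns(alignment))                       # prints: [2, 6]
print(likely_needs_disulfides(alignment, has_signal_peptide=True))  # prints: True
```

A positive result only suggests trying an oxidative compartment first; as noted above, not every periplasm is oxidizing, and the screen says nothing about which role the disulfide bonds play.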

When expressing a disulfide bonded protein, the nature of the disulfide bonds must be considered. Although it is difficult to predict which role a disulfide bond plays in a protein, it is generally advisable to express the protein in a compartment that promotes the formation of stable disulfide bonds. This is because structural disulfide bonds are important for protein folding, whereas signaling and catalytic disulfide bonds play a much smaller role in correct folding. However, signaling and catalytic disulfide bonds are biologically active and may have toxic effects on the host strain when over-expressed. In these cases, expressing the protein in an alternate compartment (periplasmic or cytoplasmic) or a different redox environment (aerobic versus anaerobic) should be considered.


The cytoplasm of E. coli is not permissive to the formation of stable disulfide bonds due to the presence of numerous reductases which efficiently reduce any disulfide bond formed. Disulfide-bond formation is therefore compartmentalized extra-cytoplasmically to the periplasm (see Figure 1). A set of cell envelope proteins (named Dsb for disulfide bond) which are responsible for the formation and correction of disulfide bonds has been studied in great detail in the last two decades and several comprehensive reviews have been written on this subject (21–24). This section will attempt to give a brief summary of these findings.

Figure 1

Periplasmic disulfide bond formation

In E. coli, disulfide bond formation is catalyzed by the periplasmic oxidase DsbA. DsbA is a monomeric 21 kDa protein containing the classic thioredoxin fold (25) and a single catalytic disulfide bond in its active site, Cys-Pro-His-Cys (26). Upon donating its active site disulfide bond to a reduced substrate protein, DsbA becomes reduced and is re-activated to its oxidized state by the inner membrane protein DsbB (27). DsbB transfers the electrons it has received from DsbA to the pool of quinones within the inner membrane (28). Of the disulfide bond oxidases studied, DsbA is one of the most efficient, capable of quickly oxidizing a protein as it enters the periplasm (29, 30). This can result in mis-oxidation of substrate proteins, especially if the protein contains multiple non-consecutive disulfide bonds (31). A mis-oxidized and therefore misfolded protein can be proteolytically degraded and removed by the periplasmic protease DegP (32). However, this is an energetically expensive solution. A more elegant solution is to correct the misfolded protein by isomerizing its disulfide bonds until the protein achieves its correctly oxidized state.

Disulfide bond isomerization is critical to the efficient folding of non-consecutive, multidisulfide bonded proteins. Close to half a century has passed since the discovery of the eukaryotic protein disulfide bond isomerase (PDI) (33). Surprisingly, the in-vivo mechanism of disulfide bond isomerization remains elusive. PDI remains to be shown as a disulfide bond isomerase in vivo and the exact mechanism of disulfide-bond isomerization by PDI or its prokaryotic functional homolog DsbC has yet to be understood clearly.

DsbC is a periplasmic homo-dimeric disulfide bond isomerase, capable of converting a mis-oxidized protein back to its correctly oxidized state, both in vitro and in vivo (31, 34–38). The crystal structure of DsbC shows a "V" shaped protein, where each arm of the V is a single monomer (see Figure 2) (39). Each monomer consists of a thioredoxin domain and a dimerization domain, joined together by a short alpha-helical linker domain. Each monomer of DsbC has four cysteines. The carboxyl-terminal cysteine pair forms a structural disulfide bond which is important for the folding and stability of DsbC (40). The amino-terminal cysteine pair is part of the active site CxxC motif and is maintained in its reduced state by the inner membrane protein DsbD (41).

Figure 2

DsbD receives its reducing potential from the cytoplasmic pool of NADPH via the thioredoxin pathway (42). The dimerization of the two monomers of DsbC results in a hydrophobic cleft 38 Å wide.

Together with the flexibility conferred by the linker domain, the uncharged cleft should be able to accommodate a large set of proteins or misfolded domains. It has been assumed that this hydrophobic cleft is responsible for selectively interacting with misfolded proteins whose hydrophobic core is exposed due to mis-oxidation. This notion is further supported by the ability of DsbC to assist in the folding of fully denatured, non-disulfide bonded D-glyceraldehyde-3-phosphate dehydrogenase (GAPDH) (43). This chaperone property of DsbC is dependent on the dimerization of DsbC (44) and is independent of its redox-active cysteines. However, no direct evidence has thus far been produced to show the role of the hydrophobic cleft in the ability of DsbC to bind and refold misfolded proteins.
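The electron flows described in this section can be summarized as a small lookup table. The sketch below is a deliberately simplified picture: each carrier is given a single electron acceptor, the explicit thioredoxin step between thioredoxin reductase and DsbD is one reading of "via the thioredoxin pathway", and the names `ELECTRON_FLOW` and `trace` are hypothetical.

```python
# Each key passes electrons to its value (donor -> acceptor).
# Simplified from the text; real pathways branch and have more partners.
ELECTRON_FLOW = {
    # Oxidative pathway: electrons leave the substrate as its disulfide forms
    "reduced substrate protein": "DsbA",
    "DsbA": "DsbB",
    "DsbB": "quinones",
    # Isomerization pathway: DsbC is kept reduced by cytoplasmic NADPH
    "NADPH": "thioredoxin reductase",
    "thioredoxin reductase": "thioredoxin",
    "thioredoxin": "DsbD",
    "DsbD": "DsbC",
}


def trace(start: str) -> list[str]:
    """Follow electrons from a starting donor to the last listed acceptor."""
    path = [start]
    while path[-1] in ELECTRON_FLOW:
        nxt = ELECTRON_FLOW[path[-1]]
        if nxt in path:  # guard against accidental cycles in the table
            raise ValueError(f"cycle detected at {nxt!r}")
        path.append(nxt)
    return path


print(" -> ".join(trace("reduced substrate protein")))
print(" -> ".join(trace("NADPH")))
```

The two traces make the division of labor visible: the oxidative branch drains electrons away from substrate proteins into the membrane quinone pool, while the isomerization branch carries electrons from the cytoplasm into the periplasm to keep DsbC reduced.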

The periplasm of E. coli is ill adapted to high-level expression of multidisulfide bonded proteins, for two major reasons. First, because the disulfide bond machinery is localized to the periplasm, the over-expressed protein needs to be efficiently exported, usually via the Sec system. The expressed protein must therefore be maintained in its secretion-competent unfolded state, either naturally or with the assistance of SecB. However, the Sec apparatus is not adapted for the export of a highly over-expressed protein. Over-expression usually clogs the Sec apparatus, which can lead to toxicity and commonly results in low yields. Second, the periplasmic disulfide bond forming machinery of E. coli is not adapted to folding multidisulfide bonded proteins. Of the ~1500 proteins predicted to be exported in E. coli, the significant majority (>85%) have only 0–2 cysteines, and only 4% have more than 6 cysteines. In those multi-cysteine proteins, most of the cysteines are involved in coordinating cofactors, such as hemes in cytochromes or iron in iron-sulfur cluster proteins. The inability of E. coli to correctly oxidize multidisulfide bonded proteins in the periplasm becomes apparent when eukaryotic proteins with multiple nonconsecutive disulfide bonds are over-expressed. For example, yields of tissue plasminogen activator (tPA) are detectable only when DsbC is over-expressed (45).

Furthermore, the periplasm is devoid of ATP and thus lacks the ATP-driven chaperones present in the cytoplasm. Compared with the cytoplasm, the small, energy-poor periplasmic compartment is therefore not ideal for expressing and folding proteins to high yields.
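The cysteine-count statistics quoted above for exported proteins are the kind of figure that can be recomputed for any set of sequences. The sketch below bins sequences by cysteine count; the bin labels follow the text (0–2 and more than 6), with a middle bin added for completeness. The function name and the toy sequences are hypothetical, and the input is assumed to be a list of already-selected exported proteins.

```python
from collections import Counter


def cysteine_bins(sequences: list[str]) -> dict[str, float]:
    """Return the fraction of sequences with 0-2, 3-6, and >6 cysteines."""
    labels = ("0-2", "3-6", ">6")
    if not sequences:
        return {label: 0.0 for label in labels}
    counts = Counter()
    for seq in sequences:
        n_cys = seq.upper().count("C")
        if n_cys <= 2:
            counts["0-2"] += 1
        elif n_cys <= 6:
            counts["3-6"] += 1
        else:
            counts[">6"] += 1
    return {label: counts[label] / len(sequences) for label in labels}


# Hypothetical toy sequences with 0, 2, 4, and 7 cysteines
toy_exported = ["MKAAL", "MCKKC", "MCCCCA", "CCCCCCCA"]
print(cysteine_bins(toy_exported))
# prints: {'0-2': 0.5, '3-6': 0.25, '>6': 0.25}
```

Applied to the native exported proteome and to a target eukaryotic protein, the comparison shows at a glance how far the target lies outside what the periplasmic machinery normally handles.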

Cytoplasmic disulfide bond formation

The formation of disulfide bonds is strongly disfavored in the cytoplasm of E. coli (46). This is mainly due to the presence of two reductive pathways, the glutaredoxin and the thioredoxin pathways (see Figure 3) (10). These two pathways maintain a set of reductases in their reduced state, which in turn maintain the cysteines of their substrate proteins in the reduced state. The reducing potential of the two pathways is received from the cytoplasmic pool of NADPH. NADPH donates its electrons to thioredoxin reductase (trxB) and glutathione reductase (gor), which ultimately transfer the electrons to a set of reductases. It is nevertheless possible to form stable disulfide bonds in the cytoplasm. To do so, the reducing power of the glutaredoxin and thioredoxin pathways needs to be diminished, which can be achieved by knocking out the gor and trxB genes (10, 47). However, trxB gor double-mutant cells are nonviable because at least one essential protein (ribonucleotide reductase) needs to be maintained in a reduced state (48). Cell viability can be restored by selecting for a suppressor of the lethal phenotype. Such a suppressor was selected and the locus was mapped to a peroxidase named AhpC, which had mutated to lose its function as a peroxidase but had gained the capacity to act as a disulfide bond reductase (49). It was eventually shown that the mutant AhpC* was able to complement the defect in the glutaredoxin pathway by reducing glutathionylated glutaredoxins, which provided sufficient reducing power for the cells to regain viability (50).

Figure 3

However, in the absence of thioredoxin reductase, the two thioredoxins in E. coli accumulate in their oxidized forms, enabling them to act as catalysts of disulfide bond formation, in a reversal of their normal function (51). The resulting strain, FÅ113, had the remarkable capacity to form stable disulfide bonds in its cytoplasm (52). FÅ113 was eventually made commercially available under the name Origami (Novagen) and has been used to express numerous disulfide bonded proteins (53–58). In some cases, expression of disulfide bonded proteins in the cytoplasm of Origami was exceptionally successful. For example, the yield of collagen prolyl 4-hydroxylase was ~20 times higher in the cytoplasm of Origami than in the periplasm of the corresponding BL21 wild type strain and 10 times higher than when expressed in insect cells (59). However, the efficient formation of correct disulfide bonds is limited in this strain (52).

To enhance disulfide bond isomerization in the cytoplasm of trxB/gor strains, a new protein expression strain was recently engineered and is available under the name SHuffle (New England Biolabs, Ipswich, MA). This strain encodes a cytoplasmic copy of dsbC, expressed from the strong ribosomal promoter rrnB. Even though DsbC is overexpressed in an oxidative cytoplasm, it is found predominantly in its hemi-reduced active state (data not shown). As this strain has been developed only recently, there is currently only one publication using it (60). For certain disulfide bonded proteins, the disulfide bond isomerase activity of DsbC is essential for their folding. This is the case for the three-cysteine-containing chitinase from Plasmodium falciparum (see Figure 4). Both E. coli K12 and B versions of this strain were engineered, and empirical evidence suggests that the B versions are generally better at producing proteins than the K12 versions.
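The genetic logic behind these strains can be written down as a small decision function. The sketch below covers only the cases described in this section: wild type, the trxB gor double mutant, the double mutant carrying the ahpC* suppressor (as in FÅ113 and Origami), and the same background with a cytoplasmic copy of dsbC (as in SHuffle). The class and function names are hypothetical, single mutants are deliberately rejected because the text does not describe them, and real phenotypes depend on many factors this toy model ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genotype:
    trxB_deleted: bool = False      # thioredoxin reductase knocked out
    gor_deleted: bool = False       # glutathione reductase knocked out
    ahpC_suppressor: bool = False   # ahpC* suppressor mutation present
    cytoplasmic_dsbC: bool = False  # cytoplasmic copy of dsbC present


def describe(g: Genotype) -> dict:
    """Summarize viability and cytoplasmic redox state for a genotype."""
    if g.trxB_deleted != g.gor_deleted:
        # Single mutants are outside the scope of this sketch
        raise ValueError("single trxB or gor mutants are not modeled here")
    double_mutant = g.trxB_deleted and g.gor_deleted
    # The double mutant is nonviable unless the ahpC* suppressor is present
    viable = (not double_mutant) or g.ahpC_suppressor
    return {
        "viable": viable,
        "oxidizing_cytoplasm": double_mutant and viable,
        "cytoplasmic_isomerase": g.cytoplasmic_dsbC,
    }


print(describe(Genotype()))
print(describe(Genotype(trxB_deleted=True, gor_deleted=True)))
print(describe(Genotype(trxB_deleted=True, gor_deleted=True, ahpC_suppressor=True)))
print(describe(Genotype(trxB_deleted=True, gor_deleted=True,
                        ahpC_suppressor=True, cytoplasmic_dsbC=True)))
```

The last two genotypes differ only in the isomerase flag, which mirrors the point made above: an oxidizing cytoplasm permits disulfide bond formation, while the added DsbC addresses the separate problem of forming the correct bonds.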

Figure 4


It has been two decades since the discovery of enzymes responsible for the formation of disulfide bonds in E. coli (61). Although great progress in understanding the molecular mechanism of disulfide bond formation has been made, comparatively little progress has been achieved in engineering novel strains which can correctly express multidisulfide bonded proteins. This lack of progress is mainly due to the fact that a given solution for expressing a recombinant protein is usually not transferable to the next protein. Thus, for each protein the researcher must start from scratch to find the suitable expression strain and condition.

One bottleneck for the researcher is the limited number of strains available to express disulfide bonded proteins. Those few new strains along with tools and techniques to assist the researcher in expressing disulfide bonded proteins have been reviewed here. The author hopes that increased molecular understanding of disulfide bond formation will result in an increasing repertoire of novel strains that are capable of producing active soluble recombinant proteins.

MEHMET BERKMEN, PHD, is a staff scientist at New England Biolabs, 240 County RD, Ipswich, MA 01938-2723, tel. 978.380.7519, berkmen@neb.com.


References

1. F. Baneyx and M. Mujacic, Nat. Biotechnol. 22 (11) 1399–408 (2004).

2. M.J. Kerner et al., Cell 122 (2) 209–20 (2005).

3. J.W. Wong, S.Y. Ho, and P.J. Hogg, Mol. Biol. Evol. 2010.

4. T. Zhang, E. Bertelsen, and T. Alber, Nat. Struct. Biol. 1 (7) 434–8 (1994).

5. O.R. Siadat, BMC Biochem 7 p. 12 (2006).

6. L. Debarbieux and J. Beckwith, Jrnl. Bacteriol. 182 (3) 723–7 (2000).

7. C. Lee et al., Nat. Struct. Mol. Biol. 11 (12) 1179–85 (2004).

8. A.F. Alvarez and D. Georgellis, Methods Enzymol. 471 p. 205–226 (2010).

9. J.G. Kang et al., Embo. Jrnl. 18 (15) 4292–8 (1999).

10. W.A. Prinz et al., Jrnl. Biol. Chem. 272 (25) 15661–7 (1997).

11. H. Kadokura and J. Beckwith, Embo. Jrnl. 21 (10) 2354–63 (2002).

12. H.H. Lin and L.Y. Tseng, Nucleic Acids Res., 38 p. W503–7 (2010).

13. R. Singh, Brief Funct. Genomic Proteomic. 7 (2) 157–72 (2008).

14. F. Ferre and P. Clote, Nucleic Acids Res., 33 p. W230–2 (2005).

15. M. Beeby et al., PLoS Biol. 3 (9) e309 (2005).

16. R.J. Dutton et al., Proc. Natl. Acad. Sci. USA 105 (33) 11933–8 (2008).

17. O. Emanuelsson et al., Nat. Protoc. 2 (4) 953–71 (2007).

18. L. Käll, A. Krogh, and E.L. Sonnhammer, Nucleic Acids Res., 35 p. W429–32 (2007).

19. M.A. Reott et al., Jrnl. Bacteriol. 191 (10) 3384–91 (2009).

20. S.R. Shouldice et al., Mol. Microbiol. 75 (1) 13–28 (2010).

21. M. Berkmen, D. Boyd, and J. Beckwith, "Disulfide Bond Formation in the Periplasm," in The Periplasm, M. Ehrmann, Ed. (ASM Press, Washington, DC, 2006). pp. 122–140.

22. F. Hatahet and L.W. Ruddock, Antioxid. Redox Signal. 11 (11) 2807–50 (2009).

23. M. Depuydt, J. Messens, and J.F. Collet, Antioxid. Redox Signal. 15 (1) 49–66 (2011).

24. H. Kadokura and J. Beckwith, Antioxid. Redox Signal, 2010.

25. J.L. Pan and J.C. Bardwell, Protein Sci. 15 (10) 2217–27 (2006).

26. U. Grauschopf et al., Cell 83 (6) 947–55 (1995).

27. G. Jander, N.L. Martin, and J. Beckwith, Embo. Jrnl. 13 (21) 5121–7 (1994).

28. M. Bader et al., Cell 98 (2) 217–27 (1999).

29. M. Huber-Wunderlich and R. Glockshuber, Fold. Des. 3 (3) 161–71 (1998).

30. H. Kadokura and J. Beckwith, Cell 138 (6) 1164–73 (2009).

31. M. Berkmen, D. Boyd, and J. Beckwith, Jrnl. Biol. Chem. 280 (12) 11387–94 (2005).

32. O. Subrini and J.M. Betton, FEMS Microbiol Lett. 296 (2) 143–8 (2009).

33. R.F. Goldberger, C.J. Epstein, and C.B. Anfinsen, Jrnl. Biol. Chem. 238 p. 628–35 (1963).

34. A. Zapun et al., Biochemistry 34 (15) 5075–89 (1995).

35. J.F. Collet et al., Jrnl. Biol. Chem. 277 (30) 26886–92 (2002).

36. A. Hiniker and J.C. Bardwell, Jrnl. Biol. Chem. 279 (13) 12967–73 (2004).

37. V.E. Shevchik et al., Mol. Microbiol. 16 (4) 745–53 (1995).

38. A. Rietsch et al., Proc. Natl. Acad. Sci. USA 93 (23) 13048–53 (1996).

39. A.A. McCarthy et al., Nat. Struct. Biol. 7 (3) 196–9 (2000).

40. X. Liu and C.C. Wang, Jrnl. Biol. Chem. 276 (2) 1146–51 (2001).

41. A. Rietsch et al., Jrnl. Bacteriol. 179 (21) 6602–8 (1997).

42. E.J. Stewart, F. Katzen, and J. Beckwith, Embo. Jrnl. 18 (21) 5963–71 (1999).

43. J. Chen et al., Jrnl. Biol. Chem. 274 (28) 19601–5 (1999).

44. X.X. Sun and C.C. Wang, Jrnl. Biol. Chem. 275 (30) 22743–9 (2000).

45. J. Qiu, J.R. Swartz, and G. Georgiou, Appl. Environ. Microbiol. 64 (12) 4891–6 (1998).

46. A. I. Derman and J. Beckwith, Jrnl. Bacteriol. 173 (23) 7719–22 (1991).

47. A.I. Derman et al., Science 262 (5140) 1744–7 (1993).

48. S. Gon, M.J. Faulkner, and J. Beckwith, Antioxid. Redox Signal. 8 (5–6) 735–742 (2006).

49. D. Ritz et al., Science 294 (5540) 158–60 (2001).

50. Y. Yamamoto et al., Mol. Cell 29 (1) 36–45 (2008).

51. E.J. Stewart, F. Aslund, and J. Beckwith, Embo. Jrnl. 17 (19) 5543–50 (1998).

52. P.H. Bessette et al., Proc. Natl. Acad. Sci. USA 96 (24) 13703–8 (1999).

53. M. Kumano-Kuramochi et al., Jrnl. Biochem. 143 (2) 229–36 (2008).

54. C. Drees et al., Protein Express. Purif. 59 (1) 47–54 (2008).

55. Y. Xu, D. Lewis, and C.P. Chou, Appl. Microbiol. Biotechnol. 79 (6) 1035–44 (2008).

56. M.W. Larsen, U.T. Bornscheuer, and K. Hult, Protein Express. Purif. 62 (1) 90–7 (2008).

57. M. Kaomek et al., Biosci. Biotechnol. Biochem. 67 (4) 667–76 (2003).

58. M. Bar et al., Protein Express. Purif. 48 (2) 243–52 (2006).

59. A. Neubauer, P. Neubauer, and J. Myllyharju, Matrix Biol. 24 (1) 59–68 (2005).

60. A.R. Tait and S.K. Straus, Microb. Cell. Fact. 10 p. 51 (2011).

61. J.C. Bardwell, K. McGovern, and J. Beckwith, Cell 67 (3) 581–9 (1991).