Predicting disulfide bonds
When expressing a protein, it is crucial to know whether the protein requires disulfide bonds in order to achieve its final
correctly folded state. However, predicting the presence of disulfide bonds is not easy. Although there have been attempts
at building computer algorithms which claim to predict the presence of disulfide bonds, currently no such service is in popular
use (12–14). Instead, there are two important biological clues which can assist the researcher in predicting whether their
protein of interest should be expressed in an oxidative environment which promotes the formation of disulfide bonds.
Numbers and conservation of cysteines
Due to its chemical reactivity, cysteine is one of the rarest amino acids used within polypeptides. A cysteine within
a protein is therefore less likely to be due to random mutation and more likely to have been selected for its function. This is usually the case
for cysteines which are highly conserved, especially if the conservation extends across proteins from diverse organisms. Thus,
highly conserved cysteines are indicative of important function and may require the protein to be expressed in an oxidative
environment. Furthermore, structural disulfide bonds form between pairs of cysteines, resulting in a strong tendency for cysteines
to occur in even numbers rather than odd numbers. This phenomenon was used to analyze the cysteine content
of various prokaryotic and archaeal genomes, which allowed for predictions about the redox state of their proteomes (15, 16). Thus,
an even number of cysteines is a signature of a disulfide-bonded protein.
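The two cysteine-based clues can be checked with a few lines of code. The following is a minimal sketch, not a validated predictor: the function names are hypothetical, the sequences are assumed to be in one-letter amino acid code, and the conservation check assumes the homologs have already been aligned to equal length with '-' marking gaps.

```python
from typing import List


def count_cysteines(sequence: str) -> int:
    """Count cysteine residues (one-letter code C) in a protein sequence."""
    return sequence.upper().count("C")


def has_even_cysteines(sequence: str) -> bool:
    """Return True if the sequence has a non-zero, even number of cysteines."""
    n = count_cysteines(sequence)
    return n > 0 and n % 2 == 0


def conserved_cysteine_columns(aligned: List[str]) -> List[int]:
    """Return 0-based alignment columns in which every sequence has a cysteine.

    Assumption: the sequences are pre-aligned and of equal length.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    return [
        col
        for col in range(length)
        if all(seq[col].upper() == "C" for seq in aligned)
    ]


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not real proteins.
    alignment = ["ACD-C", "GCDEC", "ACNQC"]
    print(conserved_cysteine_columns(alignment))
    print(has_even_cysteines(alignment[0]))
```

Requiring a cysteine in every aligned sequence is the strictest possible definition of conservation; with many or distant homologs, a fractional threshold may be more appropriate.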
Extra-cytoplasmic location of protein
Currently, all known prokaryotic and eukaryotic disulfide bond forming machinery is localized to compartments outside the
cytoplasm. A protein which requires disulfide bonds for its folding will therefore require a secretion signal. There are several
web-based algorithms which can predict the presence of a signal peptide to high levels of confidence (17, 18). It is therefore
possible to assess whether the destination of the protein is the ER or the periplasm, where disulfide bond formation can occur.
However, care must be taken, as not all periplasms are oxidative. For example, the periplasm of the anaerobic Bacteroides is
predicted to be a reducing compartment (19, 20). Taken together, a protein predicted to have a signal peptide and an even
number of conserved cysteines is a likely candidate to require disulfide bonds in order to achieve its final folded state.
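The combined rule of thumb can be written down as a simple decision function. This is only an illustrative sketch: the class and field names are hypothetical, the signal peptide result is assumed to come from an external prediction tool, and the flag for an oxidizing destination compartment is an added assumption reflecting the caveat about reducing periplasms.

```python
from dataclasses import dataclass


@dataclass
class DisulfideClues:
    """Evidence gathered for one protein (all fields supplied by the user)."""

    has_signal_peptide: bool        # output of an external signal peptide predictor
    cysteine_count: int             # total cysteines in the mature sequence
    conserved_cysteine_count: int   # cysteines conserved across homologs
    oxidizing_compartment: bool = True  # False for e.g. a periplasm predicted to be reducing


def likely_requires_disulfides(clues: DisulfideClues) -> bool:
    """Apply the heuristic: signal peptide plus an even number of conserved cysteines.

    This is a crude screen, not a prediction of which cysteines are bonded.
    """
    even_total = clues.cysteine_count >= 2 and clues.cysteine_count % 2 == 0
    even_conserved = (
        clues.conserved_cysteine_count >= 2
        and clues.conserved_cysteine_count % 2 == 0
    )
    return (
        clues.has_signal_peptide
        and clues.oxidizing_compartment
        and even_total
        and even_conserved
    )


if __name__ == "__main__":
    # Hypothetical secreted protein with four cysteines, all conserved.
    candidate = DisulfideClues(
        has_signal_peptide=True,
        cysteine_count=4,
        conserved_cysteine_count=4,
    )
    print(likely_requires_disulfides(candidate))
```

A negative result from this screen does not rule out disulfide bonds; it only means the two clues described above do not both point toward them.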
When expressing a disulfide-bonded protein, the nature of the disulfide bonds must be considered. Although it is difficult
to predict which role a disulfide bond plays in the protein, expressing the protein in a compartment which promotes
the formation of stable disulfide bonds is generally recommended. This recommendation reflects the importance of structural disulfide bonds
in the folding of proteins, whereas signaling and catalytic disulfide bonds play a much smaller role in the correct folding of
proteins. However, signaling and catalytic disulfide bonds are biologically active and may have toxic effects on the host
strain when over-expressed. In these cases, expressing the protein in an alternate compartment (periplasmic or cytoplasmic)
or a different redox environment (aerobic versus anaerobic) should be considered.