Fusion Tags for Protein Expression and Purification

BioPharm International, BioPharm International-06-02-2008, Volume 2008 Supplement, Issue 5

Gene fusion tags can improve the yield and solubility of many recombinant proteins. This article discusses the most popular fusion tags and the proteases used to remove them, with special reference to recently introduced technologies.


When large quantities of soluble protein must be expressed and purified, common difficulties include poor yield and the formation of insoluble aggregates. Gene fusion technologies can overcome these obstacles by simplifying purification and improving solubility. This article discusses the most popular fusion tags and the enzymes used to remove them, with special reference to recently introduced technologies.

Recombinant proteins vary widely in their expression, solubility, stability, and functionality, making them difficult targets for large-scale analyses and production. Advances in recombinant protein expression include better expression systems and host strains, improved mRNA stability, host-specific codon optimization, the use of secretory pathways, post-translational modification, co-expression with chaperones, and reduced proteolytic degradation. However, no other technology has been as effective in improving the expression, solubility, and production of biologically active proteins as the addition of fusion tags, especially for difficult-to-express proteins.


Genetically engineered fusion tags allow the purification of virtually any protein without any prior knowledge of its biochemical properties.1–2 They can improve the variable yield and poor solubility of many recombinant proteins. Proper design and judicious use of the right fusion tag can enhance the solubility and promote proper folding of the protein of interest, leading to recovery of more functional protein. On the other hand, adding fusion tags has been reported to result in changes in protein conformation, poor yields, loss or alteration of biological activity, and toxicity of the target protein. For this reason, it is desirable to remove the tag from the target protein after expression. When designing a fusion tag, therefore, careful consideration must be given to how the tag will be removed to produce native proteins without any extraneous sequences. Cleavage of the tags and the proteases that are used to cleave the tag are discussed in the next section.

Many fusion tags are available for the expression and purification of proteins (Table 1). These tags can be broadly classified into two categories: affinity tags that aid in purification but do not enhance the solubility of the proteins substantially, and solubility-enhancing tags that specifically enhance the solubility and recovery of functional proteins.

Table 1. Commonly used fusion tags for purification and enhancing the solubility of proteins

Affinity Tags

Affinity tags are the most commonly used tags for aiding protein purification. They can be defined as exogenous amino acid (aa) sequences that bind with high affinity to a chemical ligand or an antibody. Most affinity tags are short peptide sequences that either bind to a ligand linked to a solid support (like the His tag) or contain an epitope recognized by immobilized antibodies (like the FLAG or Myc tags). The high affinity of these tags for their ligands and the availability of well-developed immobilized supports for capturing the fusion proteins allow the protein of interest to be purified to a very high degree. Because of their small size, these affinity tags can be added at either end of the protein or in a surface-exposed region. However, these tags generally do not increase the expression of the fusion proteins or enhance their solubility, and therefore are of little use in purifying hard-to-express proteins.

His tags are the most widely used affinity tags. The purification of His-tagged proteins is based on the use of a chelated metal ion as an affinity ligand; one commonly used ligand is the immobilized nickel-nitrilotriacetic acid chelate (Ni–NTA), which is bound by the imidazole side chains of the histidine residues. Similarly, proteins carrying Strep-tag II, a streptavidin-recognizing octapeptide (WSHPQFEK), can be affinity purified on a matrix carrying a modified streptavidin and eluted with a biotin analog. Proteins fused to other commonly used affinity tags, like FLAG, Myc, and HA, can be purified by binding to the respective antibodies immobilized on chromatographic supports.
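As a rough illustration of how short these tags are at the sequence level, the sketch below appends a tag to a target sequence and locates known tags in a construct. Only the Strep-tag II sequence comes from the text above; the His6, FLAG, Myc, and HA sequences are the commonly cited ones and are assumptions here, as are the function names and the example target sequence.

```python
# Illustrative affinity tag sequences in one-letter amino acid code.
# Strep-tag II is quoted in the article; the other sequences are the
# commonly cited ones and are assumptions for this sketch.
AFFINITY_TAGS = {
    "His6": "HHHHHH",
    "Strep-tag II": "WSHPQFEK",
    "FLAG": "DYKDDDDK",
    "Myc": "EQKLISEEDL",
    "HA": "YPYDVPDYA",
}


def add_tag(protein, tag_name, terminus="C"):
    """Return the protein sequence with the named tag fused at the N or C terminus."""
    tag = AFFINITY_TAGS[tag_name]
    if terminus == "N":
        return tag + protein
    if terminus == "C":
        return protein + tag
    raise ValueError("terminus must be 'N' or 'C'")


def find_tags(sequence):
    """Return {tag name: 0-based start index} for every known tag present."""
    return {
        name: sequence.find(tag)
        for name, tag in AFFINITY_TAGS.items()
        if tag in sequence
    }


# Hypothetical 10-residue target, used only for illustration.
target = "MKTAYIAKQR"
construct = add_tag(target, "His6", terminus="C")
print(construct)
print(find_tags(construct))
```

A real construct would also carry a linker and, if the tag is to be removed, a protease recognition site between tag and target; both are omitted here for brevity.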


Because it is desirable to remove most tags at the end of the purification process, considerable advances have been made in design of affinity tags so that they can be cleaved without leaving any residues behind and also to simplify the entire process of purification and cleavage. One such system is the "Profinity eXact" fusion-tag system (Bio-Rad, Hercules, CA), which uses an immobilized subtilisin protease to carry out affinity binding and tag cleavage. The protease is not only involved with the binding and recognition of the tag, but upon application of the elution buffer, it also serves to precisely cleave the tag from the fusion protein directly after the cleavage recognition sequence. This delivers a native, tag-free protein in a single step. Another system for simple purification of proteins is based on elastin-like polypeptides (ELP) and intein. ELP consist of several repeats of a peptide motif that undergo a reversible transition from soluble to insoluble upon temperature upshift. The fusion protein is purified by temperature-induced aggregation and separation by centrifugation, and intein is used for tag removal.3 No affinity columns are needed for initial purification.

Solubility-Enhancing Tags

Solubility-enhancing tags are generally large peptides or proteins that increase the expression and solubility of fusion proteins. Fusion tags like GST and MBP also act as affinity tags and, as a result, are very popular for protein purification. Other fusion tags like NusA, thioredoxin (TRX), small ubiquitin-like modifier (SUMO), and ubiquitin (Ub), on the other hand, require additional affinity tags for use in protein purification.

No single fusion tag can increase the expression and solubility of all target proteins. However, some fusion tags have been more successful than others in increasing the solubility of many proteins. A comparison of some popular fusion tags showed that large proteins like NusA and MBP are more effective in solubilizing proteins than the smaller affinity tags or GST.4–7 Novel tags like Skp and T7 protein kinase, in turn, have been shown to be successful in expressing hard-to-express proteins in E. coli.8 Similarly, ubiquitin-based tags have been used to increase the solubility and expression level of proteins. SUMO tags are emerging as a viable alternative for increasing both the expression and solubility of otherwise hard-to-express proteins.9 The SUMO tag can be cleanly excised using SUMO protease, which recognizes the conformation of the SUMO protein rather than a specific sequence within SUMO. Initially, the SUMO system was confined to E. coli, because eukaryotes contain highly conserved SUMO proteases that cleave the SUMO tag from the fusion protein. However, the recently developed SUMOstar tag, a modified version of SUMO, is not recognized by the native eukaryotic proteases and is specifically cleaved by the genetically engineered SUMOstar protease. Thus, the SUMOstar system can be used effectively in both prokaryotic and eukaryotic systems. The usefulness of the SUMO system was substantiated by the study of Marblestone, et al., who examined the effects of various fusion partners on total and soluble expression yield.9–10 They evaluated the expression and solubility of three model proteins fused to the C terminus of MBP, GST, TRX, NusA, Ub, and SUMO tags. The tags were ranked in terms of increased total expression as

TRX > SUMO ~ NusA > Ub ~ MBP ~ GST

and increased soluble expression as

SUMO ~ NusA > Ub ~ GST ~ MBP ~ TRX.

Overall, SUMO and NusA were equally good in terms of increasing the expression and the solubility of fusion proteins. However, SUMO offers certain advantages over NusA: it is smaller, and it can be cleaved off precisely from the target protein without leaving behind any residues.
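The two rankings can be read as ordered tiers, with tags joined by "~" sharing a tier. The sketch below transcribes them that way and picks out the tags that never fall into the last tier of either ranking. The "never in the last tier" criterion is an assumption chosen for illustration, not a method from the cited study, and the names are hypothetical.

```python
# Tiers transcribed from the two rankings in the text, best tier first.
# Tags separated by "~" in the text share a tier.
TOTAL_EXPRESSION = [["TRX"], ["SUMO", "NusA"], ["Ub", "MBP", "GST"]]
SOLUBLE_EXPRESSION = [["SUMO", "NusA"], ["Ub", "GST", "MBP", "TRX"]]

ALL_TAGS = ["MBP", "GST", "TRX", "NusA", "Ub", "SUMO"]


def tier_of(tag, ranking):
    """Return the 1-based tier of a tag within a ranking."""
    for tier, tags in enumerate(ranking, start=1):
        if tag in tags:
            return tier
    raise KeyError(tag)


def in_last_tier(tag, ranking):
    """True if the tag sits in the lowest tier of the ranking."""
    return tag in ranking[-1]


# Illustrative criterion (an assumption): keep tags that avoid the
# last tier in both total and soluble expression.
consistent = [
    tag
    for tag in ALL_TAGS
    if not in_last_tier(tag, TOTAL_EXPRESSION)
    and not in_last_tier(tag, SOLUBLE_EXPRESSION)
]
print(consistent)
```

Under this criterion TRX drops out because, although it ranks first for total expression, it falls into the last tier for soluble expression.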

The Rainbow tag is yet another new development in tag fusion technology. This technology allows the continuous monitoring of correctly folded proteins throughout the process of expression and purification. The flavin mononucleotide (FMN)-binding domain of cytochrome P450 reductase (displaying a blue-green or yellow color, depending on the oxidation state of the FMN cofactor) and the red colored, heme-binding cytochrome b5 are used as tags, and the rainbow tags are visible with the naked eye.11 The use of rainbow tags, however, requires an additional affinity tag for purification.


An integral part of the choice of a fusion tag is the choice of the method for removing the tag after purification. This step almost always involves using a protease to cleave a specific peptide bond between the tag and the protein of interest. A small number of highly specific proteases are routinely used for this purpose and are listed in Table 2. These include the tobacco etch virus (TEV) protease; thrombin (factor IIa, fIIa) and factor Xa (fXa) from the blood coagulation cascade; enterokinase (EK), an enzyme that activates trypsinogen in the mammalian intestinal tract; the SUMO proteases (Ulp1, Senp2, and SUMOstar), which are involved in the maturation and deconjugation of SUMO; and a relative newcomer to the field, a mutated form of the Bacillus subtilis protease, subtilisin BPN' (Bio-Rad's Profinity eXact system). Many of these enzymes have been genetically engineered to enhance their stability (e.g., AcTEV, ProTEV) or their specificity (e.g., SUMOstar, Profinity). With the exception of the SUMO proteases, all of these enzymes have the potential to cleave within the protein of interest.12–13 The SUMO proteases recognize not only their specific cleavage site, xaa-Gly-Gly/yaa, but also the tertiary structure of SUMO itself, giving them a very high degree of specificity. Bryan, et al., have attempted to introduce the same level of specificity into the Profinity system by mutating both the subtilisin prodomain and the active site of subtilisin to increase the affinity of the enzyme for the prodomain and to decrease the likelihood of digestion within the protein of interest.14 One interesting consequence of this is that the affinity for the prodomain is so high that these researchers observed product inhibition of the enzyme. Essentially, the enzyme carries out one catalytic cycle and is then inhibited by the prodomain, which is retained in the active site, thus preventing further cleavage by this otherwise promiscuous enzyme.
Because capture on the immobilized, mutant subtilisin matrix is an integral part of the system, the column must have a capacity (in moles of subtilisin) equimolar with the fusion protein. Although this is not problematic on the research scale, it could become prohibitively expensive at the multigram scale.
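The equimolar requirement can be made concrete with a short calculation. The sketch below is a back-of-the-envelope comparison under stated assumptions: the 40 kDa fusion protein, the 10 g batch, and the 1:1000 molar enzyme-to-substrate ratio for a catalytic protease are hypothetical values chosen for illustration, not figures from any supplier.

```python
def micromoles(mass_g, mw_da):
    """Convert a protein mass in grams to micromoles, given its molecular weight in Da."""
    return mass_g / mw_da * 1e6


def protease_needed_umol(fusion_mass_g, fusion_mw_da, enzyme_to_substrate=1.0):
    """Micromoles of protease needed at a given molar enzyme-to-substrate ratio.

    A ratio of 1.0 models a single-turnover, equimolar system such as an
    immobilized enzyme that is inhibited after one catalytic cycle.
    """
    return micromoles(fusion_mass_g, fusion_mw_da) * enzyme_to_substrate


# Hypothetical batch: 10 g of a 40 kDa fusion protein.
batch_g, fusion_mw = 10.0, 40_000.0

equimolar = protease_needed_umol(batch_g, fusion_mw, enzyme_to_substrate=1.0)
catalytic = protease_needed_umol(batch_g, fusion_mw, enzyme_to_substrate=0.001)

print(f"fusion protein:        {micromoles(batch_g, fusion_mw):.1f} umol")
print(f"equimolar enzyme:      {equimolar:.1f} umol")
print(f"catalytic enzyme:      {catalytic:.2f} umol (assumed 1:1000 ratio)")
```

The thousand-fold difference in enzyme requirement in this hypothetical case is what drives the cost concern at the multigram scale.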

Table 2. Proteases commonly used for tag removal

The principal concerns with using a protease for removing a tag are 1) removing the protease following digestion, and 2) non-specific digestion of the target protein by the protease. Resolving the first concern is relatively straightforward, although in most cases it involves an additional chromatography step. Recombinant forms of TEV and its variants and of the SUMO proteases are all produced with a hexahistidine (His6) tag, allowing easy removal of the enzyme by metal chelate chromatography. Alternatively, some of these enzymes have been immobilized on solid supports, allowing their removal by simple filtration or centrifugation steps. Thrombin, fXa, and EK, which generally are produced from natural sources, can be removed by affinity chromatography, for instance, on benzamidine-agarose. With the Profinity system, cleavage and separation from the enzyme are combined in a single step.

The second concern is more difficult to resolve. Non-specific cleavage is influenced by a number of parameters, such as the enzyme-to-substrate ratio (lower is better), temperature, pH, salt concentration, and length of exposure. TEV protease, thrombin, fXa, and EK all have well-defined recognition sequences, but all of them have been found to cause "nicking" of the target protein in some instances. TEV protease has been re-engineered to try to increase its specificity (and stability), resulting in AcTEV (Invitrogen, Carlsbad, CA) and ProTEV (Promega, Madison, WI). Whether such engineering has reduced non-specific proteolysis remains to be seen. In addition, other tricks must be used with the native enzymes. For instance, one supplier recommends using fXa at pH 6.5, well below its pH optimum, to minimize non-specific cleavage. Of course, this requires the use of higher enzyme-to-substrate ratios and longer digestion times to achieve complete cleavage. Two of the enzymes listed (SUMO proteases and the Profinity enzyme) seem to be immune to this problem. SUMO proteases have evolved to recognize both the tertiary structure of SUMO and the cleavage sequence, xaa-Gly-Gly/yaa. The Profinity enzyme has been extensively mutated to derive a version that has very high affinity for the prodomain of the original enzyme. Thus, it also recognizes the tertiary structure of the prodomain as well as the cleavage sequence Phe-Met-Ala-Lys/yaa. On the other hand, SUMO proteases act catalytically (i.e., with a low enzyme-to-substrate ratio) whereas the Profinity enzyme requires equimolar concentrations of enzyme and substrate.
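One simple precaution before choosing a protease is to check whether the target protein itself contains a canonical recognition sequence. The sketch below does this with regular expressions. The motifs are the commonly cited consensus sites and are assumptions here, since Table 2 is not reproduced in this text; the target sequence is hypothetical. A clean result does not rule out nicking, which by definition occurs at non-canonical sites.

```python
import re

# Commonly cited consensus recognition motifs (assumptions for this sketch),
# written as regular expressions over one-letter amino acid codes.
PROTEASE_MOTIFS = {
    "TEV": r"ENLYFQ[GS]",      # cleaves between Q and G/S
    "thrombin": r"LVPRGS",     # cleaves between R and G
    "factor Xa": r"I[ED]GR",   # cleaves after R
    "enterokinase": r"DDDDK",  # cleaves after K
}


def internal_sites(target):
    """Return {protease: [0-based start positions]} for motifs found in the target."""
    hits = {}
    for name, pattern in PROTEASE_MOTIFS.items():
        positions = [match.start() for match in re.finditer(pattern, target)]
        if positions:
            hits[name] = positions
    return hits


# Hypothetical target containing a factor Xa site and a thrombin site.
target = "MKIEGRTTLVPRGSAA"
print(internal_sites(target))
```

A protease whose motif appears inside the target would cut the product as well as the tag junction, so in that case a different protease is the safer choice.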

One final consideration should be mentioned. Although one would ideally have a protein that is fully soluble in phosphate buffered saline at neutral pH, the reality is that for many proteins to be soluble at useful concentrations, they require more acidic or more basic pH levels, high or low salt levels, or the presence of chaotropes or detergents. It is therefore essential that the protease of choice retain substantial activity under adverse conditions. The most robust of the enzymes cited appear to be the SUMO proteases, the Profinity enzyme, and the TEV protease. Thrombin, fXa, and EK are much more sensitive to high salt concentrations or to the presence of chaotropes or reducing agents.


Because every protein is unique, no single tag or cleavage method will answer every need. For proteins that express well, the simplest affinity tags may be sufficient (e.g., His6, myc). For harder-to-express proteins, fusion partners that enhance folding and solubility are preferable (e.g., MBP, SUMO). Tag removal then adds another layer of complexity. When considering which tag to use, key questions should be asked. For example, can your application tolerate retention of the tag or of one or more amino acids at the cleavage site, or must tag removal leave no trace? Such questions are usually answered experimentally, but with the availability of solubility-enhancing tags paired with highly specific proteases that cleanly remove the tag, these questions may be moot.

Dattananda Chelur is a senior scientist, Onur Unal is a business development associate, Marc Scholtyssek is a sales manager, and James Strickler is vice president of research, all at LifeSensors, Inc., Malvern, PA, 610.644.6973, ext. 324, unal@lifesensors.com


1. Arnau J, Lauritzen C, Petersen GE, Pedersen J. Current strategies for the use of affinity tags and tag removal for the purification of recombinant proteins. Protein Expr Purif. 2006;48(1):1–13.

2. Esposito D, Chatterjee DK. Enhancement of soluble protein expression through the use of fusion tags. Curr Opin Biotechnol. 2006;17(4):353–8.

3. Banki MR, Feng L, Wood DW. Simple bioseparations using self-cleaving elastin-like polypeptide tags. Nat Methods. 2005;2(9):659–61.

4. Braun P, Hu Y, Shen B, Halleck A, Koundinya M, Harlow E, LaBaer J. Proteome-scale purification of human proteins from bacteria. Proc Natl Acad Sci. 2002;99(5):2654–9.

5. Hammarstrom M, Hellgren N, Van Den Berg S, Berglund H, Hard T. Rapid screening for improved solubility of small human proteins produced as fusion proteins in Escherichia coli. Protein Sci. 2002;11(2):313–21.

6. Niiranen L, Sigrun ES, Karlsen CR, Mustonen M, Paulsen SM, Heikinheimo P, Willassen NP. Comparative expression study to increase the solubility of cold adapted Vibrio proteins in Escherichia coli. Protein Expr Purif. 2007;52(1):210–18.

7. Shih YP, Kung WM, Chen JC, Yeh CH, Wang AH, Wang TF. High-throughput screening of soluble recombinant proteins. Protein Sci. 2002;11(7):1714–19.

8. Chatterjee DK, Esposito D. Enhanced soluble protein expression using two new fusion tags. Protein Expr Purif. 2006;46(1):122–9.

9. Butt TR, Edavettal SC, Hall JP, Mattern MR. SUMO fusion technology for difficult-to-express proteins. Protein Expr Purif. 2005;43(1):1–9.

10. Marblestone JG, Edavettal SC, Lim Y, Lim P, Zuo X, Butt TR. Comparison of SUMO fusion technology with traditional gene fusion systems: Enhanced expression and solubility with SUMO. Protein Sci. 2006;15(1):182–9.

11. Finn RD, Kapelioukh I, Paine MJ. Rainbow tags: A visual tag system for recombinant protein expression and purification. Biotechniques. 2005;38(3):387–92.

12. Jenny RJ, Mann KG, Lundblad RL. A critical review of the methods for cleavage of fusion proteins with thrombin and factor Xa. Protein Expr Purif. 2003;31(1):1–11.

13. Selwood T, Wang Z-M, McCaslin DR, Schechter NM. Diverse stability and catalytic properties of human tryptase alpha and beta isoforms are mediated by residue differences at the S1 pocket. Biochem. 2002;41:3329–40.

14. Ruan B, Fisher KE, Alexander PA, Doroshko V, Bryan PN. Engineering subtilisin into a fluoride-triggered processing protease useful for one-step protein purification. Biochem. 2004;43(46):14539–46.