Using NMR Spectroscopy to Obtain the Higher Order Structure of Biopharmaceutical Products

August 2, 2010
Yves Aubin|Christopher Jones|Darón I. Freedberg

BioPharm International

Volume 2010 Supplement, Issue 6

Simple methods can characterize polysaccharide vaccines and recombinant cytokines at high resolution.


The higher order structure of complex biological therapeutics such as polysaccharide-containing vaccines, recombinant proteins, and monoclonal antibodies is an important quality attribute of biopharmaceutical products. The relationship between higher order structure and product efficacy is a crucial issue in comparability studies, be they assessments of manufacturing changes or follow-on biologics. Several methods, such as circular dichroism, fluorescence, and bioassays, are typically used to assess structure, but none provides information at high resolution. Nuclear magnetic resonance (NMR) spectroscopy is a well-established technique for biomolecular structure determination, albeit underused in characterizing biotherapeutics. NMR is often misperceived as too expensive or complicated and is therefore excluded from the toolbox of methods used to assess higher order structure. In this paper, we show simple applications of NMR that provide detailed information on higher order structure. We also address the complexity and cost of including NMR in the process and quality control environments.

From the start, the quantitative characterization and analysis of biologics has been a daunting undertaking. This task is complicated by the varied nature of biologics and compounded because many analytical methods have not been well suited to characterize this complex set of products. Consequently, most pharmaceutical companies rely on bioassays to demonstrate consistency with material used in the clinical trials where efficacy was proven.


In this paper, we first show how nuclear magnetic resonance (NMR) has been used to successfully characterize many aspects of polysaccharide vaccines, then expand our scope to polysaccharide conjugate vaccines, and finally, indicate how NMR can be used in the characterization of complex biologics.

Carbohydrates

Polysaccharide Vaccines

NMR spectroscopy is a versatile method that can be used to characterize molecules at the atomic level. After its use in characterizing small molecules (MW ≤1,000), NMR spectroscopy was used to characterize polysaccharide vaccines. Despite their high molecular weights, many of these heterogeneous molecules display very sharp lines in their 13C and 1H NMR spectra. Because NMR spectroscopy can yield information at atomic resolution, the components of complex molecules can be identified and quantified. Presently, 1H NMR is used to provide identity, ascertain O-acetyl and N-acetyl proportions, determine water content, and monitor stability (through decomposition) of pneumococcal,1 meningococcal, and Haemophilus influenzae polysaccharide vaccines. Reliable quantification is easily accomplished as long as the fundamental principles of NMR, such as the guidelines of accurate data acquisition and nuclear relaxation, are followed.2 The dynamic range of NMR has increased, so impurities present even at the level of parts per billion can be detected.
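The role of nuclear relaxation in quantification can be illustrated with the standard mono-exponential model of longitudinal recovery. The following is a minimal sketch, assuming 90° excitation pulses; the T1 values and resonance names are hypothetical and are not taken from the spectra discussed here. It shows how a short recycle delay biases the integral ratio of two resonances with different T1, and how a delay of about five times the longest T1 largely removes the bias.

```python
import math

def recovered_fraction(recycle_delay_s: float, t1_s: float) -> float:
    """Fraction of equilibrium magnetization recovered between 90-degree pulses,
    assuming simple mono-exponential T1 relaxation."""
    return 1.0 - math.exp(-recycle_delay_s / t1_s)

# Hypothetical T1 values (seconds) for two resonances whose integrals are compared.
t1_values = {"acetyl_CH3": 1.0, "anomeric_H1": 2.0}

for delay in (2.0, 10.0):
    fractions = {name: recovered_fraction(delay, t1) for name, t1 in t1_values.items()}
    # The ratio of recovered fractions is the systematic bias in the integral ratio.
    bias = fractions["acetyl_CH3"] / fractions["anomeric_H1"]
    print(f"recycle delay {delay:4.1f} s: bias in integral ratio = {bias:.3f}")
```

A bias close to 1 means the integrals reflect the true proton ratio; the shorter delay overestimates the faster-relaxing signal.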

An example of 1H NMR use can be seen in Figure 1, which displays the 1H NMR spectra of two different pneumococcal serotypes, 17A and 17F. The presence of either polysaccharide induces protective cross-reactive immune responses, thus rendering a typical ELISA inadequate to identify either polysaccharide in a vaccine.3 In sharp contrast to the ELISA identity assay, 1H NMR can easily detect the difference between the two polysaccharides. The spectra in Figure 1 show differences in the anomeric region and in the number of rhamnose and acetyl groups.

Figure 1. The 1H NMR spectra of two different polysaccharides (PS) whose cognate antibodies cross-react. The spectra show differences between these two polysaccharides in the anomeric region (6.0 to 4.5 ppm), the acetyl region (2.5 to 1.8 ppm), and the methyl region (1.7 to 1.0 ppm).

Glycoconjugate Vaccines

Covalent attachment of polysaccharide chains to carrier proteins can produce vaccines with improved immunogenic properties. These vaccines are effective in infants because they induce isotype switching, thus producing high avidity antibodies and creating immunological memory. Such glycoconjugate vaccines are available against Haemophilus influenzae type b (Hib), four meningococcal serogroups, and up to 13 pneumococcal serotypes, with a variety of similar vaccines in development.4 Three basic structural classes of vaccines, referred to as neoglycoconjugates, cross-linked matrices, or vesicle vaccines, can be generated depending on the saccharide hapten (a high molecular weight polysaccharide, or a derived oligosaccharide), the conjugation chemistry, and the nature of the carrier protein (Figure 2).

Figure 2. Cartoon representations of two classes of glycoconjugate vaccine, showing (a) cross-linked matrix vaccines, and (b) neoglycoconjugate vaccines. A third class (not shown), glycoconjugates based on outer membrane vesicles, is less common, and we have not been able to obtain NMR data from these samples.

We have obtained NMR (and other) data on the first two classes, with the most data available on the neoglycoconjugates. A vaccine using CRM197 as the carrier protein typically contains an average of six glycan chains, each of average molecular weight ca. 5,000 Da, attached to amino groups (the N-terminus and the 39 ε-amino groups of lysine residues) in a nonrandom pattern, so the final conjugate is ~30% carbohydrate. Unfortunately, most carrier proteins are too large or heterogeneous for detailed NMR analysis (e.g., CRM197 at 58 kDa, tetanus toxoid at 250 kDa, and diphtheria toxoid at ca. 59 kDa).
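The carbohydrate content quoted above follows from simple mass arithmetic. As a rough check, using only the average values given in the text (six chains of ca. 5,000 Da on a 58 kDa carrier) and a hypothetical helper function:

```python
def carbohydrate_mass_fraction(n_chains: float, chain_mw_da: float, carrier_mw_da: float) -> float:
    """Mass fraction of glycan in a conjugate built from average chain number and size."""
    glycan_mass = n_chains * chain_mw_da
    return glycan_mass / (glycan_mass + carrier_mw_da)

fraction = carbohydrate_mass_fraction(n_chains=6, chain_mw_da=5_000, carrier_mw_da=58_000)
print(f"carbohydrate content: {fraction:.0%}")  # prints "carbohydrate content: 34%"
```

The result, roughly one third by mass, is in line with the approximate figure of 30% given for a typical CRM197 conjugate.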

Figure 3 shows the 500 MHz 1H spectrum of a Hib glycoconjugate vaccine obtained at 30 °C. Resonances from the saccharide chains are sharp and have the same chemical shifts as those in the native polysaccharide. Conversely, resonances from the carrier protein are broad and ill defined. This suggests a model in which the carrier protein remains folded, a conclusion supported by circular dichroism (CD) data of CRM197 before and after conjugation to a synthetic hapten related to the pneumococcal Type 14 CPS.5 In this case, the CD spectra of the free carrier protein and the final conjugate were visibly indistinguishable. The glycan chains, on the other hand, remain extremely flexible and their conformational space is unaffected by conjugation. NMR analysis of meningococcal Group C vaccines produces similar conclusions.

Figure 3. Partial 500 MHz 1D 1H spectrum of a Hib-CRM197 conjugate vaccine obtained at 30 °C. The sharp resonances arise from the covalently attached glycan chains, which retain a very high degree of internal flexibility and have the same chemical shifts found in the native purified polysaccharide, while resonances from the carrier protein are broad and of low intensity, reflecting the rapid relaxation of the native folded carrier protein. The inset shows a portion of the spectrum of a sample that has been deliberately degraded, highlighting peaks diagnostic of degradation.

An excellent method for quantifying the protein:polysaccharide ratio is the deliberate denaturation of the folded carrier protein by the addition of (deuterated) guanidinium hydrochloride or sodium deuteroxide. The formation of a flexible random coil results in sharpened protein resonances and the loss of sequence-specific variation in the chemical shifts of peptide resonances, so protons in chemically identical locations resonate at the same frequency. The combination of these factors, and concomitant de-O-acetylation of the polysaccharide chain when base denaturation is used, allows the polysaccharide:protein ratio to be determined directly by integration of resonances from the glycan and carrier protein, without recourse to methods of poor precision to independently quantify the saccharide and protein moieties (Figure 4).

Figure 4. Partial 500 MHz 1D 1H NMR spectrum at 70 °C of a pneumococcal conjugate vaccine, dissolved in deuterated water containing 5 M deuterium-exchanged guanidinium hydrochloride. Denaturation of the carrier protein and destruction of the secondary structure result in a more flexible random coil structure and loss of sequence-specific chemical shift variability. Resonances from the carrier protein (such as those from the side chains of the aromatic amino acids) and from the glycan chain can be integrated and used to determine the polysaccharide:protein ratio directly.
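The integration step described above can be sketched as a short calculation. This is an illustrative outline only, not the validated procedure used for any particular vaccine: each integral is divided by the number of protons that contribute to it, which gives relative molar amounts, and these are converted to a mass ratio with the molecular weights of the repeat unit and the carrier. The integrals, proton counts, and molecular weights below are hypothetical placeholders.

```python
from dataclasses import dataclass

@dataclass
class Signal:
    integral: float        # integrated peak area, arbitrary units
    protons_per_unit: int  # protons contributing per repeat unit or per protein molecule
    unit_mw_da: float      # molecular weight of that repeat unit or protein

def saccharide_to_protein_mass_ratio(glycan: Signal, protein: Signal) -> float:
    """Mass ratio of polysaccharide to protein from two integrated resonances."""
    glycan_moles = glycan.integral / glycan.protons_per_unit      # relative moles of repeat unit
    protein_moles = protein.integral / protein.protons_per_unit  # relative moles of carrier
    return (glycan_moles * glycan.unit_mw_da) / (protein_moles * protein.unit_mw_da)

# Hypothetical values: a 3-proton methyl signal from the repeat unit, and an
# aromatic side-chain envelope assumed to represent 100 protons per carrier molecule.
glycan_signal = Signal(integral=300.0, protons_per_unit=3, unit_mw_da=500.0)
protein_signal = Signal(integral=100.0, protons_per_unit=100, unit_mw_da=58_000.0)

ratio = saccharide_to_protein_mass_ratio(glycan_signal, protein_signal)
print(f"polysaccharide:protein mass ratio = {ratio:.2f}")
```

In practice the proton count for the aromatic envelope would have to be derived from the carrier protein sequence and the integration limits chosen.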

Proteins

Higher Order Structure Information by NMR

The structure of recombinant protein therapeutics is a critical quality attribute because it is directly related to efficacy. The word structure for protein drug substances such as cytokines and hormones includes three to four elements: the primary structure defined by the amino acid sequence; the secondary structure elements defined by helices, strands, loops and turns; the tertiary structure resulting from the assemblage of secondary structure elements; and in multi-subunit proteins, the quaternary structure defined by the relative positions of the polypeptides with respect to each other.

Currently, various physico-chemical methods (CD, FTIR, MS, fluorescence, peptide mapping) and biological assays provide information on properties such as the overall folding, the chemical integrity of the polypeptide, and its bioactivity. However, none of them can provide high-resolution assessments of the structure. Small conformational variations or mutations may be missed. Furthermore, bioassays cannot detect structural changes that have little or no effect on bioactivity but that may elicit serious adverse reactions in patients.

Protein Structure Determination in a Nutshell

NMR spectroscopy can provide a high degree of detail in the characterization of polypeptides. In particular, NMR can routinely determine the three-dimensional structure of proteins of molecular weights as large as 25 kDa, and structures of proteins as large as 40–60 kDa can be obtained in some cases. Compared to carbohydrates that are highly soluble in water and chemically stable, polypeptides pose their own particular challenges.

To collect high-resolution NMR data on protein samples, three conditions must be met. First, the target polypeptide must be sufficiently soluble in aqueous buffer to reach millimolar concentrations. Second, the pH of the solution should preferably be near or below neutral to minimize chemical exchange of the amide protons with the solvent. Third, under the above conditions, the polypeptide should maintain its chemical and structural integrity without self-associating. If the polypeptide folds into a globular tertiary structure, the resonances will be dispersed over a wide range of frequencies. This arises because NMR resonance frequencies are directly related to the magnetic environment surrounding the nuclei. Thus, an amide proton from a given residue type (e.g., alanine) will show a different frequency if it is surrounded by solvent molecules in an extended conformation than if it is buried in the hydrophobic core of the protein.

The applicability of NMR to proteins in the 12–25 kDa range can be impeded by resonance overlap arising from the high number of nuclei and by resonance linewidth resulting from slow molecular tumbling. The problem of resonance overlap may appear insurmountable, but it can be solved by the incorporation of carbon-13 and nitrogen-15 isotopes in the protein. These stable NMR-active isotopes have large spectral windows, affording greater resonance dispersion, thus allowing the collection of multidimensional spectra. The latter provide the data required to assign all resonances of all nuclei and to measure structurally related parameters such as torsion angles and inter-proton distances that are used to calculate three-dimensional structures.

Because de novo structure determination by NMR is a time-consuming process that requires high-level expertise, it has fuelled the perception that NMR is not adaptable to the quality controlled environment of manufacturing facilities. It is understandable that few in the NMR community and industry have recognized the potential of NMR to provide detailed structural information on recombinant protein therapeutic products in a simple way. This paper aims at changing this perception.

Obtaining Higher-Order Structure Information by NMR is Much Simpler Than it Appears

Characterization of peptides of <15 residues can routinely be accomplished by 1H NMR spectroscopy.6 The underlying principle is a comparison of the NMR spectrum of the test sample with that of a reference standard under standardized sample conditions. This approach relies on a high degree of resolution, in which little or no overlap of the many proton resonances is observed, to facilitate the detection of small structural differences. To illustrate this, Figure 5 shows one-dimensional proton NMR spectra of goserelin and goserelin-related compound A. These are two synthetic nonapeptides differing by one chiral center (L- versus D-serine). At the other extreme, larger polypeptides produce very crowded spectra with significant loss of resolution. To alleviate this overlap, two-dimensional NMR can be used to improve resolution without resorting to isotopic enrichment. Recording NMR spectra at natural abundance is an indispensable condition for the analysis of protein-based drug products.

Figure 5. One-dimensional proton spectra recorded at 600 MHz on goserelin acetate and goserelin-related compound A in 95% water using 0.02% TSP-d4 for chemical shift reference. The structure of the two molecules differs only by the L- versus D-serine residue indicated by a red arrow.
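The comparison of a test spectrum with a reference spectrum is normally made by visual overlay, but the idea can also be expressed numerically. The sketch below is one possible illustration and not a method described in the cited work: it simulates spectra as sums of Lorentzian lines on a common chemical shift axis and scores their similarity with a Pearson correlation coefficient. The peak positions and line widths are invented for the example.

```python
import math

def lorentzian(ppm: float, center: float, half_width: float = 0.01) -> float:
    """Unit-height Lorentzian line centered at `center` (ppm)."""
    return 1.0 / (1.0 + ((ppm - center) / half_width) ** 2)

def simulate_spectrum(ppm_axis, centers):
    """Sum of Lorentzian lines; stands in for a processed 1D spectrum."""
    return [sum(lorentzian(p, c) for c in centers) for p in ppm_axis]

def pearson(a, b) -> float:
    """Pearson correlation coefficient of two spectra sampled on the same axis."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Common axis from 3.000 to 5.000 ppm in 0.001 ppm steps.
ppm_axis = [i * 0.001 for i in range(3000, 5001)]
reference = simulate_spectrum(ppm_axis, [3.50, 4.20, 4.60])
matching = simulate_spectrum(ppm_axis, [3.50, 4.20, 4.60])  # same peak positions
variant = simulate_spectrum(ppm_axis, [3.50, 4.25, 4.60])   # one peak moved by 0.05 ppm

print(f"reference vs matching batch: {pearson(reference, matching):.3f}")
print(f"reference vs variant:        {pearson(reference, variant):.3f}")
```

A single displaced resonance lowers the score noticeably when lines are sharp and well resolved, which is the situation for short peptides.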

2D NMR is Sensitive to Small Structural Changes

An ideal NMR method should probe all residues of the protein simultaneously with atomic resolution to detect any minor structural changes. In addition, the experimental setup and data analysis should be straightforward enough that they can be conducted by non-experts in NMR. The 1H-15N (proton-nitrogen) correlation experiment named 2D-HSQC meets these requirements. This experiment records the chemical shifts of all 1H-15N pairs from backbone amides and nitrogen-containing side chains. The chemical shifts of each peak on the 2D contour map are related to the nature of the amino acid residue and its local environment. Any variation of the latter (e.g., solution conditions such as pH, salt concentration, local conformation) perturbs the resonance frequencies of that peak and its neighbors. Figure 6 shows the 2D correlation spectra overlay of GM-CSF and an N27D mutant, and of interferon alpha-2a and alpha-2b, where lysine at position 23 in subtype 2a is replaced by an arginine residue in 2b. These spectra are examples showing how simple mutations induce many chemical shift perturbations, thus making the 2D-HSQC a sensitive experiment for the detection of small changes in the structure of the protein. Similar changes can be observed with small changes of solution conditions such as ionic strength and pH.

Figure 6. Two-dimensional 1H-15N HSQC contour maps of rhGM-CSF and N17D mutant (top panel) and interferon alpha-2 subtypes 'a' and 'b' (bottom panel). In both cases, a single mutation induces unmistakable chemical shift differences that are easily observed.
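Chemical shift perturbations such as those visible in these overlays are often summarized as a single combined 1H/15N distance per peak. The sketch below uses a commonly applied form of that measure; the nitrogen weighting factor (0.14 here), the 0.03 ppm threshold, and the peak lists are all assumptions chosen for illustration and are not values from this study.

```python
import math

N_WEIGHT = 0.14  # assumed scaling of 15N shifts relative to 1H shifts

def combined_shift_difference(ref_peak, test_peak, n_weight: float = N_WEIGHT) -> float:
    """Weighted distance (ppm) between two (1H, 15N) peak positions."""
    d_h = test_peak[0] - ref_peak[0]
    d_n = test_peak[1] - ref_peak[1]
    return math.sqrt(d_h ** 2 + (n_weight * d_n) ** 2)

def perturbed_peaks(reference, test, threshold_ppm: float = 0.03):
    """Labels of peaks whose combined shift difference exceeds the threshold."""
    return sorted(
        label
        for label, ref_peak in reference.items()
        if label in test and combined_shift_difference(ref_peak, test[label]) > threshold_ppm
    )

# Hypothetical assigned peak lists: label -> (1H ppm, 15N ppm).
reference_peaks = {"res_A": (8.21, 121.4), "res_B": (7.95, 118.9), "res_C": (8.40, 119.7)}
test_peaks = {"res_A": (8.21, 121.4), "res_B": (8.01, 119.5), "res_C": (8.41, 119.7)}

print(perturbed_peaks(reference_peaks, test_peaks))  # prints ['res_B']
```

Peaks that move by less than the threshold are treated as unchanged; the threshold would in practice be set from the repeatability of replicate spectra.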

Probing Higher-Order Structure Without Isotope Labeling is Straightforward

Because any change of the local or global environment around amide pairs will produce a different pattern in the 2D-HSQC contour map, each spectrum in Figure 6 represents a spectroscopic fingerprint for each particular protein. Thus, such fingerprints are sensitive tools to assess the conformation of a recombinant protein therapeutic.7 A comparison of the structure of GM-CSF from two different sources is shown in Figure 7. All resonances assigned to the protein backbone amides of GM-CSF (prepared in E. coli and refolded) match the equivalent resonances of Leucotropin (prepared by Cangene by their proprietary process, in which GM-CSF is secreted in its active conformation from S. cerevisiae). The spectra in Figures 6 and 7 show the assignment of all backbone amide resonances for interferon alpha-2 (reference 8) and rhGM-CSF (reference 9). The assignment is useful but not essential. If available, it helps in detecting impurities, local conformational variability, or ligand binding.

Figure 7. Overlay of the two-dimensional 1H-15N HSQC of rhGM-CSF in red (refolded from E. coli) and Leucotropin in black (secreted from S. cerevisiae).

In the absence of previous NMR studies, the bioactivity of the target protein must be assessed with bioassays. Reference spectra can then be collected. Subsequent batches of proteins can be compared with this reference to assess comparability. NMR studies of GM-CSF were not initially available. We prepared the protein and carried out a TF-1 cell proliferation assay to ensure that the NMR data were representative of the active protein. On the other hand, the structures of interferon alpha-2a and 2b were compared with their respective chemical reference standards obtained from the European Directorate for the Quality of Medicines using the NMR fingerprint assay.10

The pattern of the resonances on a 2D-HSQC spectrum—the fingerprint—is directly correlated to the structure of the protein under defined solution conditions (buffer, pH, ionic strength, co-solutes). Comparison of such spectra allows a direct assessment of the conformation at atomic level resolution.

Practical Considerations

The required solution conditions for samples suitable for recording high-resolution NMR spectra may appear demanding. However, the knowledge of biochemical properties gathered by manufacturers during product development provides a good basis for optimizing sample parameters for NMR studies. In most cases, the optimization process appears more difficult than it really is. In addition, obtaining sufficient amounts of protein to reach millimolar concentrations is far from a limiting factor for industry. The approach used in the NMR fingerprint assay has been applied for decades by NMR specialists in their studies of protein structures using 13C,15N-labeled proteins. The development of cryogenic probes greatly lowered the limit of detection of NMR spectrometers and allowed collection of 2D-HSQC spectra at natural abundance on recombinant proteins with molecular weights up to 20 kDa in a reasonable time. The increase in sensitivity afforded by cryoprobes is so significant that a 400 MHz spectrometer can be used for protein structure analysis. This intermediate frequency spectrometer often is the instrument of choice in manufacturing and quality control environments. Therefore, to use NMR to assess the structure of protein products, the only purchase necessary would be a cryoprobe for an existing instrument.
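Two numbers help put these sample requirements in perspective: the amount of protein needed for a millimolar sample, and the much lower concentration of 1H-15N pairs actually observed at natural abundance. The short calculation below assumes a 0.5 mL sample volume and a 15 kDa protein, both illustrative values, and uses the natural abundance of 15N of approximately 0.37%.

```python
N15_NATURAL_ABUNDANCE = 0.0037  # approximate natural abundance of nitrogen-15

def protein_mass_mg(conc_mm: float, volume_ml: float, mw_da: float) -> float:
    """Mass of protein (mg) needed for a given concentration (mM) and volume (mL)."""
    micromoles = conc_mm * volume_ml  # mmol/L x mL = micromol
    return micromoles * mw_da / 1000.0  # micromol x g/mol = microgram; convert to mg

def observable_pair_conc_um(conc_mm: float, abundance: float = N15_NATURAL_ABUNDANCE) -> float:
    """Concentration (micromolar) of any one amide site that carries a 15N nucleus."""
    return conc_mm * 1000.0 * abundance

print(f"protein needed: {protein_mass_mg(1.0, 0.5, 15_000):.1f} mg")       # prints 7.5 mg
print(f"15N-bearing amide per site: {observable_pair_conc_um(1.0):.1f} uM")  # prints 3.7 uM
```

The second figure explains why the sensitivity gain from cryogenic probes matters: each peak in a natural-abundance fingerprint arises from only a few micromolar of observable spin pairs.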

Finally, when all required elements are in place, assessing the structure of a protein takes little time. Sample preparation needs just a few hours if buffer exchange is necessary, and spectrometer setup takes only a few minutes. After initiation, data collection typically is fully automatic, requiring no involvement of the user and taking 24–96 h, depending on protein concentration and sensitivity (signal-to-noise ratio) of the instrument. Data processing takes seconds, and analysis requires overlaying the new 2D map on a reference map to confirm similarity.
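The dependence of acquisition time on concentration and instrument sensitivity can be estimated from the usual rule that signal-to-noise grows linearly with concentration and with the square root of the number of scans. The function below is a back-of-the-envelope sketch under that assumption; the reference condition of 24 h at 1 mM is a hypothetical starting point, not a measured value.

```python
def scaled_acquisition_hours(
    reference_hours: float,
    reference_conc_mm: float,
    sample_conc_mm: float,
    sensitivity_gain: float = 1.0,
) -> float:
    """Time needed to reach the reference signal-to-noise ratio.

    Assumes S/N is proportional to concentration x sensitivity x sqrt(time),
    so the required time scales with the inverse square of signal per scan.
    """
    relative_signal = (sample_conc_mm * sensitivity_gain) / reference_conc_mm
    return reference_hours / relative_signal ** 2

print(scaled_acquisition_hours(24.0, 1.0, 0.5))       # half the concentration: 96.0 h
print(scaled_acquisition_hours(24.0, 1.0, 0.5, 2.0))  # offset by a 2x more sensitive probe: 24.0 h
```

Halving the concentration quadruples the time, which is why small changes in sample concentration or probe sensitivity move an experiment from one end of the 24–96 h range to the other.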


Although NMR instruments have a relatively high initial capital outlay, all major manufacturers of polysaccharide and conjugate vaccines now use NMR spectroscopy to assess the identity and purity of the saccharide component, and NMR is a pharmacopoeial identity test for heparin. It also is proposed as an identity test for peptide hormones. In addition, companies specializing in analytical services for small biotech companies may see opportunities in investing in NMR hardware or in purchasing spectrometer time at NMR laboratories on university campuses that already offer NMR services to local industries. The increasing availability of these instruments in biopharmaceutical manufacturing environments will allow them to be further exploited for tests that would otherwise not justify the resources. The growth in the use of NMR spectroscopy in biopharmaceutical quality control is influencing the acceptance and even expectation of these data by regulatory authorities. The analytical challenges of biosimilars have increased the need for analytical methods able to distinguish small structural differences between molecules. NMR spectroscopy meets this criterion, and is one of a limited number of methods able to provide detailed information on higher order structure.


The authors thank Cangene for generously providing samples of Leucotropin and Fabian Jameison, PhD, of the US Pharmacopeia for providing goserelin acetate and goserelin-related compound A.

YVES AUBIN leads the NMR laboratory at the Protein Structure and Analysis Laboratories at BGTD, Health Canada, Ottawa, Canada, 613.941.6155, yves.aubin@hc-sc.gc.ca. CHRISTOPHER JONES is the division head at the Laboratory for Molecular Structure at the National Institute for Biological Standards and Control, Herts, UK, and DARÓN I. FREEDBERG leads the NMR laboratory at the Center for Biologics Evaluation and Research, US Food and Drug Administration, Rockville, MD.


1. Abeygunawardana C, Williams TC, Sumner JS, Hennessey JP. Development and validation of an NMR-based identity assay for bacterial polysaccharides. Anal Biochem. 2000;279:226–40.

2. Cavanagh J, Fairbrother W, Palmer A, Rance M, Skelton N. Protein NMR Spectroscopy, 2nd Ed. Academic Press: San Diego, 2007.

3. Frasch CE, Concepcion N. Induction of group 17 specific antibodies by pneumococcal type 17F and 17A polysaccharide vaccines. Biologicals. 2001;29(1):11–16.

4. Jones C. Vaccines based on the cell-surface carbohydrates of pathogenic bacteria. An Bras Acad Cienc. 2005;77:293–324.

5. Mawas F, Niggemann J, Jones C, Corbel MJ, Kamerling JP, Vliegenthart JFP. A conjugate vaccine made with a synthetic single repeating unit of Type 14 pneumococcal polysaccharide coupled to CRM197 is immunogenic in a mouse model. Infect Immun. 2002;70:5107–114.

6. Kellenbach E, Sanders K, Overbeeke PLA. The use of proton NMR as an alternative for the amino acid analysis as identity test for peptides. In: Holzgrabe U, Wawer I, Diehl B, editors. NMR spectroscopy in Pharmaceutical analysis. Oxford, UK: Elsevier; 2008. p 429–36.

7. Aubin Y, Gingras G, Sauvé S. Assessment of the three-dimensional structure of recombinant protein therapeutics by NMR fingerprinting: demonstration on rhGM-CSF. Anal Chem. 2008;80:2623–7.

8. Klaus W, Gsell B, Labhardt AM, Wipf B, Senn H. The three-dimensional high resolution structure of human interferon alpha-2a determined by heteronuclear NMR spectroscopy in solution. J Mol Biol. 1997;274:661–75.

9. Sauvé S, Gingras G, Aubin Y. NMR assignment of backbone and side chain resonances for human granulocyte-macrophage colony-stimulating factor. Biomol NMR Assign. 2008;2:5–7.

10. Panjwani N, Hodgson DJ, Sauvé S, Gingras G, Aubin Y. Assessment of the Effects of pH, Formulation, and Deformulation on the Conformation of Interferon alpha-2 by NMR. J Pharm Sci, Published Online: 23 Feb 2010, DOI 10.1002/jps.22105.