Protein Structure Determination in a Nutshell
NMR spectroscopy can provide a high degree of detail in the characterization of polypeptides. In particular, NMR can routinely
determine the three-dimensional structure of proteins of molecular weights as large as 25 kDa, and structures of proteins
as large as 40–60 kDa can be obtained in some cases. Compared with carbohydrates, which are highly soluble in water and chemically
stable, polypeptides pose their own particular challenges.
To collect high-resolution NMR data on protein samples, three conditions must be met. First, the target polypeptide must be
sufficiently soluble in aqueous buffer to reach millimolar concentrations. Second, the pH of the solution should preferably
be at or below neutral to minimize chemical exchange of the amide protons with the solvent. Third, under the above conditions,
the polypeptide should maintain its chemical and structural integrity without self-associating. If the polypeptide
folds into a globular tertiary structure, the resonances will be dispersed over a wide range of frequencies. This arises because
NMR resonance frequencies are directly related to the magnetic environment surrounding the nuclei. Thus, an amide proton from
a given residue type (e.g. alanine) will show different frequencies if it is surrounded by solvent molecules in an extended
conformation than if it is buried in the hydrophobic core of the protein.
The applicability of NMR to proteins in the 12–25 kDa range can be impeded by resonance overlap arising from the high number
of nuclei and by line broadening resulting from slow molecular tumbling. The problem of resonance overlap may appear insurmountable,
but it can be solved by the incorporation of carbon-13 and nitrogen-15 isotopes in the protein. These stable NMR-active isotopes
have large spectral windows, affording greater resonance dispersion, thus allowing the collection of multidimensional spectra.
The latter provide the data required to assign all resonances of all nuclei and to measure structurally related parameters
such as torsion angles and inter-proton distances that are used to calculate three-dimensional structures.
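The link between molecular weight and slow tumbling mentioned above can be made roughly quantitative with the Stokes-Einstein-Debye relation for a hydrated sphere. The sketch below is only an order-of-magnitude estimate under assumptions that do not come from this article: a spherical protein, a partial specific volume of 0.73 cm3/g, a 2.4 Å hydration layer, and the viscosity of pure water at 20 °C. The function names are hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro constant, 1/mol


def hydrodynamic_radius_m(mw_da, partial_specific_volume=0.73, hydration_layer_A=2.4):
    """Radius (m) of an equivalent hydrated sphere.

    partial_specific_volume is in cm^3/g; both defaults are assumed
    textbook-style values, not parameters taken from the article.
    """
    volume_cm3 = partial_specific_volume * mw_da / N_A   # volume of one molecule
    volume_m3 = volume_cm3 * 1e-6
    bare_radius = (3.0 * volume_m3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return bare_radius + hydration_layer_A * 1e-10


def correlation_time_s(mw_da, temperature_k=293.15, viscosity_pa_s=1.002e-3):
    """Rotational correlation time (s) from the Stokes-Einstein-Debye relation."""
    r = hydrodynamic_radius_m(mw_da)
    return 4.0 * math.pi * viscosity_pa_s * r**3 / (3.0 * K_B * temperature_k)


for mw in (12_000, 25_000, 50_000):
    print(f"{mw / 1000:.0f} kDa: tau_c ~ {correlation_time_s(mw) * 1e9:.1f} ns")
```

In this model the correlation time grows almost in proportion to molecular weight, and linewidths grow with it, which is one way to see why larger proteins give broader resonances.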
De novo structure determination by NMR is a time-consuming process that requires high-level expertise, which has fuelled the perception
that NMR is not adaptable to the quality-controlled environment of manufacturing facilities. It is understandable that few
in the NMR community and industry have recognized the potential of NMR to provide detailed structural information on recombinant
protein therapeutic products in a simple way. This paper aims to change this perception.
Obtaining Higher-Order Structure Information by NMR is Much Simpler Than it Appears
Peptides of fewer than 15 residues can routinely be characterized by 1H-NMR spectroscopy [6]. The underlying principle is a comparison of the NMR spectrum of the test sample with that of a reference standard
under standardized sample conditions. This approach relies on a high degree of resolution, in which little or no overlap of
the many proton resonances is observed, to facilitate the detection of small structural differences. To illustrate this statement,
Figure 5 shows one-dimensional proton NMR spectra of goserelin and goserelin-related compound A. These are two synthetic
nonapeptides differing by one chiral center (L- versus D-serine). At the other extreme, larger polypeptides produce very crowded
spectra with significant loss of resolution. To alleviate this overlap, two-dimensional NMR can be used to improve resolution
without resorting to isotopic enrichment. Recording NMR spectra at natural abundance is an indispensable condition for the
analysis of protein-based drug products.
Figure 5. One-dimensional proton spectra recorded at 600 MHz on goserelin acetate and goserelin-related compound A in 95%
water using 0.02% TSP-d4 for chemical shift reference. The structures of the two molecules differ only by the L- versus D-serine residue indicated
by a red arrow.
2D NMR is Sensitive to Small Structural Changes
An ideal NMR method should probe all residues of the protein simultaneously with atomic resolution to detect any minor structural
changes. In addition, the experimental setup and data analysis should be straightforward enough that they can be conducted by
non-experts in NMR. The 1H-15N (proton-nitrogen) correlation experiment named 2D-HSQC meets these requirements. This experiment records the chemical shifts
of all 1H-15N pairs from backbone amides and nitrogen-containing side chains. The chemical shifts of each peak on the 2D contour map
are related to the nature of the amino acid residue and its local environment. Any variation of the latter (e.g., a change in solution
conditions such as pH or salt concentration, or in local conformation) perturbs the resonance frequencies of that peak and its neighbors.
Figure 6 shows the 2D correlation spectra overlay of GM-CSF and an N27D mutant, and interferon alpha-2a and alpha-2b where
lysine at position 23 in subtype 2a is replaced by an arginine residue in 2b. These spectra are examples showing how simple
mutations induce many chemical shift perturbations, thus making the 2D-HSQC a sensitive experiment for the detection of small
changes in the structure of the protein. Similar changes can be observed with small changes of solution conditions such as
ionic strength and pH.
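One common way to put a number on the perturbations described above is a combined 1H/15N chemical shift difference per peak. The sketch below is illustrative only: the weighting factor of 0.14 for the nitrogen dimension is a frequently used convention rather than something specified in this article, the function name combined_csp is hypothetical, and the peak positions are made-up values, not data from Figure 6.

```python
import math


def combined_csp(ref_peaks, test_peaks, n_weight=0.14):
    """Combined 1H/15N shift difference (ppm) for residues present in both lists.

    Each peak list maps a residue label to (1H ppm, 15N ppm).
    n_weight scales the 15N difference; 0.14 is an assumed convention.
    """
    result = {}
    for residue, (h_ref, n_ref) in ref_peaks.items():
        if residue not in test_peaks:
            continue  # peak missing in the test spectrum; handle separately
        h_test, n_test = test_peaks[residue]
        d_h = h_test - h_ref
        d_n = n_test - n_ref
        result[residue] = math.sqrt(d_h**2 + (n_weight * d_n) ** 2)
    return result


# Hypothetical peak positions for illustration only.
wild_type = {"A10": (8.20, 122.5), "L25": (7.95, 119.8), "G40": (8.60, 108.3)}
mutant = {"A10": (8.20, 122.5), "L25": (8.05, 120.6), "G40": (8.61, 108.3)}

# List residues from most to least perturbed.
for residue, csp in sorted(combined_csp(wild_type, mutant).items(), key=lambda kv: -kv[1]):
    print(residue, round(csp, 3))
```

Ranking residues by this value highlights which parts of the fingerprint respond to a mutation or to a change in solution conditions.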
Figure 6. Two-dimensional 1H-15N HSQC contour maps of rhGM-CSF and N17D mutant (top panel) and interferon alpha-2 subtypes 'a' and 'b' (bottom panel).
In both cases, a single mutation induces unmistakable chemical shift differences that are easily observed.
Probing Higher-Order Structure Without Isotope Labeling is Straightforward
Because any change of the local or global environment around amide pairs will produce a different pattern in the 2D-HSQC contour
map, each spectrum in Figure 6 represents a spectroscopic fingerprint for each particular protein. Thus, such fingerprints
are sensitive tools to assess the conformation of a recombinant protein therapeutic [7]. A comparison of the structure of GM-CSF from two different sources is shown in Figure 7. All resonances assigned to the
protein backbone amides of GM-CSF (prepared in E. coli and refolded) match the equivalent resonances of Leucotropin (prepared by Cangene by its proprietary process, in which
GM-CSF is secreted in its active conformation from S. cerevisiae). The spectra in Figures 6 and 7 show the assignment of all backbone amide resonances for interferon alpha-2 [8] and rhGM-CSF [9]. The assignment is useful but not essential. If available, it helps to detect impurities, local conformational
variability, or ligand binding.
Figure 7. Overlay of the two-dimensional 1H-15N HSQC of rhGM-CSF in red (refolded from E. coli) and Leucotropin in black (secreted from S. cerevisiae).
In the absence of previous NMR studies, the bioactivity of the target protein must be assessed with bioassays. Reference spectra
can then be collected. Subsequent batches of proteins can be compared with this reference to assess comparability. NMR studies
of GM-CSF were not initially available. We prepared the protein and carried out a TF-1 cell proliferation assay to ensure
that the NMR data were representative of the active protein. On the other hand, the structures of interferon alpha-2a and
2b were compared with their respective chemical reference standards obtained from the European Directorate for the Quality of Medicines
using the NMR fingerprint assay [10].
The pattern of the resonances on a 2D-HSQC spectrum—the fingerprint—is directly correlated to the structure of the protein
under defined solution conditions (buffer, pH, ionic strength, co-solutes). Comparison of such spectra allows a direct assessment
of the conformation at atomic level resolution.
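Because the comparison does not depend on resonance assignment, it can be sketched as matching two unassigned peak lists within a tolerance. The example below is a minimal illustration, not the assay described in this article: the tolerances (0.02 ppm in 1H, 0.2 ppm in 15N), the greedy nearest-peak matching, the function name compare_fingerprints, and the peak positions are all assumptions chosen for the example.

```python
def compare_fingerprints(reference, sample, tol_h=0.02, tol_n=0.2):
    """Match each reference peak to the closest unused sample peak within tolerance.

    Peaks are (1H ppm, 15N ppm) tuples. Returns a tuple of
    (matched_pairs, missing_reference_peaks, extra_sample_peaks).
    """
    unused = list(sample)
    matched, missing = [], []
    for ref in reference:
        candidates = [
            p for p in unused
            if abs(p[0] - ref[0]) <= tol_h and abs(p[1] - ref[1]) <= tol_n
        ]
        if not candidates:
            missing.append(ref)  # reference peak has no counterpart
            continue
        # Pick the candidate closest to the reference in tolerance-scaled distance.
        best = min(
            candidates,
            key=lambda p: ((p[0] - ref[0]) / tol_h) ** 2 + ((p[1] - ref[1]) / tol_n) ** 2,
        )
        unused.remove(best)
        matched.append((ref, best))
    return matched, missing, unused


# Hypothetical peak lists for illustration only.
reference = [(8.20, 122.5), (7.95, 119.8), (8.60, 108.3)]
sample = [(8.21, 122.4), (7.95, 119.8), (8.90, 110.0)]

matched, missing, extra = compare_fingerprints(reference, sample)
print(f"matched: {len(matched)}, missing: {missing}, extra: {extra}")
```

Missing or extra peaks flag regions where the test sample departs from the reference; suitable tolerances would have to be established for each product and set of solution conditions.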
The required solution conditions for samples suitable for recording high-resolution NMR spectra may appear demanding. However,
the knowledge pertaining to the biochemical properties gathered by manufacturers during product development provides a good
basis that will facilitate the optimization of sample parameters for NMR studies. In most cases, the optimization process
appears more difficult than it really is. In addition, obtaining sufficient amounts of protein to meet millimolar concentrations
is far from a limiting factor for industry. The approach used in the NMR fingerprint assay has been applied for decades by
NMR specialists in their studies of protein structures using 13C,15N-labeled proteins. The development of cryogenic probes greatly lowered the limit of detection of NMR spectrometers and
allowed collection of 2D-HSQC at natural abundance on recombinant proteins with molecular weights up to 20 kDa in a reasonable
time. The increase in sensitivity afforded by cryoprobes is so significant that a 400 MHz spectrometer can be used for protein
structure analysis. This intermediate-frequency spectrometer is often the instrument of choice in manufacturing and quality
control environments. Therefore, using NMR to assess the structure of protein products would require only the purchase of a cryoprobe
for an existing instrument.
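To give a sense of scale for the sample requirement, the short calculation below converts a target concentration into a mass of protein. The 0.5 mL sample volume and the approximate 0.37% natural abundance of nitrogen-15 are assumptions made for the example, and the function names are hypothetical.

```python
N15_NATURAL_ABUNDANCE = 0.0037  # approximate fraction of nitrogen present as 15N


def protein_mass_mg(concentration_mM, mw_kda, volume_ml):
    """Mass of protein (mg) needed for a given concentration and volume.

    1 mM of a 1 kDa molecule is 1 mg/mL, so mg = mM * kDa * mL.
    """
    return concentration_mM * mw_kda * volume_ml


def observable_n15_uM(concentration_mM):
    """Concentration (micromolar) of molecules carrying 15N at any one amide site."""
    return concentration_mM * N15_NATURAL_ABUNDANCE * 1000.0


mass = protein_mass_mg(concentration_mM=1.0, mw_kda=20.0, volume_ml=0.5)
print(f"1 mM of a 20 kDa protein in 0.5 mL needs about {mass:.0f} mg")
print(f"15N-bearing molecules per amide site: about {observable_n15_uM(1.0):.1f} uM")
```

The second figure illustrates why sensitivity matters so much at natural abundance: only a small fraction of the molecules contributes to each 1H-15N peak.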
Finally, when all required elements are in place, assessing the structure of a protein takes little time. Sample preparation
needs just a few hours if buffer exchange is necessary, and spectrometer setup takes only a few minutes. After initiation,
data collection is typically fully automatic, requiring no involvement of the user and taking between 24 and 96 h, depending on
protein concentration and sensitivity (signal-to-noise ratio) of the instrument. Data processing takes seconds, and analysis
consists of overlaying the new 2D map with a reference map to ensure similarity.
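The dependence of collection time on concentration and instrument sensitivity follows from signal averaging: signal-to-noise grows with the square root of the number of scans, so the time needed to reach a given signal-to-noise scales with the inverse square of concentration and of relative sensitivity. The sketch below applies that general rule; the 24 h reference time at 1 mM is a hypothetical starting point and the function name is an assumption.

```python
def scaled_time_h(reference_time_h, concentration_ratio=1.0, sensitivity_ratio=1.0):
    """Acquisition time (h) needed to match the reference signal-to-noise.

    concentration_ratio: new concentration / reference concentration.
    sensitivity_ratio: new probe sensitivity / reference probe sensitivity.
    Assumes signal-to-noise is proportional to
    concentration * sensitivity * sqrt(time).
    """
    return reference_time_h / (concentration_ratio * sensitivity_ratio) ** 2


# Hypothetical reference: 24 h at 1 mM on a given instrument.
print(scaled_time_h(24.0, concentration_ratio=0.5))  # half the concentration
print(scaled_time_h(24.0, sensitivity_ratio=2.0))    # probe twice as sensitive
```

Under this rule, halving the concentration quadruples the collection time, while doubling the sensitivity cuts it to a quarter.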