The most recent FDA (1) and International Conference on Harmonization (ICH) (2-4) guidance documents advocate a new paradigm of process validation based on process understanding and control of parameters, with less reliance on product testing. Consequently, the means of determining criticality has come under greater scrutiny. The FDA guidance points to a lifecycle approach to process validation (see Figure 1).

In Part I of this series, the author introduced the concept of a continuum of criticality and applied it to critical quality attributes (CQAs) and critical process parameters (CPPs). In the initial phase, each CQA was assigned a criticality risk level according to the severity of risk to the patient. Applying a cause-and-effect matrix approach, the potential impact of each unit operation on the final product CQAs was assessed, and each unit operation was analyzed for its directly controllable inputs and outputs. Finally, a qualitative risk analysis or a formal failure mode effects and criticality analysis (FMECA) was conducted for each of the identified process parameters. The purpose of this assessment is to provide a focus for the downstream process characterization work required to complete process validation Stage 1 (process design).
This initial risk assessment is performed prior to the baseline characterization work and can be used as the primary means of determining the criticality of process parameters under the following conditions:
• When the process is a platform process with properties and steps similar to those of another commercial product (e.g., a new strength or new dosage form)
• When there is a significant body of published data on the process
• When experimental studies and commercial data are available, such as when the process validation lifecycle is applied to a legacy product to substantiate the initial assessment.
In these cases, this initial assessment can be further bolstered by adding an uncertainty component to the traditional risk score. For example, a high-risk critical parameter with low uncertainty (due to substantial supporting data) may not require further study, but a medium-risk parameter with high uncertainty may require further experimentation to quantify the risk to product performance.
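The uncertainty-adjusted scoring described above can be sketched as follows; the 1-to-5 scales, the multiplicative scheme, and the threshold are illustrative assumptions, not values from any guidance:

```python
# Hypothetical scheme: augment a severity x occurrence risk score with an
# uncertainty multiplier so that well-characterized parameters need no further
# study while uncertain ones are flagged for experimentation.
# (Illustrative scales and threshold only.)

def risk_score(severity: int, occurrence: int, uncertainty: int) -> int:
    """severity, occurrence, uncertainty each rated 1 (low) to 5 (high)."""
    return severity * occurrence * uncertainty

def needs_further_study(severity: int, occurrence: int, uncertainty: int,
                        threshold: int = 27) -> bool:
    """Flag parameters whose combined score exceeds a hypothetical threshold."""
    return risk_score(severity, occurrence, uncertainty) > threshold

# High-risk parameter with low uncertainty (well supported by data): no study
print(needs_further_study(severity=5, occurrence=4, uncertainty=1))  # False
# Medium-risk parameter with high uncertainty: flag for experimentation
print(needs_further_study(severity=3, occurrence=3, uncertainty=4))  # True
```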
The challenge facing most organizations is how to effectively evaluate the impact of potentially hundreds of process parameters on product performance to determine what is truly critical. Few companies have the time or resources to design experimental studies around all potentially critical process parameters. The initial risk assessment provides a screening tool to sort out the parameters that have low or no risk.
Design space and design of experiments
The goal is to increase process knowledge by providing a mechanistic understanding of the relationship between process parameters, raw material attributes, and CQAs. This is defined as both the demonstration of impact and the quantification of the contribution of each parameter to the product's performance. Through this exercise, it will be possible to identify the process design space. The ICH guidance defines three elements (knowledge space, design space, and control space) to establish process understanding (see Figure 2) (2).

ICH Q8 defines design space as, “The multidimensional combination and interaction of input variables (e.g., material attributes) and process parameters that have been demonstrated to provide assurance of quality.”
The design space is part of an enhanced process development approach referred to as quality by design (QbD). Prior to QbD, pharmaceutical development did not require the establishment of functional relationships between CPPs and CQAs. Consequently, process characterization experiments were primarily univariate (one factor at a time [OFAT]), showing that, for a given range of a process parameter (referred to as proven acceptable range or PAR), the CQAs meet acceptance criteria. While univariate experiments can provide some limited knowledge, a compilation of OFAT studies cannot typically define a design space because it cannot substantiate the importance or contribution of the parameter to the product CQA being evaluated. To do this, multivariate studies must be performed to account for the complexities of interactions when several CPPs vary across their control ranges.
Design spaces can be developed for each unit operation or across several or all unit operations. Although it may be simpler to develop for each unit operation, downstream unit operations may need to be included to sample and test the appropriate CQAs. For example, to perform a multivariate study on a fermentation unit operation, additional processing through cell lysis and purification unit operations is needed so that CQAs may be sampled and tested. The challenge faced by most development programs is how to efficiently and cost-effectively derive maximum process understanding in the fewest number of studies. To do this, a staged approach using multiple studies is most efficient.
A staged design of experiment approach
The following is an example of a simple staged design of experiment (DOE) approach. More complex DOE designs and strategies may be required, but these designs are typical:
• Screening (fractional factorial, Plackett-Burman). To identify or screen out process parameters that have no significant impact on a CQA. Screening designs can test the main effects (individual impact and contribution) of each parameter being evaluated.
• Refining (full factorial). Having dropped the parameters that do not impact the product CQAs, the refining step tests both main effects and interactions between the remaining parameters and generates first-order (linear) relationships between process parameters and CQAs. The criticality level of a CPP is determined from the quantitative impact on the CQA shown in the modeled relationship.
• Optimization (central composite, Box-Behnken). To generate response surfaces and illustrate second-order (quadratic) relationships between process parameters and CQAs. This analysis allows optimal set points for the design space or control space to be identified to target desired values of CQAs (or performance attributes).
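The staged designs above differ mainly in how many runs they spend per parameter. A minimal sketch of generating the run matrices in coded units (-1 = low, +1 = high); the choice of the generator D = ABC for the half-fraction is an illustrative assumption:

```python
from itertools import product

# Coded two-level designs: -1 = low end of range, +1 = high end.

def full_factorial(k):
    """All 2^k runs of a two-level full-factorial design (no confounding)."""
    return [list(run) for run in product([-1, 1], repeat=k)]

def half_fraction(k):
    """2^(k-1) fractional-factorial screening design: the last factor is
    aliased with the interaction of the others (here D = ABC for k = 4)."""
    runs = []
    for run in full_factorial(k - 1):
        last = 1
        for level in run:
            last *= level          # confound last factor with the interaction
        runs.append(run + [last])
    return runs

ff = full_factorial(3)        # refining: 8 runs for 3 factors, no confounding
screen = half_fraction(4)     # screening: 8 runs cover 4 factors
center = [0, 0, 0]            # center point, added to detect curvature

print(len(ff), len(screen))   # 8 8
```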
The DOE design assists in determining which parameters are studied and what set point value is used for each experimental run. The initial risk assessment, using prior knowledge and scientific principles, provides an expected relationship as to which CQAs and their related in-process controls will be affected by the given process parameters. Although the focus is on quality impact, process performance attributes (no quality impact) should also be sampled and measured as appropriate. This step is especially important during the optimization stage because a trade-off may be required in terms of optimizing quality and performance attributes.

For process validation Stage 1 process characterization studies, analytical methods for measuring CQAs may not yet be fully validated, but they must still be scientifically sound.
The level of accuracy and precision of the analytical method or measurement system must be well understood because they directly impact the quantitative decision process when interpreting study results early in the process-design stage. Techniques of measurement system analysis, such as Gage repeatability and reproducibility (Gage R&R) studies, are recommended because they provide information on the variability of the measurement system. The Gage R&R study quantifies the measurement tool's contribution to the variation of any measurement made. Typically, the percent contribution from R&R variability must be < 20%, and the method must demonstrate at least five distinct categories for its results to be meaningful. The distinct categories are the number of discernable groups of measurements that can reliably span the range of the CQA.
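The two acceptance checks above can be sketched from the variance components that an ANOVA-based Gage R&R study would estimate; the variance values below are hypothetical, and the 1.41 multiplier is the standard AIAG formula for the number of distinct categories:

```python
import math

# Gage R&R acceptance checks from variance components. The repeatability,
# reproducibility, and part-to-part variances below are hypothetical values
# that an ANOVA-based Gage R&R study would normally estimate.

def gage_metrics(var_repeatability, var_reproducibility, var_part):
    var_grr = var_repeatability + var_reproducibility   # measurement system
    var_total = var_grr + var_part
    pct_contribution = 100.0 * var_grr / var_total
    # AIAG "number of distinct categories": 1.41 * (part sd / gage sd)
    ndc = int(1.41 * math.sqrt(var_part) / math.sqrt(var_grr))
    return pct_contribution, ndc

pct, ndc = gage_metrics(var_repeatability=0.4, var_reproducibility=0.1,
                        var_part=8.0)
print(round(pct, 1), ndc)          # percent contribution and categories
print(pct < 20 and ndc >= 5)       # measurement system acceptable?
```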
Including replicate runs in addition to the study experimental design provides crucial data for estimating the underlying variability of the study. This is because, during each run, small unmeasured and uncontrolled variations always occur and may influence the result. Two otherwise identically configured runs may produce slightly different responses due to changes in environment, equipment, measurement, sampling, and operators, among others. Even deliberately fixed parameters (those not under study) may not be exactly identical from run to run. Together, these are called "noise" factors and are important in discerning true responses (i.e., "signal") caused by the changing parameters from the inherent variability. Differences between sets of replicate runs allow this variability to be quantified. Large changes in the responses between replicates may indicate either an unstable experimental platform (such as poor run-to-run control) or that a low-risk CPP or non-CPP may have a higher impact on the CQAs than originally assessed.
Where a raw material has a critical material attribute (CMA) with a medium-risk to high-risk impact on CQAs, it should be included as a parameter of the study where possible. Multiple lots, or lots with extreme variation in CMAs, may not always be available during early development or characterization studies. This limitation is frequently one of the primary drivers for establishing the continued process verification (CPV) program in Stage 3 to monitor the future impact of this raw-material variation. For large studies, multiple lots of raw materials may be required. Consideration should be given either to proportional mixing of the raw material lots for each run or to use of a statistical technique called blocking, which incorporates the change of material lots into the experimental design.
In each design, choices must be made not only on the number of parameters to be studied, but also on how many levels (i.e., set points within the range) to test and how many times a particular set of conditions is repeated (replication).
The number of levels is related to the mathematical relationship between the parameters and the CQA measured (e.g., two levels for linear or three for quadratic). For screening designs, it is typical to use only two levels (the minimum and maximum of the range); for these designs, any known nonlinear relationship may have to be mathematically transformed. For refining designs, center points (the midpoint of the ranges for all parameters) are added to estimate variability and to detect potential curvature.
Because of the cost involved or the limited availability of API, it is not possible to perform all experimental studies at commercial scale (such as with fermentors of 5,000 to 25,000 L); hence, most biotech process development programs rely heavily on modeling the process at smaller or intermediate scale. Some process parameters may be independent of scale or may have simple models to account for scale changes. Scale itself may be considered a parameter. Establishing similar run conditions at multiple scales is an important consideration when trying to qualify the comparability between full-scale and small-scale experiments. Substantial prior experience with scaling particular unit operations may provide key information, such as dimensionless parameter groups and scaling equations.
Areas considered for experimental scale include, but are not limited to:
• Aspect ratios of bioreactors and mixing tanks
• Impeller number, size, and location
• Aeration method and effectiveness of oxygen transfer
• Location of addition ports and effect on mass transport and uniformity
• Temperature control and heattransfer surface area
• Location of instrument sensors and controlloop tuning parameters.
Screen, refine, and optimize
The advantage of a screening design is that it can handle a fairly large number of parameters in the fewest number of runs. The disadvantage is that the interaction effect of each CPP on a CQA cannot be directly determined because the experimental parameters are confounded. Confounding refers to a condition in which the experiment has insufficient resolution to separate the interaction effects from the main effects of each parameter studied. However, at the screening stage, the objective is to eliminate as many parameters as possible from the potential list of CPPs so that the true process design space can be determined in the refining studies.
At this stage, the criticality of parameters has not yet been verified and parameter control ranges (proven acceptable range) have not yet been determined. Although it is usually the goal to meet the CQAs’ acceptance criteria to ensure product quality, the purpose here is to show how the process responds to the parameters even if the CQAs may not meet their criteria.

Figure 3 is an example output chart from a screening study. This Pareto chart shows the standardized effect, or relative impact, of each of eight process parameters on a CQA. A reference line at 2.45 is the threshold below which a parameter's effect is not statistically significant for this study (p-value > alpha of 0.05). In this example, six of the parameters may be screened out of further studies, provided they do not produce significant effects for other CQAs. A similar approach can be used for process performance attributes (non-CQAs) to evaluate parameters that impact process performance but not quality; these non-critical parameters are frequently called key parameters. If these investigations had been conducted as OFAT studies, it would be impossible to quantitatively determine which parameter had an impact on the product's CQA and to what extent. Through the use of a DOE, it is possible to measure both and to define the level of variation explained by the parameters evaluated, based upon the data observed.
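A standardized effect like those on such a Pareto chart can be sketched for a two-level design as the main effect divided by its standard error; the data and standard error below are synthetic, and a 2.45 threshold corresponds to the two-sided t critical value at alpha = 0.05 with six degrees of freedom:

```python
# Standardized effect for one factor in a two-level design: difference between
# the mean response at the high and low settings, scaled by its standard error.
# The run data and standard error below are synthetic, for illustration only.

def main_effect(levels, response):
    high = [y for x, y in zip(levels, response) if x > 0]
    low = [y for x, y in zip(levels, response) if x < 0]
    return sum(high) / len(high) - sum(low) / len(low)

def standardized_effect(levels, response, std_error):
    return abs(main_effect(levels, response)) / std_error

# Synthetic 8-run screening data for one factor (coded -1/+1):
temp = [-1, 1, -1, 1, -1, 1, -1, 1]
impurity = [2.0, 3.1, 2.2, 3.0, 1.9, 3.2, 2.1, 2.9]

t_eff = standardized_effect(temp, impurity, std_error=0.12)
print(t_eff > 2.45)  # statistically significant for this CQA?
```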
Once the screening DOE has been completed, parameters that have not shown strong responses for any of the CQAs are kept constant or well controlled to reduce the number of parameters for the refining studies. By employing a full-factorial design, all main effects and interactions are separated with regard to the CQA responses; there is no confounding in a full-factorial design. Center-point conditions (runs at the midpoints of all parameter ranges) are recommended because they can be used to detect whether significant curvature (a nonlinear relationship) exists in the response to a parameter, and they provide replication for determining the inherent variability of the study.

Figure 4 is an example output chart from a four-parameter, full-factorial study. The Pareto chart shows a threshold line. Two parameters and one two-factor interaction are statistically significant for this CQA. All parameters and interactions below this threshold are not statistically significant, and their effects have no more impact than the inherent run-to-run variation.

Because these parameters and interactions are not significant, they may be treated as random noise, and the model for this attribute is reduced as shown in Figure 5. A mathematical model was generated using the significant factors (pH and temperature) from Figure 4 and the significant interaction (pH and dissolved oxygen [DO]):
Impurities = Constant + α(pH) + β(temperature) + γ(DO) + δ(pH)(DO)
where Constant is the intercept generated by the DOE analysis, and α, β, γ, and δ are the coefficients generated by the DOE analysis for each parameter or interaction.
Positively signed coefficients indicate that the CQA increases with an increase of the parameter; negatively signed coefficients indicate that the CQA decreases with an increase of the parameter. The model equation is a regression, or best fit, from the data for the experiment and is, therefore, valid only for the specific scale conditions of the experiment, including the ranges of the parameters tested. Models are tested for their "goodness of fit," or how well the model represents the data. The simplest of these tests is the coefficient of determination, or R-squared. Low R-squared values (such as below 50%) indicate models with low predictive capability; that is, the parameters evaluated across the defined range do not explain the variation seen in the data.
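A minimal sketch of fitting a model of this form by least squares and computing R-squared, using synthetic coded-level data; the "true" coefficients and noise level are assumptions chosen only for illustration:

```python
import numpy as np

# Fit an impurity model of the form shown above by least squares on a
# synthetic 2^3 full-factorial data set (coded -1/+1 levels). The simulated
# coefficients and run-to-run noise are hypothetical.

rng = np.random.default_rng(0)
pH = np.array([-1, 1, -1, 1, -1, 1, -1, 1], dtype=float)
temp = np.array([-1, -1, 1, 1, -1, -1, 1, 1], dtype=float)
do = np.array([-1, -1, -1, -1, 1, 1, 1, 1], dtype=float)

# Simulated responses: assumed relationship plus run-to-run noise.
impurities = 5.0 + 1.2 * pH - 0.8 * temp + 0.1 * do + 0.5 * pH * do \
             + rng.normal(0, 0.05, size=8)

# Design matrix: Constant + alpha*pH + beta*temp + gamma*DO + delta*pH*DO
X = np.column_stack([np.ones(8), pH, temp, do, pH * do])
coef, *_ = np.linalg.lstsq(X, impurities, rcond=None)

pred = X @ coef
ss_res = np.sum((impurities - pred) ** 2)
ss_tot = np.sum((impurities - impurities.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(np.round(coef, 2), round(float(r_squared), 3))
```

Because the design is orthogonal, each fitted coefficient lands close to its simulated value, and the high R-squared reflects how little of the variation is left to noise.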
This model only represents what would be expected, on average, for this CQA from the unit operation(s) tested in the study. Even so, the model is only a fit to the most likely mean. Recognizing that any model has uncertainty, the model can also be represented with a confidence interval (e.g., 95%) around that mean. Individual runs will also show day-to-day variation around that mean. A single-run value for the attribute cannot be predicted, but a range in which that value will likely fall can be. This range for the single-run value is called the prediction interval (e.g., 95%) for the model. Empirical models such as these are only as good as the data and conditions from which they are generated and are mere approximations of the real world.
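The distinction between the two intervals can be sketched with simple linear regression on synthetic data; note the extra "1 +" inside the prediction-interval term, which accounts for single-run variability (the t value is hardcoded for n - 2 = 4 degrees of freedom):

```python
import math

# Confidence interval (bounds the mean response) vs. prediction interval
# (bounds a single future run) for simple linear regression. Data synthetic.

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
intercept = ybar - slope * xbar
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(r ** 2 for r in resid) / (n - 2))  # residual std deviation

t_crit = 2.776  # two-sided 95% t value for n - 2 = 4 degrees of freedom
x0 = 3.5
leverage = 1 / n + (x0 - xbar) ** 2 / sxx
ci_half = t_crit * s * math.sqrt(leverage)       # half-width for the mean
pi_half = t_crit * s * math.sqrt(1 + leverage)   # half-width for a single run
print(pi_half > ci_half)  # True: the prediction interval is always wider
```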
Despite these limitations, empirical models reveal not only which parameters have a statistical impact on a CQA, but also the relative amount of that impact. The range through which a parameter is tested in the study has an important relationship to the model generated. For example, suppose the parameter temperature was initially assigned as high risk. If temperature is tested only through a tight range, the parameter may show little to no impact on CQAs in the study; its effect may be no greater than the inherent variability. If temperature is not statistically significant for the range studied (i.e., its PAR), it is designated as a non-CPP, but only for that PAR. If the temperature should ever move outside the studied PAR, there is a potential risk that it could have a quality impact and become critical.
Some organizations' quality groups rely on the original risk assessment of the process parameter. If the parameter's severity was initially rated high, the parameter can remain designated as critical but should be designated as a low-risk CPP as long as it stays within its PAR. Parameters outside the PAR would be considered outside the allowable limits for that process step because the parameter has not been studied outside of this range.
If curvature is detected during earlier DOE stages, or if optimization of any CQA or process performance attribute is needed, then response-surface experimental designs are used. These designs allow for more complex model equations for a CQA (or performance attribute). Two of the simpler response-surface designs are the central composite and the Box-Behnken. Both designs can supplement existing full-factorial data. The central-composite design also extends the range of parameters beyond the original limits of the factorial design; the Box-Behnken design is used when extending the limits is not feasible. The empirical models are refined from these studies by adding higher-order terms (e.g., quadratic, polynomial). Even if these higher-order terms are not significant, adding more levels within the parameter ranges will improve the linear model.
Because most empirical models are developed with small-scale experiments, the models must be verified at larger scale and potentially adjusted. Applying the knowledge of scale-dependent and scale-independent parameters while developing earlier DOE designs reduces risk when scaling up to larger pilot-scale and, finally, full-scale processes. The models from small-scale studies predict which parameters present the highest impact (risk) to CQAs. Priority should be given in the study design to those high-risk parameters, especially if they are scale-dependent. Because the empirical models only predict the most likely average response for a CQA, several runs at different parameter settings (e.g., minimum, maximum, center point) are required to see if the small-scale model still applies to the large-scale process.
Significance and criticality
Statistical significance is an important designation in assessing the impact of changes in parameters on CQAs. It provides a mathematical threshold below which effects vanish into the noise of process variability. Parameters that are not significant are screened out from further study and excluded from empirical models.
A CQA may be affected by critical parameters in several different unit operations (see the cause-and-effect matrix in Part I of this article [5]). Characterization study plans may not be able to integrate different unit operations into the same DOE study. Consequently, several model equations may exist for a single CQA, each composed of parameters from a different unit operation. The relative effect of each parameter on the CQA can be calculated from these models using the span of the PAR for each parameter, with the impact expressed relative to the range of the CQA's acceptance criteria. Sorting the parameters from highest to lowest impact, the criticality of each parameter can be assigned from high to low. Table I is an example of one method for assigning the continuum of criticality.
The steps in determining the continuum of criticality for process parameters are summarized as follows:
• Show statistical significance by DOE
• Relate significant parameters to CQAs with an empirical model
• Calculate the impact of all parameters from the model(s) for each CQA
• Compare each parameter's impact on the CQA to the CQA's measurement capability
• Assign each parameter's risk level based on its impact on the CQA
• Update the initial risk assessment for the parameters.
CPP risk level | Criteria (% change in CQA as CPP spans PAR)
High risk | > 25%
Medium risk | 10% to 25%
Low risk | < 5%; below measurement capability; or low risk in risk assessment (not in DOE)
Non-CPP | Not significant in DOE; or no risk in risk assessment (not in DOE)

Table I: Example of criticality risk assignment for process parameters. CPP is critical process parameter; CQA is critical quality attribute; PAR is proven acceptable range; DOE is design of experiment.
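The impact calculation and risk assignment summarized in the steps and Table I above can be sketched as follows; the model coefficient, PAR, and CQA acceptance range are hypothetical, the model is reduced to a single linear term, and the risk bands simplify Table I's criteria:

```python
# Percent change in a CQA as a parameter spans its PAR, relative to the span
# of the CQA's acceptance criteria, then mapped to a risk level. The
# coefficient, PAR, and acceptance range below are hypothetical.

def pct_change_in_cqa(coefficient, par_low, par_high, cqa_accept_range):
    """Linear single-term model: |coefficient * PAR span| as a percentage
    of the CQA acceptance-criteria range."""
    return abs(coefficient * (par_high - par_low)) / cqa_accept_range * 100.0

def risk_level(pct_change):
    """Simplified banding in the spirit of Table I."""
    if pct_change > 25:
        return "High risk"
    if pct_change >= 10:
        return "Medium risk"
    return "Low risk"

# Hypothetical: temperature coefficient of 0.8 %-impurity per degree C, a PAR
# of 36-38 degrees C, and a CQA acceptance range spanning 5 %-impurity:
pct = pct_change_in_cqa(0.8, 36.0, 38.0, 5.0)
print(round(pct, 1), risk_level(pct))  # 32.0 High risk
```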
As process validation Stage 2 (process qualification) begins, criticality is applied to develop acceptance criteria for equipment qualification and process performance qualification. Finally, in process validation Stage 3 (continued process verification), criticality determines what parameters and attributes are monitored and trended.
In the third and final part of this article, the author applies the continuum of criticality for parameters and attributes to develop the process control strategy and study its influence on the process qualification and continued process verification stages of process validation.
References
1. FDA, Guidance for Industry, Process Validation: General Principles and Practices, Revision 1 (Rockville, MD, January 2011).
2. ICH, Q8(R2) Harmonized Tripartite Guideline, Pharmaceutical Development, Step 4 version (August 2009).
3. ICH, Q9 Harmonized Tripartite Guideline, Quality Risk Management (June 2006).
4. ICH, Q10, Harmonized Tripartite Guideline, Pharmaceutical Quality System (April 2009).
5. M. Mitchell, BioPharm International 26 (12) 38-47 (2013).
Mark Mitchell is principal engineer at Pharmatech Associates.