Equation (1) can be used to compute tolerance intervals for the combined large-scale and bench data sets. There are several
values that can be considered for Ȳ, the center of the interval. One approach is to center at the predicted value of the PP when all OPs are at setpoint values. If it is known that there
is an "offset" between the bench-scale data, such as Edge of Range (EOR) and robustness (ROB) data, and the large-scale (GMP and non-GMP)
data, it might be better to center the interval at the large-scale mean. Figure 2 presents such a situation for one PP. The
unweighted average of the four groups is 11.4. All values from the large-scale GMP runs are less than 11.4,
so one may wish to center the interval at a lower value. The p-value for the test of equal means among the four groups is less than 0.03 for this example.
Alternative centering rules may also be considered when different lots of a key raw material were used for each of the large-scale
runs, but the same material (from a single, different lot) was used for all of the bench-scale runs. Here it might be best to
center the interval on a linear combination of the large-scale and bench-scale means.
Scenario 3: In this scenario, tolerance intervals are calculated accounting for OPs that vary across the OR. Typically, OPs will vary
around the setpoint value due to instrument and equipment tolerances and other factors. Thus, a tolerance interval that describes
behavior of the PP must adequately account for this variation in the OP. The formula in Equation (1) will not adequately account
for the propagation of error that results from movement in the OPs. To compute the tolerance interval in this situation, a
simulation-based approach is necessary. Briefly, one simulates a set of values for the OPs consistent with the expected movement
of the OPs within the OR. A regression model based on characterization data is then used to predict the value of the PP for
the simulated OP values. This process is repeated many times to construct an empirical distribution of the PP values. From
this simulated distribution, one selects the range that covers the desired proportion of the population. A more detailed algorithm
for this process is presented in the example at the end of the paper.
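
The simulation steps described above can be sketched in a few lines of Python. This is only an illustrative outline, not the algorithm presented in the example at the end of the paper. It assumes a linear regression model with known coefficients, independent normally distributed OP variation around the setpoints, and normally distributed residual error; the function name and all parameter names are hypothetical. It returns the central range of the simulated PP distribution and does not by itself account for uncertainty in the estimated regression coefficients, so it does not carry a stated confidence level.

```python
import random


def simulate_pp_range(intercept, coefs, setpoints, op_sds, resid_sd,
                      coverage=0.99, n_sim=20000, seed=0):
    """Return the central `coverage` range of simulated PP values.

    Assumptions (illustrative only): a linear model
    PP = intercept + sum(coef_i * OP_i) + error, with each OP varying
    normally around its setpoint and normal residual error.
    """
    rng = random.Random(seed)
    simulated = []
    for _ in range(n_sim):
        # Step 1: simulate OP values consistent with expected movement.
        ops = [rng.gauss(mu, sd) for mu, sd in zip(setpoints, op_sds)]
        # Step 2: predict the PP from the regression model and add
        # residual variation.
        predicted = intercept + sum(b * x for b, x in zip(coefs, ops))
        simulated.append(predicted + rng.gauss(0.0, resid_sd))
    # Step 3: select the range covering the desired proportion of the
    # empirical distribution.
    simulated.sort()
    tail = (1.0 - coverage) / 2.0
    lower = simulated[round(tail * (n_sim - 1))]
    upper = simulated[round((1.0 - tail) * (n_sim - 1))]
    return lower, upper


# Hypothetical two-OP example; every number here is made up.
low, high = simulate_pp_range(
    intercept=2.0, coefs=[0.5, -0.2],
    setpoints=[10.0, 5.0], op_sds=[0.1, 0.05], resid_sd=0.2,
)
print(f"Simulated 99% range: {low:.2f} to {high:.2f}")
```

In practice the OP distributions would be chosen to reflect the expected movement within the OR, and the regression model would come from the characterization data.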
OTHER CONSIDERATIONS IN COMPUTING TOLERANCE INTERVALS
One issue of interest in any computation of a tolerance interval is the proportion of area contained in the interval and the
level of confidence that the reported interval is correct. We have found that two-sided intervals containing 99% (p = 99) of the population with an individual confidence level of 95% (α = 0.05) provide reasonable VAC limits. The decision
to include 99% of the population is based on the desire to have limits similar conceptually to those used in process control,
but not so wide as to be uninformative. In process control, limits are established to include approximately 99.7% of the data.
However, tolerance intervals that cover the middle 99.7% are extremely wide for data sets of the size typically available
from process characterization. The 99% coverage used in the tolerance interval represents a good compromise that provides
If there are many critical and key PPs, one may choose to adjust the individual confidence levels in order to obtain a desired
overall confidence level on the entire set of PPs. A simple method for handling this "multiplicity" problem is to use the
Bonferroni inequality [8]. For example, assume that VAC are required for 10 key and critical PPs. In order to achieve an overall confidence of
at least 95% on the set of 10 PPs, individual tolerance intervals must be calculated with a confidence coefficient of:
100(1 – (0.05/10)) = 99.5%.
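
The same Bonferroni adjustment can be computed for any number of PPs. The short Python helper below is a sketch of that arithmetic; the function name is hypothetical and not taken from the paper.

```python
def bonferroni_individual_confidence(overall_confidence, n_params):
    """Individual confidence level needed so that the set of
    `n_params` intervals has at least `overall_confidence` overall,
    using the Bonferroni inequality."""
    overall_alpha = 1.0 - overall_confidence
    return 1.0 - overall_alpha / n_params


# The case from the text: 10 PPs with 95% overall confidence.
level = bonferroni_individual_confidence(0.95, 10)
print(f"{100 * level:.1f}%")  # prints 99.5%
```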