Scenario 1: The tolerance intervals described in this section can be used when only a limited data set, such as data from large-scale
runs alone, is available for setting VAC. Wald and Wolfowitz⁴ introduced the notion of two-sided tolerance intervals in the case of a random sample selected from a single population.
They provided approximate formulas that were later modified by Howe.⁵ This interval contains 100p% of the population with 100(1 – α)% confidence and is defined as

\[
\bar{Y} \pm \sqrt{\frac{\left(1 + \frac{1}{c}\right) Z_{(p+1)/2}^{2}\, r}{\chi^{2}_{r,\alpha}}}\; S \tag{1}
\]

in which $S$ is the sample standard deviation, $\bar{Y}$ is the sample mean, $r$ is the error degrees of freedom, $c$ is the number of observations used to compute the center $\bar{Y}$, $Z_{(p+1)/2}$ is the standard normal percentile with area (p + 1)/2 to the left, and $\chi^{2}_{r,\alpha}$ is the chi-squared percentile with r degrees of freedom and area α to the left. If Equation (1) is used to compute a tolerance interval for a simple random sample
of n observations, then r = n – 1 and c = n. Equation (1) has previously been recommended for setting VAC in this scenario.⁶ Tabled values for tolerance intervals are also available.⁷
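As a rough illustration of the simple-random-sample case (r = n – 1, c = n), the sketch below computes a two-sided tolerance interval in Python using only the standard library. It assumes the commonly cited Howe form of the factor, sqrt((1 + 1/c) · Z² · r / χ²), and it swaps in the Wilson-Hilferty approximation for the chi-squared percentile, because the standard library has no exact routine; a statistics package such as SciPy (`scipy.stats.chi2.ppf`) would give the exact percentile. The function names and the sample data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def chi2_lower_percentile(alpha, r):
    """Approximate chi-squared percentile with r degrees of freedom and
    area alpha to the left (Wilson-Hilferty approximation, not exact)."""
    z = NormalDist().inv_cdf(alpha)
    h = 2.0 / (9.0 * r)
    return r * (1.0 - h + z * sqrt(h)) ** 3


def tolerance_factor(p, alpha, r, c):
    """Multiplier K in center +/- K * S, assuming the Howe-type form."""
    z = NormalDist().inv_cdf((p + 1.0) / 2.0)
    return sqrt((1.0 + 1.0 / c) * z ** 2 * r / chi2_lower_percentile(alpha, r))


def tolerance_interval(data, p=0.99, alpha=0.05):
    """Two-sided interval intended to cover 100p% of the population with
    100(1 - alpha)% confidence, for a simple random sample."""
    n = len(data)
    center = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 denominator)
    k = tolerance_factor(p, alpha, r=n - 1, c=n)
    return center - k * s, center + k * s


# Hypothetical large-scale results for one performance parameter.
runs = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]
lower, upper = tolerance_interval(runs, p=0.99, alpha=0.05)
print(f"99%/95% tolerance interval: [{lower:.3f}, {upper:.3f}]")
```

With small samples the factor K is much larger than the normal percentile alone, which is why limits set from only a few large-scale runs tend to be wide.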
Scenario 2: In this scenario, data from both bench-scale process characterization and large-scale runs are available. By combining process
characterization data with large-scale data, the sample sizes on which tolerance intervals are based can be increased. Additionally,
the modeled regression relationships between PPs and OPs provide valuable information that yields more realistic VAC limits.
Figure 1 shows a graphical representation of how tolerance intervals are estimated using the regression approach.
In this example, as the coded value of OP shifts from –1 to +1 (where zero is the setpoint condition), the range that contains
99% of the population PP values shifts up due to the positive linear relationship between PP and OP. Note that although the
centers of the intervals that include the middle 99% of the PP values differ as the OP changes, the lengths of the intervals
are constant. This is because the regression model assumes the spread (standard deviation) of the PP values is constant across
the examined range of the OP. (One must verify this assumption during data analysis.)
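The constant-length behavior described above can be checked with a small Python sketch. It assumes PP values are normally distributed around a straight line in the coded OP with a constant standard deviation; the intercept, slope, and sigma are hypothetical values chosen only for illustration, and the function name is not from the source. The sketch shows the population range itself, not an estimated tolerance interval.

```python
from statistics import NormalDist


def population_range(op, intercept, slope, sigma, p=0.99):
    """Range holding the middle 100p% of PP values at a coded OP setting,
    assuming PP ~ Normal(intercept + slope * op, sigma) with constant sigma."""
    center = intercept + slope * op
    half_width = NormalDist().inv_cdf((p + 1.0) / 2.0) * sigma
    return center - half_width, center + half_width


# Hypothetical model parameters (positive slope, constant spread).
for op in (-1.0, 0.0, 1.0):
    lo, hi = population_range(op, intercept=50.0, slope=2.0, sigma=1.5)
    print(f"OP={op:+.0f}: [{lo:.2f}, {hi:.2f}], length={hi - lo:.2f}")
```

Because the half-width depends only on sigma and p, the center moves with the OP while the length stays fixed. If the spread actually changed with the OP, this would no longer hold, which is why the constant-variance assumption needs to be verified.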