Statistical Equivalence Testing for Assessing Bench-Scale Cleanability
The two-one-sided t-test compares the equivalency of two data sets.
 Feb 1, 2010 BioPharm International Volume 23, Issue 2

Case Study 1: Products A and B are Not Equivalent

 Table 1. Upper and lower confidence limits of the difference between two groups as determined using the two-one-sided t-test (TOST)
Two protein products were cleaned using the bench-scale method, and 18 cleaning-time data points were recorded for each product. Commercially available statistical software (JMP) was used to perform the TOST analysis (ref. 12). The one-way analysis "Fit Y by X" function was used with the alpha level (probability of a type I error) set to 0.1, which corresponds to the 90% confidence interval discussed earlier. Figure 3 shows the distribution of cleaning times for the two products. The box-and-whisker plot (in red) summarizes the range and distribution of the data points. The box contains the middle 50% of the data, and the line across the middle of the box marks the median of the data set. The difference between the upper and lower quartiles is the interquartile range. Each box has whiskers that extend from the edge of the box to the outermost data point that falls within the boundaries defined by upper quartile + 1.5 × (interquartile range) and lower quartile - 1.5 × (interquartile range).
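The whisker rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the JMP implementation: the function name `whisker_ends` and the sample data are hypothetical, and the quartiles are computed with the standard library's "inclusive" method, which may differ slightly from the quantile definition JMP uses.

```python
import statistics


def whisker_ends(values):
    """Return (lower_whisker, upper_whisker) for a box-and-whisker plot.

    Each whisker ends at the outermost data point that still lies within
    1.5 * IQR of the nearest quartile.
    """
    # Quartile method is an assumption; JMP may use a different definition.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return min(inside), max(inside)


# Hypothetical cleaning times in minutes, for illustration only.
times = [1, 2, 3, 4, 5, 6, 7, 8, 30]
low, high = whisker_ends(times)
```

In this made-up data set the value 30 lies beyond the upper boundary, so the upper whisker stops at the largest point inside it and 30 would be drawn as an individual outlier.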

Table 1 shows the output of the TOST analysis performed using JMP. The difference between the two group means is the point estimate of the true difference between the two means. It is calculated by subtracting the sample mean for product A from the sample mean for product B. The standard error (SE) of the difference between the two group means can be calculated by applying the following equation:

$$ SE = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}} $$

in which sA is the standard deviation of the product A data, nA is the sample size for product A, and sB and nB represent the corresponding values for product B. This value estimates the variability of the difference between the two data sets. The statistical software (JMP) adjusts the degrees of freedom for the variability of each data set using the Satterthwaite approximation (ref. 11). The 90% confidence interval for the difference between the two means runs from a lower confidence limit of 62.91 to an upper confidence limit of 70.36. Because the equivalence limit is ±4.48 and both confidence limits fall outside that range, it is concluded that product A and product B are not equivalent. Based on the average cleaning time and the confidence interval, product B is considered more difficult to clean than product A.
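For readers who want to reproduce this step outside JMP, the standard error and the Satterthwaite degrees of freedom can be sketched as follows. The function names and the input values are hypothetical (the article does not report the group standard deviations), and the critical t value is assumed to be supplied by the caller from a table or statistical software: for a 90% two-sided interval it is the 95th percentile of the t distribution at the computed degrees of freedom.

```python
import math


def welch_se_and_df(s_a, n_a, s_b, n_b):
    """Standard error of the difference in means and Satterthwaite df."""
    var_a = s_a ** 2 / n_a  # variance contribution of group A
    var_b = s_b ** 2 / n_b  # variance contribution of group B
    se = math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return se, df


def confidence_interval(diff, se, t_crit):
    """Two-sided interval for the difference, given a critical t value."""
    return diff - t_crit * se, diff + t_crit * se


# Hypothetical standard deviations; only n = 18 per group is from the study.
se, df = welch_se_and_df(s_a=2.0, n_a=18, s_b=2.0, n_b=18)
```

When both groups have the same standard deviation and sample size, as in this made-up input, the Satterthwaite value reduces to the familiar pooled degrees of freedom, nA + nB - 2. Unequal variability pulls it lower.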

In this case study, the products failed to meet cleanability equivalency mainly because of the large difference (66.64 min) in the mean cleaning times, as shown by the blue bar in Figure 2. It is also possible to fail the equivalency test when the two group means are similar but product B has a high degree of variability, resulting in a broad confidence interval such as the one shown by the red bar in Figure 2. In such a scenario, the variability in product B should be further evaluated, and the cleanability ranking (B<A or B>A) can be decided based on an appropriate risk assessment and business considerations.
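The two failure modes can be told apart from the confidence limits alone, as the rough sketch below shows. The function `diagnose` and its labels are hypothetical and not part of the JMP output. It assumes a symmetric t-based interval, so the point estimate sits at the interval midpoint, and the wide interval in the second call is invented for illustration.

```python
def diagnose(lower, upper, limit):
    """Classify a TOST result from the 90% confidence limits."""
    diff = (lower + upper) / 2  # midpoint of a symmetric interval
    if -limit < lower and upper < limit:
        return "equivalent"
    if abs(diff) >= limit:
        return "not equivalent: mean difference exceeds the limit"
    return "not equivalent: confidence interval too wide"


# Case Study 1 limits as reported in Table 1.
case_1 = diagnose(62.91, 70.36, limit=4.48)

# Hypothetical wide interval around a small mean difference.
wide = diagnose(-6.0, 7.5, limit=4.48)
```

The first call reproduces the situation in this case study. The second corresponds to the high-variability scenario, where further evaluation of the data is the suggested next step.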

Case Study 2: Products A and Y are Equivalent

 Figure 4
The TOST analysis, as described in the previous case study, was repeated for two other products. Figure 4 shows the distribution of cleaning times for these two products: A and Y.

 Table 2. Upper and lower confidence limits of the difference between two groups as determined using the two-one-sided t-test
Table 2 shows the output of the TOST analysis using JMP. The 90% confidence interval for the difference between the two means runs from a lower confidence limit of 0.0564 to an upper confidence limit of 1.5547. Both confidence limits fall within the equivalence limit of ±4.48. It is therefore concluded that product A and product Y are equivalent to each other in terms of cleanability.
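The decision rule applied in both case studies amounts to checking whether the whole 90% confidence interval lies inside the equivalence limits. A minimal sketch, using the confidence limits reported in Tables 1 and 2 (the function name `is_equivalent` is hypothetical):

```python
EQUIVALENCE_LIMIT = 4.48  # equivalence limit used in both case studies


def is_equivalent(lower, upper, limit=EQUIVALENCE_LIMIT):
    """True when the whole confidence interval lies inside (-limit, +limit)."""
    return -limit < lower and upper < limit


case_1 = is_equivalent(62.91, 70.36)    # products A and B
case_2 = is_equivalent(0.0564, 1.5547)  # products A and Y
```

Here `case_1` evaluates to `False` and `case_2` to `True`, matching the conclusions drawn for the two product pairs.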