Cumulative Sum Charts for Problem Solving - A retrospective analysis for problem solving using cumulative sum charts. - BioPharm International


Cumulative Sum Charts for Problem Solving
A retrospective analysis for problem solving using cumulative sum charts.

BioPharm International
Volume 21, Issue 5


Figure 7. A CUSUM plot for production and assay reference data. The lack of correspondence between the plots gives reason to eliminate the reference assay as the cause of the problem.
The CUSUM chart for production data shows a definite change on August 28 or 29 and possible changes at a few other dates. The problem-solving process uses this information to confirm or eliminate potential causes. Figure 7 shows the CUSUM plots for production data and for the assay reference standard, which may be a cause.

The two plots have a similar shape but the key change date for the production measurements does not have a complementary one for the assay reference. An obvious change point for the reference standard on August 28 (or on a slightly earlier date) would have confirmed that it was worthwhile to investigate this potential cause in more depth. The reference does rise (the slope is upwards) starting on August 19 but similar short periods of high values in late June and late July and very high values in October do not have a consistent effect on the production measurements. The differences between the two plots give reason to eliminate the reference assay as the cause of the problem.

The best interpretation of the shape of the CUSUM plot for the reference assay is that it was on target up to May, was low during June, and then it was slightly high. The other twists and turns in the line are probably just random variation and serve to illustrate that it is possible to read a lot into a chart when nothing has really happened. If you think you might be falling into this trap, then plot the raw data behind the CUSUM (as in Figure 5) because it is harder to imagine a pattern or shape in the conventional trend chart.
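As a reminder of what the plotted line represents, the following is a minimal sketch of a CUSUM calculation. It assumes the usual definition (a running total of each value's deviation from a reference value) and uses the overall average as the reference when no target is given. The function name `cusum` and the data are hypothetical and are not taken from the article.

```python
from itertools import accumulate
from statistics import mean


def cusum(values, target=None):
    """Running total of deviations from a reference value."""
    if target is None:
        # Assumption: with no stated target, use the overall average.
        target = mean(values)
    return list(accumulate(v - target for v in values))


# Hypothetical data: the average steps up from about 10 to about 12
# after the fifth point.
data = [10.1, 9.8, 10.2, 9.9, 10.0, 12.1, 11.9, 12.2, 12.0, 11.8]
chart = cusum(data)

# The CUSUM falls while values are below the reference and rises once
# they are above it, so the lowest point marks the last "low" value.
turning_point = min(range(len(chart)), key=chart.__getitem__)
print(turning_point, [round(c, 1) for c in chart])
```

A change in slope, rather than the height of the line, is what indicates a change in the average; plotting `data` next to `chart` makes it easier to check that a turning point is not imagined.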


The tests described here can be used to support the interpretation of a CUSUM chart, though it should be remembered that the key purpose of the chart is to identify retrospectively when a change occurred. Judgment, knowledge of the process, and other events around the process are key to interpretation. The results of statistical tests in these situations are not a core requirement as in a clinical trial; they are an aid to problem solving. It is often enough to trust one's eyes, especially if a CUSUM plot looks like two straight lines as has happened with these data.

T test to Compare Two Periods of Data

Figure 8. A t test for difference in averages
The t test is used to check for a difference in averages between two sets of data. Figure 8 shows how this is done in Excel using the TTEST function:
  • Array 1 contains the cells with data for one part of the chart, in this case March 9 to August 28. The numbers to the right are the first data points in the selected range.
  • Array 2 contains the cells with data for a second part of the chart, in this case August 29 to October 21.
  • Tails refers to a one-tailed (is it bigger?) or a two-tailed (are they different?) test. We have seen a difference and want to check its significance, so choose a one-tailed test.
  • Type 2 is a test that allows a different number of points in each array and assumes that the standard deviations in the two sets are the same. If you have reason to think that the standard deviation has changed as well as the average, then put in Type 3, which does not assume equal standard deviations.

The t test estimates the random, background variation from the spread of results in each array. It then compares the difference between the averages of the first and second arrays with this variation and calculates the likelihood that the difference is just random. The answer given by the test is the probability that a difference this big would have occurred by chance if there had been no real change. Values greater than 0.1 are generally considered statistically insignificant. If you get an answer of 0.05 and you believe that the difference is real, then you carry a one in 20 risk that you are mistaken; an equivalent statement is that you are 95% confident that the difference is real. Small values of 0.01 or less suggest that the difference is real or that you have been unlucky.
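For readers working outside Excel, the sketch below shows the same kind of calculation as a one-tailed, Type 2 (equal standard deviation) t test, using only the Python standard library. The function names and the data are hypothetical. The tail probability is obtained from a standard continued-fraction routine for the incomplete beta function, and the result should be close to, but is not guaranteed to match, Excel's TTEST output digit for digit.

```python
from math import exp, lgamma, log, sqrt
from statistics import mean, variance


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        for numerator in (
            m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1)),
        ):
            d = 1.0 + numerator * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + numerator / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def one_tailed_p(t, df):
    """Chance of a t statistic at least as large as |t| in one direction."""
    return 0.5 * _reg_inc_beta(df / 2.0, 0.5, df / (df + t * t))


def pooled_t_test(first, second):
    """Equal-variance two-sample t test: returns (t, df, one-tailed p)."""
    n1, n2 = len(first), len(second)
    df = n1 + n2 - 2
    # Background variation, pooled from the spread within each period.
    pooled_var = ((n1 - 1) * variance(first) + (n2 - 1) * variance(second)) / df
    t = (mean(second) - mean(first)) / sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, df, one_tailed_p(t, df)


# Hypothetical results for the period before and after a suspected change.
before = [10.0, 11.0, 9.0, 10.0, 10.5]
after = [12.0, 13.0, 12.5, 11.5, 12.0]
t_stat, dof, p_value = pooled_t_test(before, after)
print(f"t = {t_stat:.2f}, df = {dof}, one-tailed p = {p_value:.5f}")
```

The sketch assumes that each period has at least two values and that the values are not all identical; otherwise the pooled variation is zero and the calculation fails.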

The increase in average seen by comparing data up to August 28 with data from August 29 onwards is almost certainly real because the chance that it would have occurred through random variation is very low (the result is 7.76 × 10^-18).

There is a major risk in applying the t test to a historical analysis, such as the one here, in which we have chosen a change point that already looks important, and are then attempting to show that it is statistically significant. The answer in such a situation is often one which confirms our subjective judgment. If we select a period when the average is high and compare it with a period when the average is low then a statistical test will almost always say that the difference is real, particularly if the number of samples in each average is high. It would be a serious mistake to perform t tests on many groupings of data until one is found to be significant; a significance level of one in 20 will be found in about one in 20 random data sets. As stated above, the visual inspection of the CUSUM plot is primary and the statistical test is secondary. A negative result (probability or p-value of more than about 0.1) does, however, give a strong indication that the observed, potential difference is just a random variation.
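The danger of repeated testing can be put into numbers with a short calculation. The sketch below assumes that the tests are independent and that the data contain no real change; repeated groupings of a single data set overlap and are therefore correlated, so the figures are illustrative rather than exact. The function name is hypothetical.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent tests on purely
    random data gives a p-value below alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


for num_tests in (1, 5, 20):
    print(num_tests, round(chance_of_false_positive(num_tests), 2))
```

Under these assumptions, a single test carries the stated one in 20 risk, but five tests carry a risk of about 23% and 20 tests a risk of about 64% of producing at least one "significant" result from random variation alone.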

Sequential Tests

An alternative approach is to continually apply a test to each new data point and check if it is significantly different from its predecessors. This is the principle behind statistical process control (SPC). The advantage of SPC over the t test is that the data themselves indicate when a significant change has occurred, so the risk of a false positive from hand-picking the change point does not arise.

Figure 9. Statistical process control limits for individual values
The chart for individual values in Figure 9 has control limits calculated using standard SPC methods, which are based on the overall average and the moving range (the difference between each sequential pair of values). The point for August 29 and many of those that follow are above the upper control limit, confirming that the average has risen.
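The article does not give the formula behind the limits in Figure 9. The sketch below uses the conventional calculation for an individuals chart, in which the limits are set at the overall average plus or minus 2.66 times the average moving range (2.66 is 3 divided by 1.128, the usual constant for ranges of two consecutive values). The function names and the data are hypothetical.

```python
from statistics import mean


def individuals_limits(values):
    """Return (lower limit, centre line, upper limit) for individual values."""
    # Moving range: absolute difference between each sequential pair.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    # Assumption: conventional constant 2.66 for moving ranges of two points.
    half_width = 2.66 * mean(moving_ranges)
    return centre - half_width, centre, centre + half_width


def points_above(values, upper):
    """Positions of the values that lie above the upper control limit."""
    return [i for i, v in enumerate(values) if v > upper]


# Hypothetical data: a stable period followed by a rise at the last two points.
data = [10.1, 9.8, 10.2, 9.9, 10.0, 10.1, 9.9, 10.0, 12.1, 11.9]
lower, centre, upper = individuals_limits(data)
print(round(lower, 2), round(centre, 2), round(upper, 2))
print(points_above(data, upper))
```

Because the limits here are calculated from all of the data, including the points after the change, a large shift inflates both the average and the moving range; calculating the limits from a period known to be stable is a common alternative.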
