A plot of the data suggests that the lack of Normality may be partially due to outliers. These are values so extreme that
they are unlikely to belong with the other values. Tests for outliers are described in the NIST/SEMATECH e-Handbook.3 Grubbs' test identified two values, 230 and 241, as potential outliers. Both were much larger than the next highest value,
181. The data were reviewed and it was decided that these two values might have been recording errors that should have been
130 and 141. They were therefore removed from the data and the Anderson-Darling test was repeated. This showed that the distribution
of the remaining 60 values was not significantly different from a Normal. The mean and standard deviation were then calculated
from these 60 values and used to calculate the upper specification limit.
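As a rough illustration of the screening step, the following Python sketch computes the Grubbs' test statistic G = max|x – mean| / s for a small set of made-up readings. The data and the function name grubbs_statistic are assumptions for illustration only; the original values are not reproduced here. Deciding whether G is significant still requires a critical value for the chosen sample size and significance level, obtained as described in references such as the NIST/SEMATECH e-Handbook.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Return (G, suspect): the Grubbs' statistic and the most extreme value."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    # The suspect is the value farthest from the mean in either direction.
    suspect = max(values, key=lambda v: abs(v - m))
    return abs(suspect - m) / s, suspect


# Hypothetical readings, NOT the data discussed in the text.
readings = [118, 125, 131, 140, 127, 135, 122, 129, 138, 133, 126, 241]

g, suspect = grubbs_statistic(readings)
print(f"Most extreme value: {suspect}, G = {g:.3f}")
# Compare G with the critical value for n = len(readings) before flagging
# the value, and review the data before removing anything.
```

The test examines one suspect value at a time, so after a value is reviewed and removed the statistic would be recalculated on the remaining data.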
In this example, removing the outliers was very effective in dealing with a distribution that was significantly different
from a Normal. Values flagged as potential outliers should not be removed without a review of the data, particularly when
the data are from preproduction runs that may not fully represent the range to be found in the future. If it is decided not to
remove the outliers, or if no outliers are found, the pressure to produce a specification limit may force us to calculate
the mean and standard deviation even though the Anderson-Darling test shows that the distribution is significantly different
from a Normal. The resulting specification limit would have additional uncertainty but it could be used temporarily and recalculated
when more data become available.
Acceptance limits for the largest and smallest of Normally distributed measurements
Sometimes the rules for accepting batches are something like the following: "The mean of a sample of 30 parts should be between
LL and UL and also the largest value should not be more than X and the smallest value not less than Y." For example, the means
must be greater than 50 and less than 60 and also no individual part in a sample of 30 can be smaller than 47 or larger than
These rules require acceptance limits for the means and for individual parts. These limits are calculated separately using
the tolerance interval approach. The total standard deviation (σT) is used for the individual values and the standard deviation of the means (σM) is used for the means. Multipliers of 3.0 are typically used for both because several hundred parts are measured.
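A minimal sketch of the two calculations, assuming the overall mean, σT and σM have already been estimated from the data. The numerical values and the helper name three_sigma_limits are hypothetical and are not taken from the example above.

```python
def three_sigma_limits(center, sd, k=3.0):
    """Return (lower, upper) limits at center -/+ k standard deviations."""
    return center - k * sd, center + k * sd


# Hypothetical estimates, for illustration only.
overall_mean = 100.0
sigma_total = 4.0   # assumed total standard deviation (individual values)
sigma_means = 2.5   # assumed standard deviation of the sample means

# Limits for individual parts use the total standard deviation.
part_low, part_high = three_sigma_limits(overall_mean, sigma_total)
# Limits for sample means use the standard deviation of the means.
mean_low, mean_high = three_sigma_limits(overall_mean, sigma_means)

print(f"Individual parts: {part_low} to {part_high}")
print(f"Sample means:     {mean_low} to {mean_high}")
```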
We can use Excel to calculate the expected failure rate for each of the specification limits: NORMDIST(LSL, μ, σ, TRUE) for
the lower limit and 1 – NORMDIST(USL, μ, σ, TRUE) for the upper limit. If we use 3-sigma limits for LL and UL, both expected
failure rates are 0.0013 (0.13%).
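The same calculation can be sketched outside Excel. The Python example below uses NormalDist from the standard library in place of NORMDIST with the cumulative flag set to TRUE; the mean and standard deviation are hypothetical values chosen only to make the example run.

```python
from statistics import NormalDist

# Hypothetical process parameters, for illustration only.
mu, sigma = 55.0, 1.5
dist = NormalDist(mu, sigma)

# 3-sigma specification limits.
lsl = mu - 3 * sigma
usl = mu + 3 * sigma

# Equivalent of NORMDIST(LSL, mu, sigma, TRUE): chance of falling below LSL.
p_below = dist.cdf(lsl)
# Equivalent of 1 - NORMDIST(USL, mu, sigma, TRUE): chance of exceeding USL.
p_above = 1 - dist.cdf(usl)

print(f"P(below LSL) = {p_below:.4%}")
print(f"P(above USL) = {p_above:.4%}")
```

For 3-sigma limits the result does not depend on the particular mean and standard deviation chosen, since both limits sit three standard deviations from the mean.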
Since there is only one mean per batch, the limits for the means lead to an expected batch failure rate of 0.13%. However,
since each part in the sample of 30 has a 0.0013 chance of being above the upper limit we can calculate that the acceptance
limit for individual parts leads to an expected batch failure rate of 3.8%.
1 – (Chance that none of 30 parts are above the upper limit) = 1 – (1 – 0.0013)^30 = 0.038 (or 3.8%).
1 – (Chance that none of 30 parts are below the lower limit) = 1 – (1 – 0.0013)^30 = 0.038 (or 3.8%).
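The arithmetic can be checked with a few lines of Python. This sketch assumes, as the calculation above does, that the 30 parts in a sample behave as independent draws with the same per-part failure chance; the function name batch_failure_rate is introduced here for illustration.

```python
def batch_failure_rate(p_part, n_parts):
    """Chance that at least one of n_parts independent parts fails a limit."""
    return 1 - (1 - p_part) ** n_parts


p_part = 0.0013   # per-part chance of exceeding a 3-sigma limit
n_parts = 30      # parts measured per batch

p_batch = batch_failure_rate(p_part, n_parts)
print(f"Expected batch failure rate for one limit: {p_batch:.3f}")
```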
To have a batch failure rate of 0.13% based on the individual parts, we need to select values of UL and LL such that:
1 – (1 – (1 – ND(UL)))^30 = 0.0013 and 1 – (1 – ND(LL))^30 = 0.0013.
In these equations, ND(X) is used to denote the Excel statement NORMDIST(X, μ, σ, TRUE).
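One way to solve these equations is to work backwards: first find the per-part failure chance p that gives the target batch rate, p = 1 – (1 – 0.0013)^(1/30), and then invert the Normal distribution to find the limits. The Python sketch below follows that route using the standard library; the function name individual_limits and the values of μ and σ are assumptions for illustration.

```python
from statistics import NormalDist


def individual_limits(mu, sigma, n_parts=30, batch_rate=0.0013):
    """Return (LL, UL, p_part) so each limit gives the target batch failure rate."""
    # Per-part failure chance that compounds to batch_rate over n_parts parts.
    p_part = 1 - (1 - batch_rate) ** (1 / n_parts)
    dist = NormalDist(mu, sigma)
    ll = dist.inv_cdf(p_part)        # ND(LL) = p_part
    ul = dist.inv_cdf(1 - p_part)    # 1 - ND(UL) = p_part
    return ll, ul, p_part


# Hypothetical process parameters, for illustration only.
mu, sigma = 55.0, 1.5
ll, ul, p_part = individual_limits(mu, sigma)
z = (ul - mu) / sigma  # multiplier implied by the upper limit

print(f"Per-part failure chance: {p_part:.6f}")
print(f"LL = {ll:.2f}, UL = {ul:.2f}, multiplier = {z:.2f}")
```

Under these assumptions the implied multiplier falls between 3.9 and 4.0 standard deviations, noticeably wider than the 3.0 used for the means.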