ABSTRACT
When specification limits are set at plus or minus two standard deviations, production lots and tests will be rejected at random
about five percent of the time, purely as a result of statistical variation. Techniques based on sound statistical reasoning were developed
to deal with out-of-specification (OOS) test results. The temptation to bend the rules and lower the reject rate led to abuses,
however. The most common of these was to test a sample repeatedly until a passing result was produced. In 1993, Barr Laboratories
lost a lawsuit on this and related points and the judge's decision led to new interpretations of FDA rules, including the
requirement that an investigation be initiated before a replicate sample can be tested. These rules and others incorporated
into FDA guidance documents reflect a misunderstanding of important statistical principles.

Dealing with out-of-specification (OOS) test results has been a general manufacturing concern for more than 80 years [1]. It arises because it is statistically plausible that five percent of lots and tests will fall outside accepted limits, even
if the product actually meets specifications.
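The five percent figure can be checked with a short calculation. The sketch below is only an illustration: it assumes that test results are normally distributed and that the limits sit symmetrically at a fixed number of standard deviations around the process mean. The function name `fraction_outside` is hypothetical and was chosen for this example.

```python
import math


def fraction_outside(k_sigma: float) -> float:
    """Two-sided probability that a normally distributed result
    falls more than k_sigma standard deviations from the mean."""
    # For a standard normal Z, P(|Z| > k) = erfc(k / sqrt(2)).
    return math.erfc(k_sigma / math.sqrt(2.0))


# Limits at +/- 2 standard deviations (assumed normal distribution).
print(f"{fraction_outside(2.0):.4f}")  # prints 0.0455
```

Under these assumptions the random rejection rate is about 4.6 percent, which is the origin of the rounded five percent figure; exactly five percent corresponds to limits at about 1.96 standard deviations.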
The bigger problem is that many manufacturers have incorrectly applied retesting procedures and averages. Such erroneous application
of statistical methods is probably due, in some cases, to poor training in mathematics and, in others, to unethical efforts to avoid
discarding lots. The most significant abuse of statistical methods has been to test lots repeatedly until a sample falls within
the specification range, and then to accept the lot based on that one passing result. This method is known as "testing into compliance."
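A small calculation shows why testing into compliance is an abuse. The sketch below assumes that repeated tests are independent and that each test has the same chance of passing; the 30 percent pass chance is a hypothetical value chosen only for illustration, and the function name `chance_of_a_pass` is likewise hypothetical.

```python
def chance_of_a_pass(p_pass: float, n_tests: int) -> float:
    """Probability that at least one of n_tests independent tests passes,
    given a per-test pass probability p_pass."""
    # Complement of "every one of the n tests fails".
    return 1.0 - (1.0 - p_pass) ** n_tests


# Hypothetical lot that passes a single test only 30 percent of the time.
for n in range(1, 6):
    print(n, round(chance_of_a_pass(0.30, n), 3))
```

With these assumed numbers, a lot that fails a single test 70 percent of the time will still yield at least one passing result in about 83 percent of cases once five tests are allowed, so a single passing result obtained this way says little about the lot.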
This approach to OOS results became a major issue following the 1993 lawsuit between the US government and Barr Laboratories [2]. Peculiar judicial conclusions and subsequent US Food and Drug Administration (FDA) actions turned a minor quality control (QC) problem
into a major one. In this article, we trace this history with an emphasis on the 15 years since the Barr
Decision. A key part of the story is that poor training in mathematics and a lack of statistical thinking combine to confuse
workers.
BACKGROUND
Before discussing the history of the out-of-specification (OOS) problem, it is useful to examine some basic tenets underlying
lot release testing and the use of statistics.
All Measurements are Approximate
Scientists realize that all measurements are uncertain at some level and are taught that the standard deviation is the statistic
that quantifies this uncertainty. For the pharmaceutical analyst, this idea is very important when making quality
control (QC) measurements because the analyst must balance the cost of making measurements against the needed level of certainty.
Unlike their counterparts in academia, industrial QC analysts are not expected to produce test results that are accurate and
precise to the maximum possible number of significant figures. In most cases, the analyst's supervisors will not
provide the equipment or the time to make measurements of that type, but will only provide what is necessary to determine
if a product lot meets specifications.
Of course, the occurrence of OOS results also raises the question of whether the specifications themselves have been properly
set. If the specifications are set improperly, we will consistently see OOS results, because the manufacturing process itself
cannot meet the specifications that were set for it. This article does not deal with such circumstances, however; the OOS
problem addressed here applies to stable and controlled processes with realistic requirements, in which an OOS result is a
rare event.
Variability Can be Measured
The experienced QC scientist knows, when setting specifications, that individual units of a product will vary because of process
variations that affect both samples and whole lots. In addition, variation in the test method itself is layered on top of
process variations. Therefore, the result of a single test is affected by multiple sources of variation, and may be misleading
unless the degree of variation arising from the different sources is understood. That is why a specification is expressed as a range rather than a single value.
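The layering of test-method variation on top of process variation can be sketched numerically. The example below assumes the two sources are independent, so that their variances add; the standard deviation values are hypothetical and the function name `combined_sd` was invented for this illustration.

```python
import math


def combined_sd(process_sd: float, method_sd: float) -> float:
    """Standard deviation of a single test result when independent
    process and test-method variation both contribute."""
    # Independent variances add: total^2 = process^2 + method^2.
    return math.hypot(process_sd, method_sd)


# Hypothetical values, in the same units as the test result.
print(round(combined_sd(1.5, 0.8), 2))  # prints 1.7
```

Under this assumption the spread of a single result is always wider than either source alone, which is one reason a single test can mislead when the individual contributions are unknown.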