Steven Walfish

This article is the second in a four-part series on essential statistical techniques for any scientist or engineer working
in the biotechnology field. This installment presents statistical methods for comparing sample means, including how to establish
the correct sample size for testing these differences. The differences among the one-sample t-test, the two-sample t-test, and
the z-test are also explored.
HYPOTHESIS TESTING
In hypothesis testing, we must state the assumed value of the population parameter, called the null hypothesis. The goal of
hypothesis testing is to determine whether the sample data are consistent with the population of interest. You either have
sufficient evidence to reject the null hypothesis or you fail to reject it; you never prove it. The p-value is the probability
of observing data at least as extreme as the sample if the null hypothesis is true, and the significance level is the cutoff
against which the p-value is compared. Statisticians usually use a significance level of 0.05. In other words, a p-value less
than 0.05 is sufficient evidence to reject the null hypothesis. Typically, the null hypothesis is a statement about the value
of the population parameter. For example, μ = 100 versus μ ≠ 100. A one-sided test means the alternative hypothesis lies in one
direction only: either less than or greater than the hypothesized value. A two-sided test means the alternative hypothesis
covers both directions: less than or greater than the hypothesized value.
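The one-sided and two-sided cases can be written out explicitly. The sketch below uses μ₀ as an assumed symbol for the hypothesized value (100 in the example above); the H₀ and H₁ notation is a common convention rather than something taken from this article.

```latex
\begin{align*}
\text{Two-sided:} \quad & H_0: \mu = \mu_0 \quad \text{versus} \quad H_1: \mu \neq \mu_0 \\
\text{One-sided (greater):} \quad & H_0: \mu \le \mu_0 \quad \text{versus} \quad H_1: \mu > \mu_0 \\
\text{One-sided (less):} \quad & H_0: \mu \ge \mu_0 \quad \text{versus} \quad H_1: \mu < \mu_0
\end{align*}
```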
ONE-SAMPLE T-TEST
The one-sample t-test is used to compare a sample mean to a hypothesized population mean. The hypothesis can be either a one-sided
or two-sided test. Usually, the population variance is unknown, requiring use of the t-distribution, which takes into account
the uncertainty in estimating the population variance from the sample. The t-distribution is tabled by confidence level and
degrees of freedom. For the one-sample t-test, the degrees of freedom are the number of observations used to estimate the sample
standard deviation minus one. The formula for the one-sample t-test is as follows:

$$t^* = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

in which $\bar{X}$ is the sample mean, μ is the theoretical population mean, s is the sample standard deviation, and n is the
sample size used to estimate the mean and standard deviation.
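The one-sample t statistic described above is the difference between the sample mean and μ, divided by the standard error s/√n. A minimal Python sketch of that calculation follows. The function name `one_sample_t` and the vial values are hypothetical and chosen only for illustration; they are not the Table 1 data.

```python
import math
import statistics


def one_sample_t(values, mu):
    """Return (t_star, degrees_of_freedom) for a one-sample t-test."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    x_mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("sample standard deviation is zero; t* is undefined")
    t_star = (x_mean - mu) / (s / math.sqrt(n))
    return t_star, n - 1


# Hypothetical protein concentrations for six vials (NOT the Table 1 data).
vials = [30.2, 29.8, 30.5, 30.1, 29.9, 30.4]
t_star, df = one_sample_t(vials, mu=30.0)
print(f"t* = {t_star:.3f}, df = {df}")
```

The degrees of freedom returned are n minus one, matching the rule stated above for looking up the tabled value.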
Table 1. An example of a two-sided one-sample t-test for protein concentration. The hypothesis is that the lot is not statistically
different than 30 (μ = 30).

For a two-sided test, if the absolute value of t* is greater than the tabled value from the t-distribution, the sample mean is
statistically different than the population mean (μ). An example of a one-sample t-test would be comparing protein concentration
for a particular batch to a theoretical protein concentration. Table 1 shows an example of a two-sided one-sample t-test for
protein concentration. The hypothesis is that the lot is not statistically different than 30 (μ = 30). The mean of the six vials
was not statistically different than the theoretical value of 30 (p = 0.223). The t* of 1.355 did not exceed the tabled value
of 2.571 for a 95% confidence level with five degrees of freedom.
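The decision rule for this example can be sketched in a few lines of Python. The values 1.355 and 2.571 are the ones reported in the text; the function name `is_significant_two_sided` is a hypothetical helper, and the absolute value reflects the assumption that the test is two-sided.

```python
def is_significant_two_sided(t_star, t_critical):
    """Two-sided rule: reject the null hypothesis when |t*| exceeds the tabled value."""
    return abs(t_star) > t_critical


T_STAR = 1.355      # t* reported in the article for the six vials
T_CRITICAL = 2.571  # tabled value: 95% confidence level, five degrees of freedom

reject = is_significant_two_sided(T_STAR, T_CRITICAL)
print("reject the null hypothesis" if reject else "fail to reject the null hypothesis")
```

Because the rule uses the absolute value, a large negative t* (a sample mean well below μ) leads to rejection just as a large positive one does.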