Steven Walfish

This article is the second in a four-part series on essential statistical techniques for any scientist or engineer working in the biotechnology field. This installment presents statistical methods for comparing sample means, including how to establish the correct sample size for testing these differences. The differences among one-sample, two-sample, and z-tests are also explored.
HYPOTHESIS TESTING
In hypothesis testing, we must state the assumed value of the population parameter, called the null hypothesis. The goal of hypothesis testing is to determine whether the sample data are consistent with the population of interest. You either have sufficient evidence to reject the null hypothesis or you fail to reject it; you do not prove it. The p-value is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true, and the significance level is the cutoff against which the p-value is compared. Statisticians usually use a significance level of 0.05 as the cutoff for statistical significance. In other words, a p-value less than 0.05 is sufficient evidence to reject the null hypothesis. Typically, the null hypothesis is a statement about the value of the population parameter. For example, μ = 100 versus μ ≠ 100. A one-sided test means the alternative hypothesis points in one direction only, either less than or greater than the hypothesized value. A two-sided test means the alternative hypothesis covers both directions, less than and greater than.
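The difference between one-sided and two-sided tests can be illustrated with a short sketch. It is only an illustration under stated assumptions: the test statistic is assumed to follow a standard normal distribution (as in a z-test), the value 1.8 is hypothetical, and the function name `p_values` is invented for this example.

```python
from statistics import NormalDist

ALPHA = 0.05  # significance level used as the cutoff


def p_values(z):
    """Return one-sided and two-sided p-values for a z statistic.

    Assumes the statistic follows a standard normal distribution
    under the null hypothesis (an assumption made for illustration).
    """
    std_normal = NormalDist()            # mean 0, standard deviation 1
    lower = std_normal.cdf(z)            # alternative: parameter is less than
    upper = 1.0 - lower                  # alternative: parameter is greater than
    two_sided = 2.0 * min(lower, upper)  # alternative: parameter is not equal
    return {"less": lower, "greater": upper, "two_sided": two_sided}


# Hypothetical test statistic, not taken from the article.
p = p_values(1.8)
for alternative, value in p.items():
    decision = "reject" if value < ALPHA else "fail to reject"
    print(f"{alternative}: p = {value:.4f} -> {decision} the null hypothesis")
```

With this hypothetical statistic, the one-sided "greater than" test falls below the 0.05 cutoff while the two-sided test does not, which is why the direction of the test should be chosen before looking at the data.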
ONE-SAMPLE T-TEST
The one-sample t-test is used to compare a sample mean to a hypothesized population mean. The hypothesis can be either one-sided or two-sided. Usually, the population variance is unknown, which requires use of the t-distribution; it takes into account the uncertainty in estimating the population variance from the sample. The t-distribution is tabled by confidence level and degrees of freedom. For the one-sample t-test, the degrees of freedom are the number of observations used to estimate the sample standard deviation minus one. The formula for the one-sample t-test is as follows:
t* = (Xmean - μ) / (s / √n)
in which Xmean is the sample mean, μ is the theoretical population mean, s is the sample standard deviation, and n is the sample size used to estimate the mean and standard deviation.
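As a minimal sketch, the one-sample t statistic can be computed from the quantities just defined: the difference between the sample mean and μ, divided by s over the square root of n. The function name `one_sample_t` and the measurements below are hypothetical; they are not the data behind Table 1.

```python
import math
import statistics


def one_sample_t(values, mu):
    """Return (t_star, degrees_of_freedom) for a one-sample t-test."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    x_mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    if s == 0:
        raise ValueError("all observations are identical; t is undefined")
    t_star = (x_mean - mu) / (s / math.sqrt(n))
    return t_star, n - 1


# Hypothetical measurements, chosen only to illustrate the arithmetic.
sample = [29.0, 30.0, 31.0, 32.0, 33.0]
t_star, df = one_sample_t(sample, mu=30.0)
print(f"t* = {t_star:.3f} with {df} degrees of freedom")
```

The returned degrees of freedom (n minus one) are what would be used to look up the tabled value of the t-distribution.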
Table 1. An example of a two-sided one-sample t-test for protein concentration. The hypothesis is that the lot is not statistically different from 30 (μ = 30).

If the absolute value of t* is greater than the tabled value from the t-distribution, the sample mean is statistically different from the population mean (μ). An example of a one-sample t-test would be comparing protein concentration for a particular batch to a theoretical protein concentration. Table 1 shows an example of a two-sided one-sample t-test for protein concentration. The hypothesis is that the lot is not statistically different from 30 (μ = 30). The mean of the six vials was not statistically different from the theoretical value of 30 (p = 0.223). The t* of 1.355 did not exceed the tabled value of 2.571 for a 95% confidence level with five degrees of freedom.
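The decision rule for this example can be written out as a short sketch. It uses only the two numbers reported above (t* = 1.355 and the tabled value 2.571); the function name `is_significant` is hypothetical, and the tabled value is entered by hand rather than computed.

```python
def is_significant(t_star, t_critical):
    """Two-sided decision rule: reject the null hypothesis when the
    absolute value of t* exceeds the tabled critical value."""
    return abs(t_star) > t_critical


t_star = 1.355      # test statistic reported for the six vials
t_critical = 2.571  # tabled value, 95% confidence, five degrees of freedom

reject_null = is_significant(t_star, t_critical)
if reject_null:
    print("The lot mean is statistically different from 30.")
else:
    print("The lot mean is not statistically different from 30.")
```

Taking the absolute value matters for a two-sided test: a sample mean well below 30 would give a large negative t*, which should also lead to rejecting the null hypothesis.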