Presenting Data Effectively
The key to a good graphical presentation is to select the method that best fits the data.
 Feb 1, 2008 BioPharm International Volume 21, Issue 2

QUANTITATIVE DATA

Quantitative, or continuous, data are preferred because of their ability to estimate and predict the true population values. Qualitative data can also be used to estimate and predict the true population values, but they typically require larger sample sizes to accomplish the task.

Several summary statistics are used with quantitative data. The most common is the mean, or average, of the data. Another estimate of central tendency is the median, or 50th percentile of the data. Although the mean is the most widely used, it is not appropriate for highly skewed distributions and is less efficient than other measures of central tendency when extreme scores are possible. The median is useful because its meaning is clear and it is more efficient than the mean in highly skewed distributions. The geometric mean is another good estimate of central tendency if all the values are positive and the distribution has a positive skew. It is computed by taking the average of the logarithms of all the values and raising the base of the logarithm to that average.

If the distribution is positively skewed, the mean will be larger than the median; if it is negatively skewed, the mean will be smaller than the median. When a distribution is symmetrical, the mean and the median are equal.
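As a rough illustration of these three measures, the sketch below computes the mean, median, and geometric mean for a small, positively skewed sample. The data values and the helper name geometric_mean are hypothetical and chosen only for illustration; natural logarithms are assumed, although any base gives the same result.

```python
import math
import statistics

# Hypothetical, positively skewed sample (illustrative values only).
values = [1.0, 2.0, 2.0, 3.0, 4.0, 5.0, 20.0]


def geometric_mean(data):
    """Average the natural logs, then raise the log base (e) to that average."""
    if any(x <= 0 for x in data):
        raise ValueError("geometric mean requires all values to be positive")
    mean_log = sum(math.log(x) for x in data) / len(data)
    return math.exp(mean_log)


mean = statistics.mean(values)      # pulled upward by the extreme value 20
median = statistics.median(values)  # middle value, unaffected by the extreme
gmean = geometric_mean(values)      # falls between the median and the mean here

print(f"mean={mean:.2f}  median={median:.2f}  geometric mean={gmean:.2f}")
```

For this sample the mean is larger than the median, as expected for a positive skew. Python 3.8 and later also provide statistics.geometric_mean, which could replace the hand-written helper.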

The standard deviation, the square root of the variance, is by far the most widely used measure of spread. The variance is the average squared deviation from the mean of the data. A key point to remember is that variances can be averaged, but standard deviations cannot: to combine the spread of several groups, average their variances and then take the square root of the result.
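A minimal sketch of why this matters, using two hypothetical lots of equal size. It assumes the sample variance (n - 1 divisor), which is what statistics.variance computes; with unequal group sizes the variances would need to be weighted by their degrees of freedom.

```python
import math
import statistics

# Two hypothetical lots of equal size (illustrative values only).
lot_a = [9.0, 10.0, 11.0]   # sample variance 1.0, SD 1.0
lot_b = [7.0, 10.0, 13.0]   # sample variance 9.0, SD 3.0

var_a = statistics.variance(lot_a)
var_b = statistics.variance(lot_b)

# Appropriate: average the variances, then take the square root.
pooled_sd = math.sqrt((var_a + var_b) / 2)

# Not appropriate: average the standard deviations directly.
naive_sd = (statistics.stdev(lot_a) + statistics.stdev(lot_b)) / 2

print(f"pooled SD = {pooled_sd:.3f}, naive average of SDs = {naive_sd:.3f}")
```

Here the pooled standard deviation is the square root of 5 (about 2.24), while directly averaging the two standard deviations gives 2.0, which understates the combined spread.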

The range is another estimate of the dispersion of the data, but it takes into account only two scores, the maximum and minimum values. A very handy method for comparing variability is the coefficient of variation (CV), sometimes called the relative standard deviation (RSD). The CV is the standard deviation divided by the mean, usually expressed as a percentage. It measures variability in relation to the mean and is used to compare the relative dispersion in one type of data with the relative dispersion in another type of data. The data to be compared may be in the same units, in different units, with the same mean, or with different means.
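The sketch below shows the CV calculation on two hypothetical data sets measured in different units. The function name coefficient_of_variation and the values are assumptions made for illustration, and the sample standard deviation (n - 1 divisor) is assumed.

```python
import statistics


def coefficient_of_variation(data):
    """Return the CV (RSD) as a percentage: 100 * SD / mean."""
    mean = statistics.mean(data)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(data) / mean


# Hypothetical results in different units (illustrative values only).
potency_pct = [98.0, 100.0, 102.0]        # percent of label claim
fill_weight_mg = [490.0, 500.0, 510.0]    # milligrams

cv_potency = coefficient_of_variation(potency_pct)
cv_weight = coefficient_of_variation(fill_weight_mg)

print(f"potency CV = {cv_potency:.1f}%, fill weight CV = {cv_weight:.1f}%")
```

The two standard deviations differ by a factor of five (2 versus 10), yet both data sets have the same relative dispersion, a CV of 2%, which is the kind of comparison the CV is meant to support.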

There are several methods to graphically display quantitative data. The most common methods include the line plot, the box-and-whisker plot, and the histogram.

A PICTURE IS WORTH A THOUSAND WORDS

 Figure 3. An example of a line plot. The horizontal line is the mean of the 30 lots.
Graphing data makes it easier to see patterns in the data and to confirm assumptions about the distribution of the results. A line plot is a two-dimensional plot of data, usually over time, used to detect trends in the data. Line plots are used in conjunction with other statistical techniques, such as control charts for process control. A control chart is a line plot with statistical limits set at ±3 standard deviations from the mean. If the data follow a normal distribution, about 99.7% of the values should fall within these limits. Figure 3 is an example of a line plot. The horizontal line is the mean of the 30 lots.
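The control limits described above can be sketched as follows. This is a simplified illustration that uses the overall sample standard deviation, as the text describes; control charts used in practice often estimate the standard deviation in other ways (for example, from moving ranges). The function names and lot values are hypothetical.

```python
import statistics


def control_limits(data, n_sigma=3.0):
    """Return (lower limit, center line, upper limit) at mean +/- n_sigma * SD."""
    center = statistics.mean(data)
    sd = statistics.stdev(data)
    return center - n_sigma * sd, center, center + n_sigma * sd


def points_outside(data, lower, upper):
    """Return (index, value) pairs for points falling outside the limits."""
    return [(i, x) for i, x in enumerate(data) if x < lower or x > upper]


# Hypothetical historical lot results used to set the limits.
baseline = [100.0, 101.0, 99.0, 100.0, 102.0, 98.0, 100.0, 101.0, 99.0, 100.0]
lower, center, upper = control_limits(baseline)

# Hypothetical new lots checked against those limits.
new_lots = [101.0, 104.5, 99.0]
flagged = points_outside(new_lots, lower, upper)

print(f"limits: {lower:.2f} to {upper:.2f}, center {center:.2f}")
print("points outside limits:", flagged)
```

The limits are computed from the historical lots and then applied to new results, so that an unusual new value cannot inflate the standard deviation used to judge it.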

 Figure 4. An example of a box plot
A box-and-whisker plot, or simply a box plot, is a graphical representation of the dispersion of the data. Figure 4 shows the lower quartile (Q1), upper quartile (Q3), and median. The box includes the range of scores falling into the middle 50% of the distribution. The whiskers, i.e., the vertical lines extending from the box, usually reach no farther than 1.5 times the interquartile range (Q3–Q1) beyond the box. Points that fall outside the whiskers are usually candidates for outlier analysis. The box plot can also be used to compare different lots or batches. A t-test would be used to statistically compare the means of two different lots. If you have more than two lots to compare, a one-way analysis of variance (ANOVA) would be used.
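The quantities behind a box plot can be computed as in the sketch below. Quartile conventions differ between software packages; this example assumes the "inclusive" method of Python's statistics.quantiles, so other tools may report slightly different Q1 and Q3 values. The function name box_plot_summary and the data are hypothetical.

```python
import statistics


def box_plot_summary(data, whisker_factor=1.5):
    """Return quartiles, whisker ends, and outlier candidates for a box plot."""
    ordered = sorted(data)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers end at the most extreme points still inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [x for x in ordered if x < lower_fence or x > upper_fence],
    }


# Hypothetical results with one unusually high value (illustrative only).
results = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0]
summary = box_plot_summary(results)

for name, value in summary.items():
    print(f"{name}: {value}")
```

With these values the interquartile range is 4.5, the upper fence sits at 14.5, and the value 30 is flagged as a candidate for outlier analysis.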