How To: Statistics – One Sample t-Tests
One thing that regularly stumps scientists is the handling of data. We seem to be very good at generating obscene amounts of it, but representing it meaningfully can be a little off-putting, unless you happen to be a bioinformatician. Let’s dip our toes in with a simple one-sample t-test to see how we can easily incorporate statistical analysis into our work.
Of course, the calculations involved can be done on a simple calculator, but your task will be made much easier with spreadsheet software (Excel) or the more specialized tools available at most high schools and universities (Minitab, Prism, SigmaPlot, etc.).
Each has its own tutorials on carrying out these tests, so this article is not heavily technical and instead focuses on the correct application of statistical testing.
Table of Statistical Analysis
|Test||When To Use||An Example|
|1 sample t-test||Tests if the mean of a single population is equal to the hypothesized value||A lecturer claims that the mean time taken to complete a quiz is 1 hour. From a sample data set, can we reject this claim?|
|2 sample t-test||Tests if the difference between means of two independent populations is equal to a hypothesized value||Does the mean quiz score of female students differ significantly from the mean quiz score of male students?|
|paired t-test||Tests if the difference between means of dependent or paired observations is equal to a hypothesized value||The mean response time of adults before and after they have consumed alcohol. Is the difference significant enough to conclude that alcohol affects response time?|
|ANOVA||Tests for statistical difference among means for more than two populations||Studying the effectiveness of three types of pain reliever: aspirin vs. Tylenol vs. ibuprofen|
When to Use a t-Test?
A t-test is a form of hypothesis testing that uses a set of sample data to test a hypothesis for the entire population. It is used when the population standard deviation (σ) is unknown and the sample size is small (n<30). In real-world samples, we don’t usually have a basis for knowing σ.
Since the t-distribution approaches the normal distribution (bell curve) as the sample size grows, the usual rule of thumb is:
- If σ is known, use the normal distribution.
- If σ is unknown:
  - If n ≥ 30, use the normal distribution as an approximation.
  - If n < 30, use a t-distribution.
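The rule of thumb above can be written as a small function. This is only a sketch: the function name `choose_distribution` is hypothetical, and it assumes a sample of exactly 30 counts as "large".

```python
def choose_distribution(sigma_known, n):
    """Return which distribution the rule of thumb suggests.

    sigma_known: True if the population standard deviation is known.
    n: sample size. Treating n == 30 as "large" is an assumption.
    """
    if sigma_known:
        return "normal"
    return "normal" if n >= 30 else "t"


# A small sample with unknown sigma calls for the t-distribution.
print(choose_distribution(sigma_known=False, n=7))  # prints "t"
```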
A one-sample t-test can determine whether μ (mu, the population mean) is equal to a hypothesized mean. The test uses s (sample standard deviation) to estimate σ (sigma, the population standard deviation). If the difference between x̅ (x bar, sample mean) and the hypothesized mean is large relative to s, then the means are unlikely to be equal.
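To make the comparison concrete, the t-statistic divides the gap between x̅ and the hypothesized mean by the standard error, s/√n. The sketch below uses only Python’s standard library. The quiz times are made-up illustrative values (in hours), not data from this article, and the function name is hypothetical.

```python
import math
import statistics


def one_sample_t_statistic(sample, hypothesized_mean):
    """t = (x_bar - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    standard_error = s / math.sqrt(n)
    return (x_bar - hypothesized_mean) / standard_error


# Hypothetical completion times in hours for 7 students.
quiz_times = [1.1, 0.9, 1.3, 1.0, 1.2, 0.8, 1.4]
t_stat = one_sample_t_statistic(quiz_times, hypothesized_mean=1.0)
print(f"t = {t_stat:.3f}")
```

A t-statistic far from zero means the sample mean sits many standard errors away from the hypothesized mean.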
The confidence level is usually chosen before hypothesis testing. With a 95% confidence interval for μ, you can be 95% confident that the returned range of values contains μ.
Generally, confidence intervals of 95% are used unless otherwise stated. Sometimes α (alpha, the significance level) is used to describe this (for a 95% CI, α = 0.05).
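As a rough illustration, a 95% confidence interval for a small sample can be computed as x̅ ± t* × s/√n, where t* is the two-sided critical value of the t-distribution for n − 1 degrees of freedom. In the sketch below the critical values are copied from a standard t-table, the quiz times are hypothetical, and the function name is an assumption made for this example.

```python
import math
import statistics

# Two-sided 95% critical values of the t-distribution from a standard t-table,
# keyed by degrees of freedom (n - 1). Only a few rows are included here.
T_CRITICAL_95 = {5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def confidence_interval_95(sample):
    """Return (lower, upper) limits of the 95% CI for the population mean."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = T_CRITICAL_95[n - 1] * standard_error
    return x_bar - margin, x_bar + margin


# Hypothetical completion times in hours for 7 students.
quiz_times = [1.1, 0.9, 1.3, 1.0, 1.2, 0.8, 1.4]
lower, upper = confidence_interval_95(quiz_times)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

If the hypothesized mean falls inside this interval, a two-sided test at α = 0.05 will not reject it.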
It is important to note that using the t-test for hypothesis testing requires certain assumptions about the data being analyzed. If these assumptions are not met, the conclusions drawn from the test may not be valid. The assumptions for a one-sample t-test are:
- The sample must be random.
- Sample data must be continuous.
- Sample data should be normally distributed (although this assumption is less critical when the sample size is 30 or more).
Example: One-Sample t-Test
Suppose you want to determine whether the mean time for completing an online quiz is statistically different from the lecturer’s claim of 1 hour. μ, in this case, represents the mean time taken by the entire cohort of students to finish the quiz. However, our sample consists of only 7 students.
We are interested in whether μ is equal to 1 hour or not. The possibilities can therefore be captured by two hypotheses:
- The null hypothesis (H0): μ is equal to 1 hour.
- The alternative hypothesis (H1): μ is not equal to 1 hour.
Running the test in software yields several key values, such as the sample mean, sample standard deviation, confidence interval, T-statistic and p-value. The output for a sample data set (n = 7) is shown below:
Test of μ = 1 vs μ ≠ 1 for n = 7
|Variable||n||Mean||St Dev||95% CI (lower, upper limit)||T-statistic||p-value|
The key parameter here is the p-value (probability value). It answers the question ‘If the null hypothesis is true, what is the probability of obtaining a sample mean at least as far from the hypothesized mean as the one observed, taking into account sample size and standard deviation?’
If the p-value is larger than α = 0.05, the null hypothesis cannot be rejected. Here the p-value is 0.089, and 0.089 > 0.05. Therefore, we do not have enough evidence to conclude that the lecturer’s claim of the online quiz taking 1 hour is false.
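For readers who want to see the whole test in one place, here is a minimal sketch using only Python’s standard library. It gets the two-sided p-value by numerically integrating the t-distribution’s density with Simpson’s rule, which is a stand-in for what statistical packages do internally. The quiz times are hypothetical and are not the data behind the p-value of 0.089 quoted above. All function names are assumptions made for this example.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom.

    Uses math.gamma, so it is only suitable for small-to-moderate df.
    """
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # area between -|t| and +|t|
    return 1 - central_area


def one_sample_t_test(sample, hypothesized_mean, alpha=0.05):
    """Return (t-statistic, p-value, whether H0 is rejected at level alpha)."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - hypothesized_mean) / standard_error
    p_value = two_sided_p_value(t_stat, df=n - 1)
    return t_stat, p_value, p_value < alpha


# Hypothetical completion times in hours for 7 students.
quiz_times = [1.1, 0.9, 1.3, 1.0, 1.2, 0.8, 1.4]
t_stat, p_value, reject_h0 = one_sample_t_test(quiz_times, hypothesized_mean=1.0)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

In practice a statistics package is the safer choice. If SciPy is installed, `scipy.stats.ttest_1samp(quiz_times, 1.0)` should report the same t-statistic and p-value and can serve as a cross-check.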