Suppose we have gathered independent samples from more than two populations and we wish to test the null hypothesis that all population means are equal against the alternative that they are not all equal. The appropriate test is ANOVA (Analysis of Variance).
This page provides some R commands to help us through the ANOVA worksheet.
To import the stress data set, run this command:
stress <- read.csv("https://mphitchman.com/stats/data/stress.csv")
Click on the stress data frame in the Environment tab to get a look at how the data has been formatted. Having the data in this long form (as opposed to having separate columns for each of the three groups) makes it much easier to use R commands to run ANOVA and visualize the data.
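To illustrate the difference between the two layouts, here is a minimal sketch that reshapes a small wide table (one column per group) into long form using pivot_longer from the tidyr package. The values and the group names group_a, group_b, group_c are made up for illustration and are not taken from the stress data set; only the column names treatment and heart.rate mirror the ones used on this page.

```r
library(tidyr)

# Made-up wide data: one column per group (NOT the real stress data)
wide <- data.frame(
  group_a = c(70, 75, 80),
  group_b = c(85, 90, 88),
  group_c = c(72, 78, 74)
)

# Reshape to long form: one row per observation,
# with one column naming the group and one holding the measurement
long <- pivot_longer(wide,
                     cols = everything(),
                     names_to = "treatment",
                     values_to = "heart.rate")
long
```

In the long version, a single formula such as heart.rate ~ treatment describes the whole comparison, which is why the plotting and ANOVA commands on this page expect this layout.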
The code below makes use of packages in the tidyverse, so be sure to load it into your session:
library(tidyverse)
The following code makes boxplots of the heart rates for each of the three treatments and colors them by treatment. Feel free to add axis labels and a plot title.
ggplot(stress, aes(y = heart.rate, x = treatment)) +
  geom_boxplot(aes(fill = treatment), show.legend = FALSE) +
  ggtitle("Plot title would go here")
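As a sketch of how axis labels and a title can be added in one step, the example below uses labs() from ggplot2. So that it runs on its own, it uses PlantGrowth, a data set that ships with R (plant weights for three groups), rather than the stress data; the label text is only a placeholder.

```r
library(ggplot2)

# PlantGrowth ships with R: plant weights for three groups (ctrl, trt1, trt2)
p <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot(aes(fill = group), show.legend = FALSE) +
  labs(x = "Treatment group",                      # x-axis label
       y = "Dried plant weight",                   # y-axis label
       title = "Plant weight by treatment group")  # plot title
p
```

For the stress data, the same labs() layer can be added to the boxplot code above in place of ggtitle().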
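The following code computes the sample size, mean, and standard deviation of the heart rates for each treatment:
stress %>%
  group_by(treatment) %>%
  summarise(sample_size = length(heart.rate),
            mean = mean(heart.rate),
            stdev = sd(heart.rate))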
The following command runs the ANOVA and prints the ANOVA table:
anova(lm(heart.rate ~ treatment, data = stress))
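To connect the ANOVA table to the formulas, here is a minimal sketch that computes the F statistic and p-value by hand and compares them with the values stored in the table. So that it runs without a download, it uses R's built-in PlantGrowth data (weight measured in three groups) instead of the stress data; the variable names ss_between, ss_within, f_stat, and p_value are just illustrative choices.

```r
y <- PlantGrowth$weight   # response
g <- PlantGrowth$group    # factor with three levels

k <- nlevels(g)           # number of groups
N <- length(y)            # total sample size

group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_sizes * (group_means - mean(y))^2)
ss_within  <- sum((y - ave(y, g))^2)  # ave() returns each observation's group mean

# F statistic and its p-value from the F distribution with (k - 1, N - k) df
f_stat  <- (ss_between / (k - 1)) / (ss_within / (N - k))
p_value <- pf(f_stat, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

# The same two numbers pulled out of the ANOVA table
tab <- anova(lm(weight ~ group, data = PlantGrowth))
c(by_hand = f_stat, from_table = tab$`F value`[1])
c(by_hand = p_value, from_table = tab$`Pr(>F)`[1])
```

The same approach should carry over to the stress data by replacing weight with heart.rate and group with treatment, provided treatment is first converted to a factor.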