Are older actors celebrated more than older actresses in Hollywood? Do older actors find more (serious, Oscar-worthy) roles available to them than older actresses? One way to investigate these questions, perhaps, is to compare the age of best actor winners at the Oscars to the age of best actress winners.
On this page we look at the raw data, and use RStudio to help us conduct two different hypothesis tests. In class we will discuss which one seems more appropriate, and what that test tells us about our initial questions, if anything.
I encourage you to follow along in your own RStudio session by copying and pasting the code that appears below:
df <- read.csv("https://mphitchman.com/stats/data/oscars.csv")
The above code will import the data frame (which we call df) into your RStudio session. By clicking on df in your Environment tab you can inspect the data set. We have 194 rows and 8 columns. There have been 96 Oscar Awards Ceremonies, and each category has had one tie, so the 194 rows contain the 97 Best Actor Award Winners and the 97 Best Actress Award Winners.
Who won these awards during the 96th Oscars Award show?
df[df$ceremony==96,]
## ceremony name Film award date_of_award years days
## 97 96 Cillian Murphy Oppenheimer best actor 10-Mar-24 47 290
## 194 96 Emma Stone Poor Things best actress 10-Mar-24 35 126
## age_at_award
## 97 47.79
## 194 35.34
Cillian Murphy and Emma Stone! Notice that Emma Stone was more than 12 years younger than Cillian Murphy at the time of the award.
We can explore the data efficiently with plots, and, as always, we
turn to ggplot, which we load by loading the tidyverse with the command
library(tidyverse)
ggplot(df,aes(x=age_at_award,y=award))+
geom_boxplot()
Do best actress award winners tend to be younger than best actor award winners?
ggplot(df,aes(x=age_at_award,y=award,group=award))+
geom_boxplot()+
geom_jitter(height=.2,col = 'seagreen',alpha=.4)
ggplot(df,aes(x=ceremony,y=age_at_award,group=award))+
geom_point(aes(color=award))+geom_line(aes(group = ceremony))
Let’s record the age of the best actor award winners in a vector called actor by taking those rows in df that correspond to best actor awards and focusing on column 8 (which is the age_at_award column). We record the best actress award winners in the actress vector in a similar manner.
actor <- df[df$award=="best actor",8]
actress <- df[df$award=="best actress",8]
n1=length(actor)
x1=mean(actor)
s1=sd(actor)
n2=length(actress)
x2=mean(actress)
s2=sd(actress)
We display these summary statistics in the table below. If you run the lines above in your session, you will find these values in your Environment tab.
Sample | size | mean | s |
---|---|---|---|
actor | 97 | 45.1 | 9.6 |
actress | 97 | 37.7 | 12.2 |
It appears from these summary statistics that best actress winners tend to be younger, suggesting, perhaps, that more serious “Oscar worthy” roles are available for older actors than for older actresses.
But we are not content with just comparing sample means. We conduct a test of significance!
Let \(\mu_1\) denote the (theoretical) population mean age of best actor award winners, and \(\mu_2\) denote the (theoretical) population mean age of best actress award winners. We test the hypothesis that there is no (theoretical) difference in these mean ages against the alternative that there is a difference:
\[ \begin{cases} H_o: \mu_1 - \mu_2 = 0\\ H_a: \mu_1 - \mu_2 \neq 0 \end{cases} \]
We compute our test statistic from the summary statistics and view it as a \(t\)-score in a \(t\) distribution with 96 degrees of freedom (the smaller sample size minus 1).
t = (x1-x2)/sqrt(s1^2/n1+s2^2/n2)
This value, 4.68, seems like a high \(t\)-score, which means that under the assumption that \(H_o\) is true, our data would be very extreme (far from “typical”).
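For reference, the line of code above computes the usual two-sample statistic, where \(\bar{x}_i\), \(s_i\), and \(n_i\) are our notation for the values stored in x1, s1, n1 and x2, s2, n2. Plugging in the rounded values from the summary table gives approximately the same result:

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \approx \frac{45.10 - 37.72}{\sqrt{\dfrac{9.6^2}{97} + \dfrac{12.2^2}{97}}} \approx \frac{7.38}{1.576} \approx 4.68 \]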
Here’s the p-value:
2*pt(-abs(t),n1-1)
## [1] 9.423988e-06
This p-value is expressed in scientific notation, and is equal to 9.42 \(\times 10^{-6}\) = .00000942. The likelihood of getting sample means so different from each other if the two samples were drawn from populations having the same mean is so small (p-value = 9.42 \(\times 10^{-6}\) \(<\) .01) that we reject the hypothesis that the population means are equal in favor of the alternative that they are different. We can also ask R to carry out the entire two-sample test with the t.test function:
t.test(x=actor,y=actress,alternative="two.sided")
##
## Welch Two Sample t-test
##
## data: actor and actress
## t = 4.6796, df = 181.39, p-value = 5.608e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 4.270379 10.497044
## sample estimates:
## mean of x mean of y
## 45.10072 37.71701
Notice that this \(t\)-test uses a different value for the degrees of freedom (181.39 rather than 96), as discussed in the inference on two means webpage, so the p-value is slightly different (smaller, in fact).
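To see where a non-integer value like 181.39 can come from, here is a minimal sketch of the Welch-Satterthwaite approximation for the degrees of freedom, computed from summary statistics. The helper name welch_df is our own (it is not a built-in R function), and we feed it the rounded standard deviations from the summary table, so the result is only close to the value R reports.

```r
# Welch-Satterthwaite degrees of freedom from summary statistics.
# welch_df is a hypothetical helper name, not a function from base R.
welch_df <- function(s1, n1, s2, n2) {
  v1 <- s1^2 / n1   # estimated variance of the first sample mean
  v2 <- s2^2 / n2   # estimated variance of the second sample mean
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}

# Rounded standard deviations and sample sizes from the summary table
df_welch <- welch_df(s1 = 9.6, n1 = 97, s2 = 12.2, n2 = 97)
df_welch
```

We store the result in df_welch rather than df so that the data frame df is not overwritten. Using the unrounded s1 and s2 from your session in place of 9.6 and 12.2 should bring the result into line with the t.test output.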
The data have a natural pairing, by the year of the award ceremony. Would it be more appropriate to do a 1-sample \(t\)-test on the matched differences?
We define a vector diff to capture the difference in the ages (actor minus actress) at each of the Oscar award ceremonies.
diff<-actor-actress
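The subtraction above pairs the two vectors by position, which matches ceremonies only if each vector has exactly one entry per ceremony, in the same order. If you would rather pair explicitly by ceremony, one possible approach is sketched below with merge. The small data frame toy is made up purely for illustration (it is not the Oscars data), and the names toy, a, b, and paired are our own.

```r
# Toy data, made up for illustration only (NOT the Oscars data)
toy <- data.frame(
  ceremony     = c(1, 1, 2, 2, 3, 3),
  award        = rep(c("best actor", "best actress"), times = 3),
  age_at_award = c(40, 30, 50, 45, 35, 38)
)

# Split by award, keeping the ceremony number alongside the age
a <- toy[toy$award == "best actor",   c("ceremony", "age_at_award")]
b <- toy[toy$award == "best actress", c("ceremony", "age_at_award")]

# Join the two pieces on ceremony, so each row holds one matched pair
paired <- merge(a, b, by = "ceremony", suffixes = c("_actor", "_actress"))

# Difference in ages (actor minus actress) within each ceremony
paired$diff <- paired$age_at_award_actor - paired$age_at_award_actress
paired
```

Note that if a ceremony has more than one winner in a category (a tie), merge returns every actor-actress combination for that ceremony, so such ceremonies call for a deliberate choice about how to pair.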
To get a flavor for the difference data, here’s a quick plot:
plot(diff,pch=16,ylab="actor-actress",xlab="Ceremony")
abline(h=mean(diff),lty=2,col="brown3")
abline(h=0)
The sample mean difference is 7.38, which is marked in the plot above as the dashed, red horizontal line. Here’s a histogram of the differences as well:
hist(diff,breaks=30,xlab="actor-actress")
The sample standard deviation of the differences is 14.09, and the relevant test statistic for a matched pairs test in which the null hypothesis is that the (theoretical) mean difference in ages is 0, would be
(t = mean(diff)/(sd(diff)/sqrt(length(diff))))
## [1] 5.15944
But again, we can ask R to do the entire matched pairs test of our choosing (two-sided alternative in this case).
t.test(x=actor,y=actress,alternative="two.sided",paired=TRUE)
##
## Paired t-test
##
## data: actor and actress
## t = 5.1594, df = 96, p-value = 1.333e-06
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 4.542986 10.224437
## sample estimates:
## mean difference
## 7.383711
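The numbers in this output can be rebuilt by hand from the sample mean and sample standard deviation of the differences. The following is a minimal sketch using values reported on this page (mean difference 7.383711, standard deviation 14.09, 97 pairs). The variable names are our own, and because the standard deviation is rounded, the results should agree with the output above only to about two decimal places.

```r
# Values reported on this page (the standard deviation is rounded)
n     <- 97         # number of pairs
d_bar <- 7.383711   # sample mean of the differences
s_d   <- 14.09      # sample standard deviation of the differences

se     <- s_d / sqrt(n)                      # standard error of the mean difference
t_stat <- d_bar / se                         # matched pairs test statistic
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)   # two-sided p-value

# 95% confidence interval for the mean difference
margin <- qt(0.975, df = n - 1) * se
ci     <- c(d_bar - margin, d_bar + margin)

t_stat
p_val
ci
```

In your own session you can replace the three hard-coded values with length(diff), mean(diff), and sd(diff) to work from the unrounded data.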
Recall, this matched pairs test is equivalent to the 1-sample \(t\)-test on the difference data:
t.test(x=diff, alternative="two.sided")
##
## One Sample t-test
##
## data: diff
## t = 5.1594, df = 96, p-value = 1.333e-06
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 4.542986 10.224437
## sample estimates:
## mean of x
## 7.383711