Import the Data

Use the read.csv() command to read a .csv (comma-separated values) file into an RStudio session. Copy, paste, and run the line below to load the governors data and store it as a data frame named gov.

gov = read.csv("https://mphitchman.com/stats/data/gov24.csv")
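To see what read.csv() does with comma-separated text, here is a minimal sketch that reads a tiny made-up table instead of a file. The names csv_text and toy and all of the values are hypothetical and are not taken from gov24.csv.

```r
# Made-up CSV text standing in for a file; NOT the contents of gov24.csv
csv_text <- "state,party,age
A,Democrat,51
B,Republican,63"

# The text= argument lets read.csv() parse a string instead of a file or URL;
# the first row becomes the column names
toy <- read.csv(text = csv_text)

str(toy)          # column names and types
print(head(toy))  # first few rows
```

Running str(gov) and head(gov) after the import is a similar quick way to confirm the governors data loaded as expected.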

Q1

  • nrow(gov) returns the number of rows (observations) in the data frame gov
  • ncol(gov) returns the number of columns (variables), and
  • dim(gov) will do both at once (rows then columns).
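As a quick check of how these three functions relate, here is a small sketch on a made-up data frame. The name toy and its values are hypothetical and are not the governors data.

```r
# Toy data frame with made-up values; NOT the real governors data
toy <- data.frame(
  state = c("A", "B", "C", "D"),
  party = c("Democrat", "Republican", "Republican", "Democrat"),
  age   = c(51, 63, 47, 58)
)

n_rows <- nrow(toy)  # number of observations
n_cols <- ncol(toy)  # number of variables
shape  <- dim(toy)   # rows first, then columns

print(n_rows)
print(n_cols)
print(shape)
```

The same calls work unchanged on gov once it has been loaded.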

Q2

  • table(gov$party) summarizes the party variable of the data frame gov
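If proportions are also of interest, one possible follow-up is prop.table(), sketched here on a made-up vector. The name party and its values are hypothetical, not the real governors data.

```r
# Made-up party labels; NOT the real governors data
party <- c("Democrat", "Republican", "Republican", "Democrat", "Republican")

counts <- table(party)        # count of each category
props  <- prop.table(counts)  # each count divided by the total

print(counts)
print(props)
```

With the real data, the analogous call would be prop.table(table(gov$party)).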

Q4

Two-way table (rows are party, columns are miss_riv): table(gov$party,gov$miss_riv)
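To show how the two-way table is laid out, here is a sketch on a made-up data frame, with addmargins() used as one optional way to append row and column totals. The name toy and the "east"/"west" labels are assumptions for illustration; the actual values stored in gov$miss_riv may differ.

```r
# Made-up values; NOT the real governors data
toy <- data.frame(
  party    = c("Democrat", "Republican", "Republican", "Democrat", "Republican"),
  miss_riv = c("east", "east", "west", "west", "west")
)

# First argument gives the rows, second gives the columns
two_way <- table(toy$party, toy$miss_riv)

# Append a "Sum" row and a "Sum" column holding the totals
with_totals <- addmargins(two_way)

print(two_way)
print(with_totals)
```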

Q6

FYI, here’s code in base R for a stacked bar plot similar to the one in the worksheet:

barplot(table(gov$party,gov$region),
        col=c("blue","red"),
        main="Governors by Party and Region",
        legend=TRUE)

And here is ggplot code that generates the bar plot that actually appears in Q6. It assumes you’ve loaded the tidyverse:

party_colors <- c("#2E74C0", "#CB454A")  # choosing classic blue and red colors for the parties

ggplot(gov,aes(x=region,fill=party))+
  geom_bar()+
  scale_fill_manual(values=party_colors)+
  scale_y_continuous(breaks=seq(0,12,2))+
  ggtitle("Governors by Party and Region")

Q7

Side-by-side box plots using base R:

boxplot(gov$age.at.inaug~gov$miss_riv,
        horizontal=TRUE,
        xlab="age at inauguration",
        ylab="",
        main="Age of Governors east and west of the Mississippi")

Nicer side-by-side box plots using ggplot. This code assumes you’ve loaded the tidyverse into your session.

ggplot(gov,aes(x=age.at.inaug,y=miss_riv))+
  geom_boxplot(fill=c("orange","seagreen"))+
  xlab("age at inauguration")+
  ylab("")+
  ggtitle("Age of Governors east and west of the Mississippi")