Use the read.csv() command to read a .csv (comma-separated values) file into an RStudio session. Copy, paste, and run the line below to load the governors data and give the data frame the name gov.
gov = read.csv("https://mphitchman.com/stats/data/gov24.csv")
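To see what read.csv() hands back without depending on the network, here is a minimal sketch that reads a few lines of CSV text directly. The rows are made-up toy values, not the real governors data, and the name toy is hypothetical; only the column names are borrowed from the ones used on this page.

```r
# Toy CSV text with made-up values (NOT the real governors data).
# The first line is the header; read.csv() turns it into column names.
csv_text <- "party,region,miss_riv,age.at.inaug
Democrat,West,west,52
Republican,South,east,61
Republican,Midwest,west,47
Democrat,Northeast,east,58"

# text= reads from a character string instead of a file path or URL
toy <- read.csv(text = csv_text)

str(toy)    # structure: one line per column, showing its type
head(toy)   # the first few rows of the data frame
names(toy)  # the column (variable) names
```

The same str(), head(), and names() calls can be run on gov once it has loaded.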
nrow(gov) returns the number of rows (observations) in the data frame gov.
ncol(gov) returns the number of columns (variables), and dim(gov) will do both at once (rows, then columns).
table(gov$party) summarizes the party variable of the data frame gov.
Two-way table: table(gov$party,gov$miss_riv)
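The commands above can be tried on a small stand-in data frame. This is only a sketch: toy and its values are invented for illustration (the labels "east" and "west" are an assumption about how miss_riv is coded), and addmargins() and prop.table() are optional base R extras that the worksheet may not require.

```r
# Made-up stand-in for gov with two of its variables
toy <- data.frame(
  party    = c("Democrat", "Republican", "Republican", "Democrat", "Republican"),
  miss_riv = c("west", "east", "west", "east", "east")
)

nrow(toy)  # number of rows (observations)
ncol(toy)  # number of columns (variables)
dim(toy)   # rows then columns, here 5 and 2

# Two-way table: rows are parties, columns are sides of the river
counts <- table(toy$party, toy$miss_riv)
counts

addmargins(counts)              # same table with row and column totals added
prop.table(counts, margin = 1)  # proportions within each row (each party)
```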
FYI, here’s code in base R for a stacked bar plot similar to the one that appears in the worksheet:
barplot(table(gov$party,gov$region),
col=c("blue","red"),
main="Governors by Party and Region",
legend=TRUE)
And here is code using ggplot to generate the bar plot that actually appears in Q6. It assumes you’ve loaded the tidyverse:
party_colors <- c("#2E74C0", "#CB454A") # choosing classic blue and red colors for the parties
ggplot(gov,aes(x=region,fill=party))+
geom_bar()+
scale_fill_manual(values=party_colors)+
scale_y_continuous(breaks=seq(0,12,2))+
ggtitle("Governors by Party and Region")
Side-by-side box plots using base R:
boxplot(gov$age.at.inaug~gov$miss_riv,
horizontal=TRUE,
xlab="age at inauguration",
ylab="",
main="Age of Governors east and west of the Mississippi")
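As a variation on the base R call, the formula can name the columns directly and take the data frame through the data argument, which avoids repeating gov$. The sketch below uses a made-up data frame (toy, with invented ages and assumed "east"/"west" labels) so it runs on its own. It also keeps the value that boxplot() returns invisibly, which holds the numbers used to draw each box.

```r
# Made-up ages for illustration only (NOT the real governors data)
toy <- data.frame(
  age.at.inaug = c(61, 58, 66, 52, 47, 55),
  miss_riv     = c("east", "east", "east", "west", "west", "west")
)

# Formula reads "age.at.inaug split by miss_riv"; data= says where to find both
bp <- boxplot(age.at.inaug ~ miss_riv, data = toy,
              horizontal = TRUE,
              xlab = "age at inauguration",
              ylab = "",
              main = "Toy example: ages east and west")

bp$names  # group labels, in the order the boxes are drawn
bp$n      # number of observations in each group
bp$stats  # one column per group: lower whisker, lower hinge, median, upper hinge, upper whisker
```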
Nicer side-by-side box plot, using ggplot. This code assumes you’ve loaded the tidyverse into your session.
ggplot(gov,aes(x=age.at.inaug,y=miss_riv))+
geom_boxplot(fill=c("orange","seagreen"))+
xlab("age at inauguration")+
ylab("")+
ggtitle("Age of Governors east and west of the Mississippi")