Draw two samples from Hitchman’s rectangles.

First, open RStudio and create a new script in which to copy and paste useful code for this activity.

You will now draw two random samples from the population of Hitchman’s 2000 rectangles. The first is a random sample of size \(n=16\) (which we name sample1); the second is a random sample of size \(n = 100\) (sample2).

Run these three lines of code to gather your two samples.

df <- read.csv("https://mphitchman.com/stats/data/mph_rectangles.csv")
sample1 <- df[sample(1:2000,16),]
sample2 <- df[sample(1:2000,100),]

Running these three lines of code will create three data frames in your RStudio session. Check the environment tab to make sure they’re there.
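Beyond looking at the environment tab, you can confirm from your script that each sample has the expected number of rows. The sketch below is only an illustration: it uses a stand-in data frame (`toy_df`, with a made-up `id` column) in place of the downloaded `df`, so it runs without a network connection. The `set.seed()` call is optional and only makes the random draw repeatable.

```r
# Stand-in for the real population of 2000 rectangles (hypothetical column).
toy_df <- data.frame(id = 1:2000)

set.seed(1)  # optional: makes the random draws repeatable
# drop = FALSE keeps the result a data frame even though toy_df has one column
toy_sample1 <- toy_df[sample(1:2000, 16), , drop = FALSE]
toy_sample2 <- toy_df[sample(1:2000, 100), , drop = FALSE]

nrow(toy_sample1)              # number of rows in the first sample
nrow(toy_sample2)              # number of rows in the second sample
anyDuplicated(toy_sample1$id)  # 0 means no rectangle was drawn twice
```

With your real data, running `nrow(sample1)` and `nrow(sample2)` gives the same kind of check.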

Answer the following questions

In each case, write and run code in your script to determine the appropriate value.

  1. With sample1 determine these values: the sample mean of your rectangle lengths, the sample mean of your widths, and the sample proportion of red rectangles.
  1. Based on sample1, determine a 95% confidence interval for the population proportion of red rectangles.
  1. Based on sample1, determine a 95% confidence interval for the population mean length of the rectangles. Assume the population standard deviation of rectangle lengths is \(\sigma_L = 0.5\).
  1. Based on sample1, determine a 95% confidence interval for the population mean width of the rectangles. Assume the population standard deviation of rectangle widths is \(\sigma_W = 0.87\).
  1. Are these three confidence intervals reliable? That is, does sample1 meet the sampling conditions required for our confidence interval formulas to be reliable?
  1. Repeat questions 1-5 with sample2.
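As a starting point for the first question, here is a minimal sketch of computing sample means and a sample proportion from a data frame. The column names (`length`, `width`, `color`) and the values in `toy_sample` are made up for illustration and may not match the real data; run `names(sample1)` to see the actual column names before adapting it.

```r
# Toy stand-in for sample1; column names and values are hypothetical.
toy_sample <- data.frame(
  length = c(2.0, 3.0, 4.0, 3.0),
  width  = c(1.0, 2.0, 1.5, 1.5),
  color  = c("red", "blue", "red", "green")
)

mean(toy_sample$length)              # sample mean of the lengths
mean(toy_sample$width)               # sample mean of the widths

x <- sum(toy_sample$color == "red")  # number of red rectangles in the sample
n <- nrow(toy_sample)                # sample size
phat <- x / n                        # sample proportion of red rectangles
phat
```

The values of `x` and `n` found this way are the ones to enter into a confidence interval calculation for a proportion.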

Here is some code that can help with confidence interval calculations:

Confidence interval for a proportion

#confidence interval for proportion.
n=64 # enter your sample size here
x=29 # enter your number of successes here
C = .95 # enter your confidence level here
phat = x/n #this calculates your sample proportion
SE = sqrt(phat*(1-phat)/n) #this calculates your standard error (SE)
zstar = qnorm(C + (1-C)/2) #z score associated with given confidence level (1.96 for 95%)
MOE = zstar*SE #your margin of error (MOE)
phat + c(-MOE,MOE) #Confidence Interval (low to high)
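Whether an interval like this is reliable depends on the sampling conditions. One common textbook rule for proportions is that the sample should contain at least 10 successes and at least 10 failures. The sketch below checks that rule for the example values used in the proportion code; the cutoff of 10 is an assumption, so use whatever condition your course states.

```r
n <- 64          # sample size (same example values as the proportion code)
x <- 29          # number of successes
threshold <- 10  # assumed cutoff; replace with the one from your course

successes <- x
failures <- n - x

# TRUE only if both counts reach the cutoff
conditions_met <- successes >= threshold && failures >= threshold
conditions_met
```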

Confidence Interval for a mean (when we know \(\sigma\), the population standard deviation)

n=64 #sample size
xbar=10.2 #sample mean
sigma = 2 #population standard deviation (if you happen to know it!)
C = .95 # confidence level
SE = sigma/sqrt(n) #standard error
zstar = qnorm(C + (1-C)/2) #z score associated with given confidence level (1.96 for 95%)
MOE = zstar*SE
xbar + c(-MOE,MOE) #Confidence Interval (low to high)
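If you would rather compute the interval directly from a column of data than type in `n` and `xbar` by hand, the same steps can be wrapped in a small function. This is a sketch only: `mean_ci` is a hypothetical helper name, not a built-in R function, and `toy_lengths` holds made-up values.

```r
# Hypothetical helper: z-interval for a mean when sigma is known.
mean_ci <- function(values, sigma, C = 0.95) {
  n <- length(values)              # sample size
  xbar <- mean(values)             # sample mean
  SE <- sigma / sqrt(n)            # standard error
  zstar <- qnorm(C + (1 - C) / 2)  # z score for the confidence level
  MOE <- zstar * SE                # margin of error
  xbar + c(-MOE, MOE)              # interval, low to high
}

toy_lengths <- c(2.0, 3.0, 4.0, 3.0)  # made-up values for illustration
ci <- mean_ci(toy_lengths, sigma = 0.5)
ci
```

With the real samples, `values` would be the relevant column of `sample1` or `sample2`, and `sigma` the population standard deviation given in the question.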