Homework 6
Due Friday May 10

1.
a) Use R to find the 10th and 90th percentiles (the 0.10 and 0.90 quantiles) of a standard normal distribution.
b) Find the 10th and 90th percentiles of a uniform distribution on the interval (0,1).
NOTE: for (a) you can use the table in A.3 to find the value of z such that P(Z < z) = .10, or you can use the qnorm(p,mean,sd) command. For the uniform, use the qunif(p) command.
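
As a rough illustration of the calling convention only (the probabilities below are chosen for illustration and are not the ones the exercise asks for), each quantile function takes a probability p and returns the value that has probability p to its left:

```r
# Median (0.5 quantile) of a standard normal; mean and sd default to 0 and 1
qnorm(0.5)                      # 0
# Same probability for a normal with mean 69 and SD 3 (illustrative values)
qnorm(0.5, mean = 69, sd = 3)   # 69
# Median of a uniform distribution on (0,1); min and max default to 0 and 1
qunif(0.5)                      # 0.5
```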

2. The qqnorm(x) function compares the sample quantiles of x to the theoretical quantiles of a N(0,1) distribution.  If x comes from a N(0,1) distribution, then the resulting plot is roughly a straight line with intercept 0 and slope 1.
a) Suppose x comes from a N(10,1) distribution (mean 10, SD 1).  How would this affect the slope?  The intercept?
b) Suppose x comes from a N(0,10) distribution (mean 0, SD 10).  How would this affect the slope?  The intercept?

3. Suppose a pdf is right-skewed.  Sketch the qq-plot that would result from taking a large random sample from this pdf and using the qqnorm(x) function.  (Remember, if the pdf were a normal distribution, the resulting qq plot would be a straight line.)
Hint: you can generate a right-skewed distribution using the command rchisq(n,d) where n is the sample size you want to generate, and d is an integer greater than or equal to 1.  The smaller the value of d, the more skewed the distribution will be.  However, BEFORE you do this, you should reason your way through, and write your reasoning down as part of your solution.  It's good to understand this first, then check your answer.
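
Once the reasoning is written down, a check along the lines of the hint might look like the sketch below. The sample size, the value of d, and the seed are arbitrary choices made here, not values the exercise prescribes.

```r
set.seed(1)                 # arbitrary seed so the run is repeatable
x <- rchisq(1000, 2)        # n = 1000 draws with d = 2 (assumed values)
hist(x)                     # confirm that the sample is right-skewed
qqnorm(x)                   # compare with the sketch you drew by hand
```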

4. a) Suppose we wish to calculate a 90% confidence interval for the mean height of men in the US. It is known (from previous research) that the heights are normally distributed and the SD = 3 inches.  Our sample size is 10.  What is the margin of error?
b) Suppose we wish to calculate a 90% confidence interval for the mean height of men in the US.  It is known that heights follow a normal distribution with SD = 3 inches. We want the margin of error to be 1 inch.  What sample size do we need?  Suppose we wanted the margin of error to be 2 inches.  Now what sample size do we need?

5. Simulation Study.
This is a simulation study of the experiment described in Exercise 4.  We're going to take a random sample of 10 "people" from a N(69,3) population (mean 69 inches, SD 3 inches) and calculate a confidence interval to estimate the mean of the population.  Now of course, in this simulation, we KNOW that the mean is 69.  But let's pretend that we don't know this, and all we get to see are the 10 data values that we drew.

x <- rnorm(10,69,3) will draw a sample.
Now we want to calculate a 95% confidence interval for the mean.  The commands to do this are:
lb <- mean(x) - (3/sqrt(10))*1.96
ub <- mean(x) + (3/sqrt(10))*1.96

The first command gives the lower bound, the second the upper bound.
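
The multiplier 1.96 is the 0.975 quantile of the standard normal, which leaves 2.5% in each tail. A minimal sketch of how it could be computed rather than typed in; the helper name z_for_level is made up here for illustration:

```r
# Hypothetical helper: two-sided normal multiplier for a given confidence level
z_for_level <- function(level) {
  qnorm(1 - (1 - level) / 2)
}

round(z_for_level(0.95), 2)   # 1.96
round(z_for_level(0.99), 2)   # 2.58
```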

a) Do this.  Does your confidence interval contain the true mean (69)?
b) Repeat this 1000 times. We'll count the number of times the true mean was below the interval, and how many times it was above.
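ci <- function(N){
  lb <- c()   # lower bounds, one per sample
  ub <- c()   # upper bounds, one per sample
  for (i in 1:N){
    # mean of a fresh sample of 10 from N(69,3)
    xbar <- mean(rnorm(10,69,3))
    # margin of error for a 95% interval with known SD = 3
    merror <- (3/sqrt(10))*1.96
    lb <- c(lb, xbar - merror)
    ub <- c(ub, xbar + merror)
  }
  # return both vectors once the loop has finished
  list(lower = lb, upper = ub)
}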

This is a function that computes a confidence interval for each of N random samples from a N(69,3) population.  To use it, type
output <- ci(1000)
which generates a list of lower bounds (output$lower) and upper bounds (output$upper).
So the first confidence interval is (output$lower[1], output$upper[1]).

c) Run the ci function with N = 1000.  Does the first interval contain the true value of the mean?  Does the second interval?
d) You can count the number of times the intervals were too high as follows:
sum(output$lower > 69)
You can count the number of times the intervals were too low by:
sum(output$upper < 69)
How many times did the intervals miss?  How many times should we have expected them to miss?
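
For comparison, the same bookkeeping can be done without an explicit loop. This is only a sketch of an alternative, not the method the assignment requires; the names xbar, lower, upper, and missed are chosen here for illustration, and the seed is arbitrary.

```r
set.seed(1)                                     # arbitrary seed
N <- 1000
xbar <- replicate(N, mean(rnorm(10, 69, 3)))    # N sample means, n = 10 each
merror <- (3 / sqrt(10)) * 1.96                 # same margin of error as above
lower <- xbar - merror
upper <- xbar + merror
missed <- lower > 69 | upper < 69               # TRUE when an interval misses 69
sum(missed)                                     # number of misses
mean(missed)                                    # proportion of misses
```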