Homework 6
Due Friday May 10
1.
a) Use R to find the 10th and 90th percentiles (the 0.10 and 0.90 quantiles) of a standard normal distribution.
b) Find the 10th and 90th percentiles of a uniform distribution on the interval
(0,1).
NOTE: for (a) you can use the table in A.3 to find the value of z such that
P(Z < z) = .10. Or you can use the qnorm(p,mean,sd) command. For
the uniform, use the qunif(p) command.
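As a syntax illustration only, the sketch below calls both commands at probabilities other than the ones asked for in this problem (the median and the quartiles). The variable names are made up for the example. In qnorm, mean and sd default to 0 and 1, which gives the standard normal, and qunif defaults to the interval (0,1).

```r
# Median of a standard normal: the z with P(Z < z) = 0.50
z_med <- qnorm(0.50, mean = 0, sd = 1)

# Upper quartile of a standard normal (mean and sd default to 0 and 1)
z_q3 <- qnorm(0.75)

# Lower quartile of a uniform distribution on (0,1)
u_q1 <- qunif(0.25)

print(c(z_med, z_q3, u_q1))
```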
2. The qqnorm(x) function compares the sample quantiles of x to the theoretical
quantiles of a N(0,1) distribution. If x comes from a N(0,1) distribution,
then the resulting plot is roughly a straight line with intercept 0 and slope
1.
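A minimal sketch of the N(0,1) case just described, assuming a simulated sample of size 500 and an arbitrary seed. It fits a least-squares line through the qq-plot points as a rough numerical check of the intercept and slope. The line fit is a choice made for this illustration, not something qqnorm does itself.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
x <- rnorm(500)                   # sample from N(0,1)

# plot.it = FALSE returns the plotted points without drawing them:
# qq$x holds the theoretical N(0,1) quantiles, qq$y the matching sample values
qq <- qqnorm(x, plot.it = FALSE)

# Least-squares line through the qq-plot points
fit <- lm(qq$y ~ qq$x)
print(coef(fit))                  # intercept first, then slope
```

Calling qqnorm(x) followed by abline(0, 1) draws the plot itself with the reference line on top.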
a) Suppose x comes from a N(10,1) distribution. How would this affect
the slope? the intercept?
b) Suppose x comes from a N(0,10) distribution (mean 0, SD 10). How would this affect
the slope? the intercept?
3. Suppose a pdf is right-skewed. Sketch the qq-plot that would result
from taking a large random sample from this pdf and using the qqnorm(x) function.
(Remember, if the pdf were a normal distribution, the resulting qq plot
would be a straight line.)
Hint: you can generate a sample from a right-skewed distribution using the command rchisq(n,d),
where n is the sample size you want to generate and d (the degrees of freedom) is an integer greater
than or equal to 1. The smaller the value of d, the more skewed the
distribution will be. However, BEFORE you do this, you should reason
your way through, and write your reasoning down as part of your solution.
It's good to understand this first, then check your answer.
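For the checking step, generating the sample might look like the sketch below. The sample size 1000, the value d = 2, and the seed are arbitrary choices made for illustration. Comparing the mean with the median is just a quick numerical sign of right skew.

```r
set.seed(2)                 # arbitrary seed
x <- rchisq(1000, 2)        # n = 1000 draws, d = 2 degrees of freedom

# In a right-skewed sample the long right tail pulls the mean above the median
print(c(mean = mean(x), median = median(x)))
```

After that, qqnorm(x) draws the plot to compare against your sketch.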
4. a) Suppose we wish to calculate a 90% confidence interval for the mean
height of men in the US. It is known (from previous research) that the heights
are normally distributed and the SD = 3 inches. Our sample size is 10.
What is the margin of error?
b) Suppose we wish to calculate a 90% confidence interval for the mean height
of men in the US. It is known that heights follow a normal distribution
with SD = 3 inches. We wish the margin of error to be 1 inch. What sample
size do we need? Suppose we wanted the margin of error to be 2 inches.
Now what sample size?
5. Simulation Study.
This is a simulation study of the experiment described in Exercise 4. We're
going to take a random sample of 10 "people" from a N(69,3) population (mean
69 inches, SD 3 inches) and calculate a confidence interval to estimate the
mean of the population. Now of course, in this simulation, we KNOW that
the mean is 69. But let's pretend that we don't know this, and all
we get to see are the 10 data that we drew.
x <- rnorm(10,69,3) will draw a sample.
Now we want to calculate a 95% confidence interval for the mean. The
commands to do this are:
lb <- mean(x) - (3/sqrt(10))*1.96
ub <- mean(x) + (3/sqrt(10))*1.96
The first command gives the lower bound, the second the upper bound.
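Put together, one run of the experiment can be sketched as below. The seed and the names merror and contains are additions for illustration. Your own numbers will differ if you use a different seed or none at all.

```r
set.seed(5)                           # arbitrary seed, so the run is repeatable
x <- rnorm(10, 69, 3)                 # 10 heights from N(69,3)

merror <- (3 / sqrt(10)) * 1.96       # margin of error with known SD = 3
lb <- mean(x) - merror                # lower bound
ub <- mean(x) + merror                # upper bound

contains <- (lb <= 69) && (69 <= ub)  # TRUE if the interval covers the true mean
print(c(lb = lb, ub = ub))
print(contains)
```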
a) Do this. Does your confidence interval contain the true mean (69)?
b) Repeat this 1000 times. We'll count the number of times the true mean
was below the interval, and how many times it was above.
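ci <- function(N){
  lb <- c()
  ub <- c()
  for (i in 1:N){
    # sample mean of 10 draws from N(69,3)
    xbar <- mean(rnorm(10,69,3))
    # margin of error for a 95% interval with known SD = 3
    merror <- (3/sqrt(10))*1.96
    lb <- c(lb, xbar - merror)
    ub <- c(ub, xbar + merror)
  }
  # return all lower and upper bounds once the loop has finished
  list(lower = lb, upper = ub)
}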
This is a function that computes a confidence interval for each of
N random samples from a N(69,3) population. To use it, type
output <- ci(1000)
which generates a list of lower bounds (output$lower) and upper bounds (output$upper).
So the first confidence interval is (output$lower[1], output$upper[1]).
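The same simulation can also be written without growing lb and ub inside a loop. The sketch below is a hypothetical variant: ci_vec and its arguments n, mu, sigma and z are names introduced here, not part of the assignment. It should produce output with the same structure as ci.

```r
# Hypothetical variant of ci(); all argument names are illustrative
ci_vec <- function(N, n = 10, mu = 69, sigma = 3, z = 1.96) {
  # One sample mean per simulated sample of size n
  xbar <- replicate(N, mean(rnorm(n, mean = mu, sd = sigma)))
  # The margin of error is the same for every sample
  merror <- z * sigma / sqrt(n)
  list(lower = xbar - merror, upper = xbar + merror)
}

set.seed(6)                                   # arbitrary seed
output <- ci_vec(1000)
first <- c(output$lower[1], output$upper[1])  # the first confidence interval
print(first)
```

Because the margin of error does not depend on the sample, it is computed once and applied to the whole vector of sample means.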
c) Run the ci function with N = 1000. Does the first interval contain
the true value of the mean? Does the second interval?
d) You can count the number of times the intervals were too high as follows:
sum(output$lower > 69)
You can count the number of times the intervals were too low by:
sum(output$upper < 69)
How many times did the intervals miss? How many times should we have
expected them to miss?