HW 7 Solutions
P. 315 #12
I'm going to use X to represent group 1 and Y to represent group 2 because
the notation is easier to follow on the web.
a) Var(Xbar - Ybar) = Var(Xbar) + Var(Ybar) because the two samples are independent
= sigmaX^2/n1 + sigmaY^2/n2
So SD is the square-root of this: sqrt(1.82^2/53 + 1.53^2/60) = .3186
b) (7.9-4.3) +/- 2 *.3186
3.6 +/- .6372
(2.96, 4.24)
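If you want to check the arithmetic in (a) and (b), here is a minimal Python sketch. The variable and function names are mine, not the book's; the numbers are the ones used above, and the multiplier 2 is the same rough 95% multiplier used in (b).

```python
import math

# Summary statistics used in the solution above (X = group 1, Y = group 2).
mean_x, sd_x, n_x = 7.9, 1.82, 53
mean_y, sd_y, n_y = 4.3, 1.53, 60

def se_diff(sd1, n1, sd2, n2):
    """SD of Xbar - Ybar for two independent samples."""
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)

se = se_diff(sd_x, n_x, sd_y, n_y)
diff = mean_x - mean_y

# Rough 95% interval: estimate +/- 2 standard errors.
lower, upper = diff - 2 * se, diff + 2 * se
print(round(se, 4), round(lower, 2), round(upper, 2))  # 0.3186 2.96 4.24
```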
c) This suggests that the "true" difference between the population means
is somewhere in the range of 2.96 to 4.24, which suggests that the mean of
the first group is greater than the mean of the second group. So we can pretty
confidently conclude that people remember more brands in ads with sexual content.
Extra* (but required): in class we took a random sample of
7 serial numbers from the population {1,2,....N}, where N was an unknown
number. Each serial number came from a "captured tank".
a) Make a sketch of the pdf of the population (obviously in terms of N)
It is a uniform distribution: there is a "point mass" of height 1/N above
each of the points 1,2, ...., N
b) Let Xi represent the serial number on the ith tank we capture.
Find an expression for the expected value of Xi. Find an expression for the
standard deviation of Xi.
E(Xi) = sum (k * (1/N)) = (1/N) sum k = (N+1)/2, where the sum runs over
k = 1, ..., N. (This last step is as simple as you can get it, but it's okay
if you stopped at the step before.)
Var(Xi) = sum (k - (N+1)/2)^2 (1/N), and this can be simplified
a little more, to (N^2 - 1)/12, but that's not necessary. Of course you need to
take the square root to get the SD, and I can't do this on the web. Let's call
this number "sigma".
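To see the formulas in (b) with actual numbers, here is a short Python sketch that computes the mean and SD straight from the definition and compares them with the closed forms (N+1)/2 and sqrt((N^2 - 1)/12). The value N = 100 is made up for illustration, since the real N is unknown.

```python
import math

def uniform_mean_sd(N):
    """Mean and SD of a serial number drawn uniformly from {1, ..., N}."""
    values = range(1, N + 1)
    mean = sum(k * (1 / N) for k in values)
    var = sum((k - mean) ** 2 * (1 / N) for k in values)
    return mean, math.sqrt(var)

N = 100  # hypothetical population size, for illustration only
mean, sigma = uniform_mean_sd(N)

# Compare the brute-force sums with the simplified expressions.
print(mean, (N + 1) / 2)
print(sigma, math.sqrt((N**2 - 1) / 12))
```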
c) Suppose we calculate Y = (X1 + ... + X7)/7 Find the mean
of Y in terms of the mean of X. Find the SD of Y in terms of the SD
of X.
E(Y) = (E(X1) + ... + E(X7))/7 = 7*E(X)/7 = (N+1)/2
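Var(Y) = (Var(X1) + ... + Var(X7))/49 = 7*sigma^2/49 = sigma^2/7, treating
the Xi's as independent, so SD(Y) = sigma/sqrt(7)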
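Here is one way to convince yourself of the sigma/sqrt(n) result: list every possible sample and compute the SD of the sample means exactly. This is only a sketch under assumptions of my own. It uses a tiny made-up population (N = 6) and n = 3 draws so the enumeration stays small, and it draws with replacement so that the draws really are independent.

```python
import itertools
import math

def sigma_uniform(N):
    """SD of one draw from {1, ..., N}, computed from the definition."""
    mu = (N + 1) / 2
    return math.sqrt(sum((k - mu) ** 2 for k in range(1, N + 1)) / N)

def sd_of_sample_mean(N, n):
    """Exact SD of the mean of n independent draws (with replacement)."""
    # Every ordered sample is equally likely, so a plain average works.
    means = [sum(s) / n for s in itertools.product(range(1, N + 1), repeat=n)]
    center = sum(means) / len(means)
    return math.sqrt(sum((m - center) ** 2 for m in means) / len(means))

N, n = 6, 3  # hypothetical small values, for illustration only
print(sd_of_sample_mean(N, n), sigma_uniform(N) / math.sqrt(n))
```

The two printed numbers should agree up to floating-point rounding.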
d) A popular choice for an estimator for N was Xbar + 3* SD(X).
What's the bias of this estimator? How does the bias change if we take
a larger sample size?
Bias = E(Xbar + 3SD(X)) - N
= E(Xbar) + 3sigma - N (approximating E(SD(X)) by sigma)
= (N+1)/2 + 3sigma - N
Notice that n, the sample size, does not appear (and doesn't appear
in sigma). So changing the sample size won't affect the bias.
e) Another choice was Xbar + 3*SD(X)/sqrt(n) . What's the bias
of this estimator? How does the bias change if we take a larger sample
size?
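Bias = E(Xbar + 3SD(X)/sqrt(n)) - N = (N+1)/2 + 3sigma/sqrt(n) - N
Now n does appear, and as n gets very large, the middle term gets very small,
so the bias approaches (N+1)/2 - N = -(N-1)/2. In other words, with a large
sample this estimator lands near the population mean and underestimates N.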
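The two bias expressions from (d) and (e) are easy to tabulate. The sketch below is an illustration only: N = 100 and the sample sizes are made up, the function names are mine, and like the solution it treats the expected value of SD(X) as sigma.

```python
import math

def sigma_uniform(N):
    """SD of one draw from {1, ..., N}, computed from the definition."""
    mu = (N + 1) / 2
    return math.sqrt(sum((k - mu) ** 2 for k in range(1, N + 1)) / N)

def bias_d(N):
    """Bias of Xbar + 3*SD(X); the sample size n does not enter."""
    return (N + 1) / 2 + 3 * sigma_uniform(N) - N

def bias_e(N, n):
    """Bias of Xbar + 3*SD(X)/sqrt(n); the middle term shrinks as n grows."""
    return (N + 1) / 2 + 3 * sigma_uniform(N) / math.sqrt(n) - N

N = 100  # hypothetical population size, for illustration only
for n in (7, 70, 700):
    print(n, round(bias_d(N), 2), round(bias_e(N, n), 2))
```

The first bias column stays the same on every row, while the second keeps changing with n.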
Note: Strictly speaking, the Xi's are not independent. Why? Because
the population is finite (it has N tanks) and so every time we draw one out
without replacement, we gain information about the population and therefore
the probability that the next Xi will have a certain value is different than
it was before we knew the previous tank's value. On the other hand,
if the population is really big compared to the sample size, then this doesn't
matter so much. Hopefully you saw this point, but it's okay if you
didn't write it out in this problem.
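To see how much the lack of independence matters, you can compute the exact SD of the sample mean when drawing without replacement and compare it with sigma/sqrt(n). The sketch below uses made-up population sizes and n = 3 so that listing every sample stays cheap. The comparison value sqrt((N-n)/(N-1)) is the standard finite population correction, which we did not cover in this problem.

```python
import itertools
import math

def sigma_uniform(N):
    """SD of one draw from {1, ..., N}, computed from the definition."""
    mu = (N + 1) / 2
    return math.sqrt(sum((k - mu) ** 2 for k in range(1, N + 1)) / N)

def sd_mean_without_replacement(N, n):
    """Exact SD of the mean of n tanks drawn without replacement."""
    # Every unordered sample of size n is equally likely.
    means = [sum(s) / n for s in itertools.combinations(range(1, N + 1), n)]
    center = sum(means) / len(means)
    return math.sqrt(sum((m - center) ** 2 for m in means) / len(means))

n = 3  # hypothetical sample size, kept small for the enumeration
for N in (6, 20, 60):
    ratio = sd_mean_without_replacement(N, n) / (sigma_uniform(N) / math.sqrt(n))
    # ratio should match the finite population correction sqrt((N-n)/(N-1))
    print(N, round(ratio, 3), round(math.sqrt((N - n) / (N - 1)), 3))
```

The ratio moves toward 1 as N grows relative to n, which is the "doesn't matter so much" case described above.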