sample solutions

Caveat Emptor!  Don't assume there are no mistakes in these solutions!
(And let me know if you find any.)


1. Stock returns

a) The distribution should look like a normal curve, centered at 7.5% with a 22% SD.
b) 4.01
c) Median is -3.5.  Q1 = -8.4, Q3 = 31.7, so IQR = 31.7- (-8.4) = 40.1
d) Let X represent the return from a single stock. P(X < 0) = P(Z < (0 - 7.5)/22) = P(Z < -.341) = 0.3669
f) Averages based on  a random sample of size 10 will be normally distributed with mean 7.5 and SE 22/sqrt(10) = 6.9570
g) P(Xbar < 0) = P(Z < -7.5/6.9570) = P(Z < -1.08) = 0.1401
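
If you'd like to check parts d), f) and g) on a computer, here is a minimal Python sketch using the standard library's statistics.NormalDist. The variable names are my own and not part of the problem. Expect small differences from the Table A answers in the third or fourth decimal place, because the table rounds z to two decimals.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 7.5, 22.0, 10   # population mean and SD (in %), portfolio size

# d) a single stock: P(X < 0)
p_single = NormalDist(mu, sigma).cdf(0)

# f) standard error of the average of n randomly chosen stocks
se = sigma / sqrt(n)

# g) the average of n stocks: P(Xbar < 0)
p_average = NormalDist(mu, se).cdf(0)

print(f"P(X < 0)    = {p_single:.4f}")
print(f"SE          = {se:.4f}")
print(f"P(Xbar < 0) = {p_average:.4f}")
```

Note how averaging 10 stocks shrinks the spread, so a negative average is much less likely than a negative return on one stock.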
h) This portfolio has an average of 4.01%, which is (4.01 - 7.5)/6.9570 = -0.501653 standard errors from the mean, i.e. about half a standard error below 7.5%.
i) Xbar +- z * sigma/sqrt(n)
Xbar +- z * 6.9570
We use z because the standard deviation is known. This is an exact confidence interval because the observations come from a normal distribution.
The confidence level is 90%, which means we find the point that has (100-90)/2 = 5% above it on a standard normal distribution.
So z = 1.64 (you might also choose 1.65) from Table A.  Or, use the last line (df = infinity) of Table D for a more precise answer (1.645).  Just be sure to make clear on the exam which you are using.  I'm going to use the more conservative 1.65 here.
4.01 +- 1.65*6.9570
4.01 +- 11.47905
(-7.46905, 15.48905) is a 90% Confidence Interval for the mean stock return of DC&H's portfolios.
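
Here is a rough Python sketch of the same interval, assuming we let the computer find z instead of reading it from a table. The variable names are mine. Because the computer uses z of about 1.645 rather than the conservative 1.65, its interval comes out slightly narrower than the one above.

```python
from math import sqrt
from statistics import NormalDist

xbar, sigma, n = 4.01, 22.0, 10   # sample average, known population SD, sample size
conf_level = 0.90

se = sigma / sqrt(n)

# The point on the standard normal curve with (1 - 0.90)/2 = 5% above it
z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

margin = z * se
lower, upper = xbar - margin, xbar + margin

print(f"z = {z:.4f}")
print(f"90% CI: ({lower:.3f}, {upper:.3f})")
```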
j) If we were to pick 10 stocks at random, then the average return would be about 7.5%, give or take about 7 percentage points (the SE of 6.9570).
 -- Let's use a 5% significance level.
-- Null Hypothesis:  mean of population equals 7.5%
-- Alt. Hypothesis: mean of population is bigger than 7.5%.
-- The population of stock returns is normally distributed, and the SD of the population is known to be 22%, so we can use a z-test.
Z = (Xbar - 7.5)/(22/sqrt(10))
  = (Xbar - 7.5)/6.9570
So we observe a test statistic of (4.01 - 7.5)/6.9570 = -0.501653.
--The pvalue for this is P(Z > -0.50) = 1 - .3085 = .6915 or 69.15%.
-- We reject the null hypothesis if the pvalue is less than 5%, which is obviously not the case here.  So we conclude that there is insufficient evidence that the portfolio selected by DC&H does any better than what we could get by selecting 10 at random.
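
The test in part j) can be sketched in Python as follows. This is only an illustration with variable names I made up. The p-value differs a little from the table answer because the table rounds z to -0.50.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, sigma, n = 4.01, 7.5, 22.0, 10   # observed average, null mean, known SD, sample size
alpha = 0.05                                 # significance level

# Test statistic: how many standard errors the observed average is from 7.5
z_obs = (xbar - mu0) / (sigma / sqrt(n))

# Upper-tail p-value, because the alternative says "bigger than 7.5%"
p_value = 1 - NormalDist().cdf(z_obs)

reject = p_value < alpha

print(f"z = {z_obs:.6f}, p-value = {p_value:.4f}, reject H_0: {reject}")
```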

2. Magnets

a)  This is not a valid interpretation of a confidence interval.  We know that roughly 68% of the patients are within 1SD of average, 95% within 2, etc.
So "about 95% are within 2 SDs" means about 95% are within 100 points of -25, or between -125 and 75.  About 68% are within 1 SD, so between -75 and 25.  Neither of these intervals describes where 90% of the patients fell, but we can see that both of them are much wider than the confidence interval.  (The reason is that the confidence interval is an estimate of the mean, but the question here was about individual observations.)
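
To see the arithmetic, here is a small Python sketch of the 1 SD and 2 SD ranges for individual patients. The last part goes one step beyond the solution above: it assumes the individual changes are roughly normal (the same assumption behind the 68%/95% rule) and estimates where the middle 90% of patients would fall. That assumption and the variable names are mine.

```python
from statistics import NormalDist

mean_change, sd = -25.0, 50.0   # sample mean and SD of the change in pain score

# Ranges for individual patients from the 68%/95% rule
one_sd = (mean_change - sd, mean_change + sd)
two_sd = (mean_change - 2 * sd, mean_change + 2 * sd)

# Assumption: changes are roughly normal, so the middle 90% lies within
# about 1.645 SDs of the mean
z90 = NormalDist().inv_cdf(0.95)
middle_90 = (mean_change - z90 * sd, mean_change + z90 * sd)

print("about 68% of patients:", one_sd)
print("about 95% of patients:", two_sd)
print(f"about 90% of patients: ({middle_90[0]:.1f}, {middle_90[1]:.1f})")
```

All three ranges describe individual patients, which is why they are so much wider than an interval for the mean.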
b) Before plowing in and doing a confidence interval or hypothesis test, let's look at the design itself.  This is quite a sloppy study.  The subjects were recruited from magazines, which presumably means that there might be a response bias, since people interested in using magnets for pain are probably more inclined to believe that they work.  Not only is pain very subjective, but it varies quite a bit depending on the context or emotions you're feeling at the time.  So the "pain level" measured here should be scrutinized carefully before we believe that changes in pain level correspond to actual, physical changes in pain.  Note there's no control group, so we have no way of knowing how pain levels would have changed if no treatment had been given.  Because there's no control group, there's also no blinding, single or double, in the study.  Patients will know whether or not they are wearing magnets and will be taking this information into account when responding to the questionnaire.

So whether we do a CI or a hypothesis test, we should be very wary of the findings. In fact, this study is so poorly designed that we probably shouldn't believe anything that comes out of it!

But, in the interest of education, let's do both a confidence interval and a hypothesis test anyway.

    A 95% confidence interval is Xbar +- t(99) * s/sqrt(100)
We use a t distribution here because the SD of the population is unknown, and is being estimated with the SD of the sample (s = 50).
You don't have a line on your table for 99 df, but it's close to the values for 100 df.  So we'll use 1.984. (From the computer, the value for 100 df is actually 1.983972, while the value for df = 99 is 1.984217.  So we're not losing much by using 1.984.)

-25 +- 1.984*(50/10)
(-34.92, -15.08) is a 95% confidence interval.
The whole interval lies below 0, so 0 is not a plausible value for the mean change; taken at face value, this suggests that the drop in pain was "real".  But remember that we're not going to believe anything we conclude from this.
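
The interval arithmetic can be checked with a few lines of Python. This sketch simply plugs in the table value 1.984 from above rather than computing the t critical value itself; the variable names are mine.

```python
from math import sqrt

xbar, s, n = -25.0, 50.0, 100   # sample mean change, sample SD, number of patients
t_crit = 1.984                  # table value for df = 100, standing in for df = 99

se = s / sqrt(n)                # estimated standard error of the mean
margin = t_crit * se
lower, upper = xbar - margin, xbar + margin

print(f"SE = {se}, margin = {margin:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```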

How about a hypothesis test?  Let's start with a significance level of 5%.
H_0: Mean change is 0
H_a: Mean change is less than 0

T = (Xbar - 0)/(50/sqrt(100))
Observed value is -25/5 = -5.
Look up the df=100 line (should be df=99 but using df=100 is a good enough approximation) and you'll see that the p-value is off the chart.  But from the trend we can conclude that it's well below .05% (0.0005).  So we're forced to conclude that the effect is statistically significant!
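
If you're curious how far "off the chart" the p-value is, here is a rough Python sketch that approximates it with only the standard library. It integrates the t density numerically with Simpson's rule, which is my own shortcut and not how the table was built; the helper names t_pdf and t_lower_tail are made up for this illustration. If SciPy is available, scipy.stats.t.cdf should give the same tail area more directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_lower_tail(t, df, lower=-60.0, steps=20000):
    """Approximate P(T < t) by Simpson's rule on [lower, t].

    Assumes the area below `lower` is negligible, which holds for the
    df used here. `steps` must be even.
    """
    h = (t - lower) / steps
    total = t_pdf(lower, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(lower + i * h, df)
    return total * h / 3

xbar, s, n = -25.0, 50.0, 100
t_obs = (xbar - 0) / (s / math.sqrt(n))    # observed test statistic
p_value = t_lower_tail(t_obs, df=n - 1)    # lower tail, since H_a says "less than 0"

print(f"t = {t_obs}, p-value = {p_value:.2e}")
```

As a sanity check, t_lower_tail(-1.984, 99) should come out close to 0.025, matching the critical value used for the 95% confidence interval.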

But this should serve as a good lesson that you should never believe something is true just because it is "statistically significant."  You need to examine how the data were collected, too.
 

c) A good study would select a number of people without telling them that magnets were being tested.  They would then be randomly divided into a control group, which will wear something that looks and feels like a magnet but is not, and a treatment group, which will wear magnets.  Neither the subjects nor the investigators will know, until the end of the study, who wore which.  There's really no getting around the fact that we have to have some way of measuring pain, and this will be difficult, but hopefully we'll find something sound that many other legitimate researchers would use as well.  (Easier said than done.)  A study like this wouldn't be perfect, but would be an improvement.