Homework 8

Due Friday, May 31.  Also see HW9, due the same day.


Note: a useful command for this homework assignment will be the t.test command.  Type help(t.test) to learn about it.

0. p. 332: 6, 8 (state the assumptions you must make), 13, 14

1. Some people claim that if you spin a coin on a smooth surface and wait for it to fall, the probability that it will land Heads is NOT 50%.  Suppose we spin a coin n times and count the number of heads.  Let X represent this number, and let p represent the probability that the coin will land heads.

a) Let Y = X/n.  Show that Y is an unbiased estimator of p.
b) Take a quarter and spin it 30 times.  What is your estimate for p?  What is an approximate 95% confidence interval for p?
c) Perform a hypothesis test of p = .50 against p > .50.
    i) What is the formula for the test statistic you will use?  What is its distribution?
    ii) What value does the test statistic take for your data (from part b)?
    iii) What is the p-value?
    iv) Using a significance level of 5%, would you reject the null hypothesis?  How about 10%?
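
The calculations in question 1 can be sketched in R as follows.  The count of 18 heads in 30 spins is a made-up illustration, not real data, and the variable names are chosen here only for the example.  The sketch assumes the normal approximation to the binomial, which is only rough at n = 30.

```r
# Hypothetical result: 18 heads in 30 spins (replace with your own count)
n <- 30
x <- 18

p_hat <- x / n                               # point estimate of p

# Approximate 95% confidence interval (normal approximation)
se_hat <- sqrt(p_hat * (1 - p_hat) / n)
ci <- p_hat + c(-1, 1) * qnorm(0.975) * se_hat

# Test of H0: p = 0.50 against Ha: p > 0.50
p0 <- 0.50
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # approximately N(0, 1) under H0
p_value <- 1 - pnorm(z)                      # upper-tail probability

print(c(estimate = p_hat, lower = ci[1], upper = ci[2]))
print(c(z = z, p_value = p_value))
```

Note that the confidence interval uses the estimated standard error, while the test statistic uses the standard error implied by the null value p0.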

2. Download the bloodpressure data set (bloodpressure.dat).  (Remember, once the file is in your working directory, type bp <- read.table("bloodpressure.dat", header=TRUE) to create an object called bp.  Then type attach(bp) to use the variable names, and names(bp) to see what these names are.)

These data come from an actual study of the effectiveness of the drug Captopril.  The claim is that this drug lowers blood pressure (a good thing).  The data consist of systolic and diastolic blood pressures measured on 15 patients before taking the drug (sbefore and dbefore) and then again 15 minutes after taking the drug (safter and dafter).  There are two other variables, sdiff and ddiff, that measure the difference in blood pressure (safter - sbefore, for example).  Thus, a negative value for sdiff is a good thing; it means that systolic pressure went down.
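
The loading step can be tried out on a small stand-in file before touching the real data.  In this sketch the file contents, the temporary file, and the name bp_demo are all hypothetical; only the column names sbefore, safter, and sdiff follow the description above.

```r
# Write a tiny hypothetical file so the example is self-contained
tmp <- tempfile(fileext = ".dat")
writeLines(c("sbefore safter",
             "180 165",
             "172 170",
             "190 176"), tmp)

bp_demo <- read.table(tmp, header = TRUE)   # first line supplies column names
print(names(bp_demo))

# Columns can also be reached without attach(), via $ or with()
bp_demo$sdiff <- bp_demo$safter - bp_demo$sbefore
print(with(bp_demo, mean(sdiff)))
```

Using $ or with() is an alternative to attach(bp) that avoids name clashes with other objects in the workspace; either approach works for this assignment.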

Let's examine systolic blood pressure.

a) Make side-by-side boxplots of the systolic blood pressure before and after.  What does it look like the effect of the drug is?  (boxplot(sbefore, safter))
b) Make a histogram of the change in blood pressure, sdiff.  Comment on the shape of the distribution.  
c) Do you think the distribution of sdiff could be modelled with a normal distribution?
d) Perform a hypothesis test to determine whether the data suggest the drug is effective.
    i) State the null and alternative hypotheses.
    ii) Give the formula for the test statistic you will use.  What is its distribution?  What assumptions must you make?
    iii) What is the value of the test statistic for these data?
    iv) What is the p-value?
    v) Choose a significance level.  What do you conclude?
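
A minimal sketch of how t.test can be applied to a set of differences is given below.  The numbers are simulated stand-ins, not the bloodpressure.dat values, and the names before, after, and d are hypothetical.  The sketch assumes a one-sided alternative (mean difference below zero) and that the differences are roughly normal.

```r
# Simulated stand-in for 15 patients; NOT the real bloodpressure.dat values
set.seed(1)
before <- rnorm(15, mean = 175, sd = 20)
after  <- before + rnorm(15, mean = -10, sd = 8)
d <- after - before              # same sign convention as sdiff

# One-sample t statistic for H0: mean difference = 0
n <- length(d)
t_manual <- mean(d) / (sd(d) / sqrt(n))
p_manual <- pt(t_manual, df = n - 1)    # lower tail, since Ha is "mean < 0"

# The same test via t.test()
res <- t.test(d, mu = 0, alternative = "less")

print(c(t_manual = t_manual, p_manual = p_manual))
print(res)
```

Computing the statistic by hand and comparing it with the t.test output is a useful check that the hypotheses were specified in the intended direction.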

3. A commonly used non-parametric test for data like those in question 2 is called the sign test.  Non-parametric tests are used when you are unsure of the distribution of the population or are not willing to make any assumptions about what the distribution might be.  Their drawback is that they tend to be conservative, which means that they are less likely to reject the null hypothesis when it is in fact false (and therefore should be rejected).

This is how the test works.  Suppose the drug were ineffective.  Then differences in blood pressure depend solely upon measurement error (which can be substantial for blood pressure) and medically unimportant changes in the patient (such as increased or decreased anxiety).  Thus, some patients will have slightly higher blood pressure, some slightly lower, and the probability of a patient's blood pressure change being negative is 50%.  In other words, it's just like flipping a coin: "heads" means their blood pressure goes down, "tails" means it goes up.

Our null hypothesis is, therefore, that p = .50,  where p is the probability that a patient's blood pressure will go down after taking the drug.
Answer these questions using the blood pressure data set from the last question.
a) State the alternative hypothesis.
b) Let X represent the number of patients for whom blood pressure goes down.  What is the distribution of X?
c) For how many patients did blood pressure go down?
d) What is the p-value?  That is, if x is the number of patients for whom blood pressure went down, what is P(X >= x) when the null hypothesis is true?
e) Would you reject the null hypothesis?
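
The sign-test calculation can be sketched in R as follows.  The count of 11 out of 15 is hypothetical, not taken from the data set; with the real data, the count could be obtained with something like sum(sdiff < 0).  The sketch assumes the alternative is that p > .50, so the p-value is an upper-tail probability that includes the observed count itself.

```r
# Hypothetical count: 11 of 15 patients with lower pressure (not the real data)
n <- 15
x <- 11

# Under H0, X ~ Binomial(n, 0.5); the upper-tail p-value is P(X >= x)
p_value <- sum(dbinom(x:n, size = n, prob = 0.5))

# Equivalent forms
p_alt <- 1 - pbinom(x - 1, size = n, prob = 0.5)
res <- binom.test(x, n, p = 0.5, alternative = "greater")

print(c(p_value = p_value, p_alt = p_alt, binom_test = res$p.value))
```

Because pbinom(q, ...) returns P(X <= q), the upper tail starting at x requires pbinom(x - 1, ...), not pbinom(x, ...).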

4. Notice that there is no control group in this study of Captopril.  How does this affect your conclusions in problems 2 and 3?  Explain.

5. Consider, once again, the risk data.
a) Is there evidence that men and women differ in their perception of the risk level associated with using household appliances?  
Hint: to create the data for men and for women:
mrisk <- appliances[gender=="male"]
frisk <- appliances[gender=="female"]

b) What assumptions did you have to make?  Do you think they're true?  Explain.
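
One way to compare two independent groups with t.test is sketched below.  The ratings are invented for illustration and are NOT the risk data; the names male_scores and female_scores are hypothetical stand-ins for vectors like mrisk and frisk.  The sketch assumes a two-sided alternative.

```r
# Hypothetical risk ratings; NOT the real risk data
male_scores   <- c(12, 18, 25, 9, 30, 22, 15, 20)
female_scores <- c(20, 28, 35, 18, 26, 31, 24, 29, 33)

# Welch two-sample t-test (R's default; does not assume equal variances)
welch <- t.test(male_scores, female_scores)

# Pooled-variance version, which additionally assumes equal variances
pooled <- t.test(male_scores, female_scores, var.equal = TRUE)

print(welch)
print(pooled)
```

The two versions differ in their degrees of freedom and in the assumptions they require, which is relevant to part b.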