Homework 8
Due Friday, May 31. Also see HW9, due the same day.
Note: a useful command for this homework assignment will be the t.test
command. Type help(t.test) to learn about it.
0. p. 332: 6, 8 (state the assumptions you must make), 13, 14
1. Some people claim that if you spin a coin on a smooth surface and wait
for it to fall, the probability that it will land Heads is NOT 50%.
Suppose we spin a coin n times and count the number of heads. Let
X represent this number, and let p represent the probability that it will
land heads.
a) Let Y = X/n. Show that Y is an unbiased estimator of p.
b) Take a quarter and spin it 30 times. What is your estimate for
p? What is an approximate 95% confidence interval for p?
c) Perform a hypothesis test of p = .50 against p > .50.
i) What is the formula for the test statistic you will
use? What is its distribution?
ii) What value does the test statistic take for your
data (from part b)?
iii) What is the p-value?
iv) Using a significance level of 5%, would you reject
the null hypothesis? How about 10%?
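The calculations in problem 1 can be sketched in R along the following lines. This is only a sketch under stated assumptions: the count of 18 heads is a made-up value for illustration (use your own spins), and the interval and test shown are the usual large-sample normal approximations, which may or may not be the method you are expected to use.

```r
# Hypothetical counts for illustration only; substitute your own spins.
n  <- 30      # number of spins
x  <- 18      # hypothetical number of heads (an assumption, not real data)
p0 <- 0.50    # value of p under the null hypothesis

p_hat  <- x / n                                   # point estimate of p
se_hat <- sqrt(p_hat * (1 - p_hat) / n)           # estimated standard error of p_hat
ci     <- p_hat + c(-1, 1) * qnorm(0.975) * se_hat  # approximate 95% interval

# Large-sample z statistic. The standard error uses p0 because it is
# computed assuming the null hypothesis is true.
z       <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value <- pnorm(z, lower.tail = FALSE)           # one-sided: alternative is p > p0

print(c(estimate = p_hat, lower = ci[1], upper = ci[2], z = z, p.value = p_value))
```

Note that the confidence interval uses the estimated p in its standard error, while the test statistic uses the hypothesized value p0.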
2. Download the bloodpressure data set (bloodpressure.dat). (Remember, once
the file is in your working directory, type
bp <- read.table("bloodpressure.dat", header=TRUE)
to create an object called bp. Then type attach(bp) to use the variable
names, and names(bp) to see what these names are.)
These data come from an actual study of the effectiveness of the drug Captopril.
The claim is that this drug lowers blood pressure (a good thing).
The data consist of systolic and diastolic blood pressures measured
on 15 patients before taking the drug (sbefore and dbefore) and then again
15 minutes after taking the drug (safter and dafter). There are two other
variables, sdiff and ddiff, that measure the difference in blood pressure
(safter - sbefore, for example). Thus, a negative value for sdiff is
a good thing; it means that systolic pressure went down.
Let's examine systolic blood pressure.
a) Make side-by-side boxplots of the systolic blood pressure before and
after. What does the effect of the drug appear to be? (Use
boxplot(sbefore, safter).)
b) Make a histogram of the change in blood pressure, sdiff. Comment
on the shape of the distribution.
c) Do you think the distribution of sdiff could be modelled with a normal
distribution?
d) Perform a hypothesis test of whether the data suggest the drug is effective.
i) State the null and alternative hypotheses.
ii) Give the formula for the test statistic you will
use. What is its distribution? What assumptions must you make?
iii) What is the value of the test statistic for these
data?
iv) What is the p-value?
v) Choose a significance level. What do you conclude?
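As a rough illustration of how the t.test command relates to the formula in part d), the sketch below computes a one-sample t statistic by hand and then with t.test. The vector d holds made-up differences, not the values in bloodpressure.dat, and the one-sided alternative "less" is an assumption that matches the claim that the drug lowers blood pressure.

```r
# Made-up differences (after - before) for illustration only;
# these are NOT the values in bloodpressure.dat.
d <- c(-12, -5, 3, -8, -15, 0, -7, -10, 2, -6)

n       <- length(d)
t_stat  <- mean(d) / (sd(d) / sqrt(n))   # one-sample t statistic for H0: mean difference = 0
p_value <- pt(t_stat, df = n - 1)        # lower tail, because a drop is a negative difference

# t.test carries out the same computation.
res <- t.test(d, mu = 0, alternative = "less")

print(c(manual_t = t_stat, manual_p = p_value))
print(res)
```

With the real data attached, the analogous call would be t.test(sdiff, mu = 0, alternative = "less"), or equivalently t.test(safter, sbefore, paired = TRUE, alternative = "less").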
3. A commonly used non-parametric test for data like those in question
2 is called the sign test. Non-parametric tests are used when you
are unsure of the distribution of the population or are not willing to make
any assumptions about what the distribution might be. Their drawback
is that they tend to be conservative, which means that they are less likely
than their parametric counterparts to reject the null hypothesis when it is
in fact false (and therefore should be rejected). This is how the test
works: Suppose the drug was ineffective. Then differences in blood pressure
depend solely upon measurement error (which can be substantial for blood
pressure) and medically unimportant changes in the patient (such as
increased or decreased anxiety). Thus, some patients will have slightly
higher blood pressure, some slightly lower, and the probability of a
patient's blood pressure change being negative is 50%. In other words, it's
just like flipping a coin: "Heads" means their blood pressure goes down,
"Tails" means it goes up.
Our null hypothesis is, therefore, that p = .50, where p is the probability
that a patient's blood pressure will go down after taking the drug.
Answer these questions using the blood pressure data set from the last
question.
a) State the alternative hypothesis.
b) Let X represent the number of patients for whom blood pressure goes
down. What is the distribution of X?
c) For how many patients did blood pressure go down?
d) What is the p-value? That is, if x is the number of patients for
whom blood pressure went down, what is P(X >= x) when the null hypothesis
is true?
e) Would you reject the null hypothesis?
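The sign test described above can be sketched in R as follows. The differences in d are made up for illustration and are not the study data. Dropping patients whose blood pressure did not change is a common convention but is an assumption here, as is taking the p-value to be the probability of at least x decreases, P(X >= x).

```r
# Made-up differences (after - before) for illustration only;
# these are NOT the values in bloodpressure.dat.
d <- c(-12, -5, 3, -8, -15, 0, -7, -10, 2, -6)

d <- d[d != 0]     # assumption: drop ties (no change), a common convention
n <- length(d)     # number of patients whose pressure changed
x <- sum(d < 0)    # number of patients whose pressure went down

# Under H0, X is Binomial(n, 0.5). P(X >= x) = 1 - P(X <= x - 1).
p_value <- pbinom(x - 1, size = n, prob = 0.5, lower.tail = FALSE)

# binom.test performs the same exact binomial test.
res <- binom.test(x, n, p = 0.5, alternative = "greater")

print(c(n = n, x = x, p.value = p_value))
print(res)
```

The test uses only the signs of the differences, not their sizes, which is why it needs no assumption about the shape of the distribution.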
4. Notice that there is no control group in this study of Captopril. How
does this affect your conclusion in problems 2 and 3? Explain.
5. Consider, once again, the risk data.
a) Is there evidence that men and women differ in their perception of the
risk level associated with using household appliances?
Hints: to create the data for men and for women:
mrisk <- appliances[gender=="male"]
frisk <- appliances[gender=="female"]
b) What assumptions did you have to make? Do you think they're true?
Explain.
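One possible way to compare the two groups is a two-sample t test, sketched below. The appliances and gender vectors here are made-up stand-ins for the variables in the risk data, and the choice of a two-sided t test is an assumption, not necessarily the intended method.

```r
# Made-up ratings for illustration only; these are NOT the course's risk data.
appliances <- c(12, 20, 15, 30, 25, 18, 22, 35, 28, 16)
gender     <- c("male", "female", "male", "female", "female",
                "male", "male", "female", "female", "male")

mrisk <- appliances[gender == "male"]     # ratings given by men
frisk <- appliances[gender == "female"]   # ratings given by women

# Two-sided two-sample t test of equal means. By default t.test uses the
# Welch version, which does not assume equal variances in the two groups;
# var.equal = TRUE gives the pooled-variance version instead.
welch  <- t.test(mrisk, frisk)
pooled <- t.test(mrisk, frisk, var.equal = TRUE)

print(welch)
print(pooled)
```

Which version is appropriate depends on the assumptions you are willing to make in part b).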