HOMEWORK #8 ANSWERS

Chapter 26

2. The data are like 3800 draws made at random with replacement from a box | ?? 0's ?? 1's |, with 1 = red.

(a) Null: The fraction of 1's in the box is 18/38.

Alt: The fraction of 1's in the box is more than 18/38.

(b) The expected number of reds (computed using the null) is 1800. The SD of the box (also computed using the null) is nearly 0.5, so the SE for the number of reds is √3800 x 0.5 ≈ 31. So z = (obs-exp)/SE = (1890-1800)/31 ≈ 2.9, and P ≈ 2/1000.

(c) Yes.

Comments. (i) This problem is about the number of reds. In the formula for the z-statistic, obs, exp, and SE all refer to the number of reds. The expected, as always, is computed from the null. In this problem, the null gives the composition of the box, so the SD is computed from the null; it is not estimated from the data (p.487).

(ii) This problem, and several others below, can be done using one-sided or two-sided tests. The distinction does not matter here; it is discussed in chapter 29.
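The arithmetic in part (b) can be checked with a short script. This is only a sketch: the variable names and the helper upper_tail are made up for illustration, and it uses the exact SD of the box under the null rather than the rounded value 0.5, so the SE and z differ slightly from the rounded figures above.

```python
import math

def upper_tail(z):
    """Area under the standard normal curve to the right of z (illustrative helper)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

n_draws = 3800          # number of draws from the box
p_null = 18 / 38        # fraction of 1's in the box, according to the null
observed_reds = 1890    # observed number of reds

expected_reds = n_draws * p_null               # 1800, computed from the null
sd_box = math.sqrt(p_null * (1 - p_null))      # SD of a 0-1 box, nearly 0.5
se_reds = math.sqrt(n_draws) * sd_box          # SE for the number of reds
z = (observed_reds - expected_reds) / se_reds
p_value = upper_tail(z)                        # one-sided P-value

print(f"expected = {expected_reds:.0f}, SE = {se_reds:.1f}, z = {z:.2f}, P = {p_value:.4f}")
```

The one-sided P-value comes out at roughly 2 in 1000, in line with the answer to (b).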

5. The box has one ticket for each freshman at the university, showing how many hours per week that student spends at parties. So there are about 3000 tickets in the box. The data are like 100 draws from the box. The null hypothesis says that the average of the box is 7.5 hours. The alternative says that the average is less than 7.5 hours. The observed value for the sample average is 6.6 hours. The SD of the box is not known, but can be estimated from the data as 9 hours. On this basis, the SE for the sample average is estimated as 0.9 hours. Then z = (obs-exp)/SE ≈ (6.6 - 7.5)/0.9 = -1. The difference looks like chance.
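A minimal sketch of the same calculation, assuming the SD of the data (9 hours) stands in for the unknown SD of the box. The names and the helper lower_tail are illustrative. The answer above does not quote a P-value; the script computes the one-sided area to the left of z under the normal curve.

```python
import math

def lower_tail(z):
    """Area under the standard normal curve to the left of z (illustrative helper)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

n = 100              # number of draws
null_avg = 7.5       # average of the box, according to the null
sample_avg = 6.6     # observed sample average
sample_sd = 9.0      # SD of the data, used to estimate the SD of the box

se_avg = sample_sd / math.sqrt(n)        # SE for the sample average: 0.9 hours
z = (sample_avg - null_avg) / se_avg     # about -1
p_value = lower_tail(z)                  # one-sided, matching the alternative "less than 7.5"

print(f"SE = {se_avg:.2f}, z = {z:.2f}, P = {p_value:.2f}")
```

With z near -1, the area to the left is about 16%, which is why the difference looks like chance.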

8. Model: There is one ticket in the box for each person in the county, age 18 and over. The ticket shows that person's educational level. The data are like 1000 draws from the box.

Null: The average of the box is 13 years.

Alt: The average of the box isn't 13 years.

The expected value for the average of the draws is 13 years, based on the null. The SD of the box is unknown (there is no reason the spread in the county should equal the spread in the nation), but can be estimated as 5 years--the SD of the data. On this basis, the SE for the sample average is estimated as 0.16 years. The observed value for the sample average is 14 years, so

z = (obs-exp)/SE = (14 - 13)/0.16 ≈ 6,

and P ≈ 0. This is probably a rich, suburban county, where the educational level would be higher than average.

12. (a) There are 59 pairs, and in 52 of them, the treatment animal has a heavier cortex. On the null hypothesis, the expected number is 59 x 0.5 = 29.5 and the SE is √59 x 0.5 ≈ 3.84. So 52 is nearly 6 SEs above average, and the chance is close to 0. Inference: treatment made the cortex weigh more.

(b) The average is about 36 milligrams and the SD is about 31 milligrams. The SE for the average is 4 milligrams, so z = 36/4 = 9 and P ≈ 0. (This is like the tax example in section 1.) Inference: treatment made the cortex weigh more.

(c) This blinds the person doing the dissection to the treatment status of the animal. It is a good idea, because it prevents bias; otherwise, the technician might skew the results to favor the research hypothesis.
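The figures in 12(a) and 12(b) can be reproduced as follows. This is a sketch with made-up variable names. The exact binomial tail at the end is an extra check that is not part of the answer above; it is included only to show that the chance is indeed close to 0.

```python
import math

# Part (a): sign test on 59 pairs, null chance 0.5 that the treatment animal is heavier
n_pairs = 59
n_heavier = 52
expected = n_pairs * 0.5                   # 29.5
se_count = math.sqrt(n_pairs) * 0.5        # about 3.84
z_a = (n_heavier - expected) / se_count    # nearly 6

# Extra check (not in the answer above): exact chance of 52 or more out of 59
exact_tail = sum(math.comb(n_pairs, k) for k in range(n_heavier, n_pairs + 1)) / 2 ** n_pairs

# Part (b): z-test on the average of the 59 differences, null average 0
avg_diff = 36.0                            # milligrams
sd_diff = 31.0                             # milligrams
se_avg = sd_diff / math.sqrt(n_pairs)      # about 4 milligrams
z_b = avg_diff / se_avg                    # about 9

print(f"(a) SE = {se_count:.2f}, z = {z_a:.2f}, exact tail = {exact_tail:.1e}")
print(f"(b) SE = {se_avg:.2f}, z = {z_b:.2f}")
```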

Chapter 27

3. There are two samples, so you need to make a two-sample z-test. Model: There are two boxes. The 1985 box has a ticket for each person in the population, marked 1 for those who would rate clergymen "very high or high," and 0 otherwise. The 1985 data are like 1000 draws from the 1985 box. The 1992 box is set up the same way. The null hypothesis says that the percentage of 1's in the 1985 box is the same as in the 1992 box. The alternative hypothesis says that the percentage of 1's in the 1992 box is smaller than the percentage of 1's in the 1985 box.

The SD of the 1985 box is estimated from the data as √(0.67 x 0.33) ≈ 0.47. On this basis, the SE for the 1985 number is √1000 x 0.47 ≈ 15: the number of respondents in the sample who rate clergymen "very high or high" is 670, and the chance error in that number is around 15. Convert the 15 to percent, relative to 1000. The SE for the 1985 percentage is estimated as 1.5%. Similarly, the SE for the 1992 percentage is about 1.6%.

The SE for the difference is computed from the square root law (p.504) as √(1.5² + 1.6²) ≈ 2.2%.

The observed difference is 54 - 67 = -13%. In the null hypothesis, the expected difference is 0%. So z = (obs - exp)/SE = -13/2.2 ≈ -6, and P ≈ 0. The difference is real. What the cause is, the test cannot say.

Comment. Either a one-sided or a two-sided test can be used. Here, the distinction is not so relevant: it is discussed in chapter 29.
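The two-sample calculation can be sketched in a few lines. The helper se_percent and the variable names are illustrative. The script assumes the 1992 sample also has 1000 respondents, which is not stated explicitly above but agrees with the quoted SE of 1.6%.

```python
import math

def se_percent(pct, n):
    """SE for a sample percentage, with the SD of the 0-1 box estimated from the data."""
    p = pct / 100
    sd_box = math.sqrt(p * (1 - p))
    return 100 * sd_box / math.sqrt(n)

n_1985 = 1000
n_1992 = 1000            # assumption: same sample size as in 1985
pct_1985 = 67.0
pct_1992 = 54.0

se_1985 = se_percent(pct_1985, n_1985)             # about 1.5%
se_1992 = se_percent(pct_1992, n_1992)             # about 1.6%
se_diff = math.sqrt(se_1985 ** 2 + se_1992 ** 2)   # square root law for independent samples
z = (pct_1992 - pct_1985 - 0) / se_diff            # expected difference is 0 under the null

print(f"SEs = {se_1985:.2f}%, {se_1992:.2f}%, SE of difference = {se_diff:.2f}%, z = {z:.1f}")
```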

7. (a) This is like the radiation-surgery example in section 4. (Also see review exercises 5 and 6.) The SE for the treatment percent is 2.0%; for the control, 4.0%; for the difference, 4.5%. The observed difference, 1.1%, is only 0.24 of an SE. This could easily be due to chance.

(b) This is like example 4. The SE for the treatment average is 0.7 weeks; for the control average, 1.4 weeks; for the difference, 1.6 weeks. The observed difference is -7.5 weeks. So z = -7.5/1.6 ≈ -4.7 and P ≈ 0. The difference appears real. Income support makes the released prisoners work less.

8. The data can be summarized as

                 Prediction request    Request only
Predicts yes     22/46                 NA
Then agrees      14/46                 2/46

(a) This is like the radiation-surgery example in section 4. The two percentages are 47.8% and 4.4%. The SEs are about 7.4% and 3.0%, respectively. The difference is 43.4% and the SE for the difference is 8%. So z = 43.4/8 ≈ 5.4 and P ≈ 0. The difference is real. People overestimate their willingness to do volunteer work.

(b) The two percentages are 30.4% and 4.4%; the SEs are about 6.8% and 3.0%, respectively. The difference is 26% and the SE for the difference is 7.4%. So z = 26/7.4 ≈ 3.5 and P is less than 2/10,000. The difference is real. Asking people to predict their behavior changes what they will do.

(c) Here, a two-sample z-test is not legitimate. There is only one sample, and two responses for each person in the sample. Both responses are observed, so the method of section 4 does not apply. The responses are correlated, so the method of example 3 does not apply. See p.519.

Comment. In parts (a) and (b), the number of draws is small relative to the number of tickets in the box. So there is little difference between drawing with or without replacement, and little dependence between the treatment and control averages. See pp.512, 519.
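Parts (a) and (b) of exercise 8 can be checked directly from the counts in the summary table. This is a sketch: the function two_sample_z is a made-up helper, and because it works with unrounded percentages and SEs, its z-values differ slightly from the rounded ones above. It does not apply to part (c), where the two responses come from the same people.

```python
import math

def two_sample_z(count_a, n_a, count_b, n_b):
    """z-statistic for the difference between two independent sample percentages."""
    p_a = count_a / n_a
    p_b = count_b / n_b
    se_a = math.sqrt(p_a * (1 - p_a) / n_a)    # SE for the first percentage (as a fraction)
    se_b = math.sqrt(p_b * (1 - p_b) / n_b)    # SE for the second percentage (as a fraction)
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (p_a - p_b) / se_diff

# (a) predicts yes (22 of 46) versus agrees in the request-only group (2 of 46)
z_a = two_sample_z(22, 46, 2, 46)

# (b) agrees after predicting (14 of 46) versus agrees in the request-only group (2 of 46)
z_b = two_sample_z(14, 46, 2, 46)

print(f"(a) z = {z_a:.1f}   (b) z = {z_b:.1f}")
```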

10. This test is not legitimate. The sample was not chosen at random; and there is dependence between the first-borns and second-borns.

DISC (Lab Manual)

12. -31.652