A random sample of 10 countries was selected for your final. Here is some information on them from the 1997 CIA World Factbook.

| Country | Total Fertility Rate (i.e. average number of children born per woman) | Percentage Change in Gross Domestic Product, 1998-1999 (estimated) | Female Life Expectancy in years (i.e. median age at death) |
| --- | --- | --- | --- |
| Afghanistan | 6.07 | -0.5% | 46 |
| Cambodia | 5.81 | -1.5% | 52 |
| Costa Rica | 2.85 | -2.1% | 78 |
| Indonesia | 2.66 | -2.8% | 64 |
| Italy | 1.16 | +3.1% | 82 |
| Jordan | 4.94 | -0.4% | 70 |
| Nigeria | 6.17 | +1.0% | 56 |
| Russia | 1.35 | -1.4% | 60 |
| United Arab Emirates | 3.62 | +1.2% | 73 |
| United States | 2.06 | +4.0% | 79 |
| AVERAGE | 3.669 | 0.06% | 66 |
| STAND. DEV. | 1.8486 | 2.1186% | 12 |

1. What is the correlation between Female Life Expectancy and fertility?

r = -.69 (computed with the rounded SD of 12 for life expectancy; the unrounded data give about -.71)
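For anyone who wants to check problem 1 by machine, here is a minimal sketch that computes r straight from the ten rows of the table. The helper names (`mean`, `pop_sd`, `correlation`) are made up for this illustration. It works from the unrounded data, so its result can differ in the second decimal from a value worked out by hand with the table's rounded SDs.

```python
import math

# Data copied from the table above (10 sampled countries).
fertility = [6.07, 5.81, 2.85, 2.66, 1.16, 4.94, 6.17, 1.35, 3.62, 2.06]
life_exp = [46, 52, 78, 64, 82, 70, 56, 60, 73, 79]

def mean(values):
    return sum(values) / len(values)

def pop_sd(values):
    """SD with divisor n, matching the STAND. DEV. row of the table."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def correlation(xs, ys):
    """Average product of deviations divided by the product of the SDs."""
    mx, my = mean(xs), mean(ys)
    avg_product = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return avg_product / (pop_sd(xs) * pop_sd(ys))

r = correlation(fertility, life_exp)
print(f"r = {r:.2f}")  # about -0.71 from the unrounded data
```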

2. What is the regression equation for predicting Female Life Expectancy from fertility?

life expectancy = -4.4838(fertility) + 82.451

3. Albania's fertility rate is 2.77. What is the predicted Female Life Expectancy for Albania?

70.03 = -4.4838(2.77) + 82.451 (about 70 years)
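The regression line in problems 2 and 3 can be reproduced with a short least-squares sketch. This is only an illustration under the assumption that the line is the ordinary least-squares fit to the ten table rows; the function name `predict_life_expectancy` is hypothetical.

```python
# Data copied from the table above (10 sampled countries).
fertility = [6.07, 5.81, 2.85, 2.66, 1.16, 4.94, 6.17, 1.35, 3.62, 2.06]
life_exp = [46, 52, 78, 64, 82, 70, 56, 60, 73, 79]

n = len(fertility)
mean_x = sum(fertility) / n
mean_y = sum(life_exp) / n

# Least-squares slope: sum of products of deviations over sum of squared x deviations.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(fertility, life_exp))
sxx = sum((x - mean_x) ** 2 for x in fertility)
slope = sxy / sxx

# The line passes through the point of averages.
intercept = mean_y - slope * mean_x

def predict_life_expectancy(fertility_rate):
    """Predicted female life expectancy for a given total fertility rate."""
    return slope * fertility_rate + intercept

albania = predict_life_expectancy(2.77)
print(f"slope = {slope:.4f}, intercept = {intercept:.3f}")
print(f"Albania: {albania:.2f} years")
```

The slope and intercept should agree with the equation in problem 2 to the digits shown there, and the Albania prediction with problem 3.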

4. What is the median Percentage Change in Gross Domestic Product?

-.45%

5. Economists use Percentage Change in Gross Domestic Product as a predictor of increased prosperity (positive change) or recession (negative change or no change). The average for the 10 countries above is a +0.06% change, so some economists are using this as a signal of "good times" ahead for the global economy; others are arguing that there will be a recession next year. The long-run expected change is 0% (no change).

Please test the hypothesis that there will be good times ahead using information from the sample of 10 countries above. Formulate the null and the alternative, and compute a test statistic. Give me a conclusion in plain English. 

Null: average = 0%

Alternative: average > 0%

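Use a t-test: this is a sample of 10 and the population standard deviation is unknown.

SD+ (the sample SD computed with divisor n - 1) is 2.2332, so the SE of the average is 2.2332 / SQRT(10), or about 0.706.

You should get t = .085 with 9 degrees of freedom.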

No evidence for good times ahead. Do not reject the null.
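Here is a rough sketch of the arithmetic behind the t-test in problem 5. The problem does not state a significance level, so the 5% one-sided cutoff below is an assumption made for illustration; the critical value 1.833 for 9 degrees of freedom comes from a standard t-table.

```python
import math
import statistics

# Percentage change in GDP for the 10 sampled countries (from the table above).
gdp_change = [-0.5, -1.5, -2.1, -2.8, 3.1, -0.4, 1.0, -1.4, 1.2, 4.0]

n = len(gdp_change)
sample_mean = statistics.mean(gdp_change)   # 0.06
sd_plus = statistics.stdev(gdp_change)      # SD+ (divisor n - 1)
se = sd_plus / math.sqrt(n)                 # SE of the sample average

null_value = 0.0                            # long-run expected change
t_stat = (sample_mean - null_value) / se

# Assumption: one-sided test at the 5% level, 9 degrees of freedom.
T_CRITICAL = 1.833
reject_null = t_stat > T_CRITICAL

print(f"SD+ = {sd_plus:.4f}, SE = {se:.4f}, t = {t_stat:.3f}")
print("reject null" if reject_null else "do not reject null")
```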

 

You are in the home of your future mother-in-law and she offers you M & Ms (a type of candy) from an infinitely large bag of M & Ms. Unknown to you (but known to your professor) she is testing you. Green M & Ms are her favorites. For every green one you pick you lose two points. She hates brown M & Ms and you get a point for every one of those you choose. She does not feel strongly about the other colors, so you neither lose nor gain points if you pick them.

This is the distribution of M & Ms in her bag.

| Color | Brown | Red | Yellow | Green | Orange |
| --- | --- | --- | --- | --- | --- |
| Percent | 20% | 20% | 20% | 30% | 10% |

You shove your hand into the bag and grab 25 M & Ms. Treat this like a random sample.

 

6. What is the expected point average for the 25 M & Ms you grabbed?

-0.4 is the average for this "box" (her bag)

7. What is the standard deviation of her bag of M & Ms?

SD = 1.1136

8. What is the standard error for the average of a sample of 25 M & Ms?

.2227
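The answers to problems 6 through 8 all come from treating the bag as a box of point tickets: +1 for brown, -2 for green, 0 for everything else. A minimal sketch of that calculation follows; the dictionary layout and variable names are just one way to set it up.

```python
import math

# Box model for the bag: point value -> chance of drawing it.
# Brown = +1 (20%), green = -2 (30%), red/yellow/orange = 0 (20% + 20% + 10%).
box = {1: 0.20, -2: 0.30, 0: 0.50}

# Problem 6: average of the box (expected points per M & M).
box_average = sum(value * chance for value, chance in box.items())

# Problem 7: SD of the box.
box_variance = sum(chance * (value - box_average) ** 2 for value, chance in box.items())
box_sd = math.sqrt(box_variance)

# Problem 8: SE for the average of 25 draws.
sample_size = 25
se_average = box_sd / math.sqrt(sample_size)

print(f"average = {box_average:.1f}, SD = {box_sd:.4f}, SE = {se_average:.4f}")
```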

9. To get her blessings for your future marriage, you must have an average of at least 0 points after grabbing your 25 M & Ms. If you have an average of more than 0.5 points, she will pay for your honeymoon, but she gets to choose where the two of you go (and she gets to go along...). What is the chance that you will get blessed by your mother-in-law-to-be but manage to keep her out of your honeymoon plans?

The Z score for 0.5 points is about 4.04 and the Z score for 0 points is about 1.80. The area (chance) between them is about 3.6%.
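The area between the two Z scores in problem 9 can be checked with the normal curve in Python's standard library. This is a sketch that assumes the normal approximation is good enough for an average of 25 draws; the box average and SD are taken from problems 6 and 7.

```python
import math
from statistics import NormalDist

box_average = -0.4      # from problem 6
box_sd = 1.1136         # from problem 7
se_average = box_sd / math.sqrt(25)

# Convert the two cutoffs to standard units.
z_blessing = (0.0 - box_average) / se_average     # about 1.80
z_honeymoon = (0.5 - box_average) / se_average    # about 4.04

# Chance the average lands between 0 and 0.5 points.
standard_normal = NormalDist()
chance = standard_normal.cdf(z_honeymoon) - standard_normal.cdf(z_blessing)

print(f"z = {z_blessing:.2f} to {z_honeymoon:.2f}, chance = {chance:.1%}")
```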

 

 

The following comes from the Monday, October 12, 1998 edition of The Daily Bruin:

Decline in voting rate for youths worries groups

By Catherine Turner

A determined Holly Hogan stood in front of Rieber Hall Monday night, attempting to get as many students as possible to register to vote before the deadline that evening.... (many lines deleted)

'Maybe the reason that stats say that the youth doesn't vote is because we have moved away to college and all have a change of address,' Hogan said. (many lines deleted)

Polls collected by the Field Institute during the 1988 presidential election showed that 20 percent of 18 to 29 year olds voted. In 1992, only 18 percent voted. (many lines deleted)

Assume that the Field Institute Poll surveyed 1,225 persons 18 to 29 in 1992. Also assume that the figure of 20% is the stable historical percentage between 1968 and 1988 for all persons age 18 to 29 who voted in those years.

10. Test the hypothesis that the percentage of 18 to 29 year olds who voted declined between 1988 and 1992. State the null and the alternative, perform a test, and state a p-value. Please use a 5% level of significance as your decision rule. Do you believe that there has been a decline in the voting rate between 1988 and 1992?

Null: Voting percentage = 20%

Alternative: percentage < 20%

Standard deviation is SQRT(.2 * .8) (must have this right for full credit)

SE is 1.1429%

Resulting Z score is -1.75.

The p-value is 4%

Reject the null, there is evidence of a decline.
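The steps in problem 10 can be sketched as a one-sample z-test for a percentage. The SD of the 0-1 box is computed from the null value of 20%, as the answer requires; the variable names are only illustrative.

```python
import math
from statistics import NormalDist

null_pct = 0.20       # stable historical voting percentage (null hypothesis)
observed_pct = 0.18   # 1992 Field Institute figure
n = 1225              # assumed sample size

# SD of a 0-1 box under the null: SQRT(.2 * .8) = 0.4
box_sd = math.sqrt(null_pct * (1 - null_pct))
se_pct = box_sd / math.sqrt(n)                 # about 1.1429 percentage points

z = (observed_pct - null_pct) / se_pct         # about -1.75
p_value = NormalDist().cdf(z)                  # one-sided, lower tail

reject_null = p_value < 0.05
print(f"SE = {se_pct:.4%}, z = {z:.2f}, p = {p_value:.1%}, reject = {reject_null}")
```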

11. Congratulations, you've gotten a job as a pollster and your first assignment is to ask 1,225 18-to-29 year olds whether they voted in the most recent election. Let's assume that the 20% figure from 1988 is still the stable long-run voting percentage. What is the chance that your sample of 1,225 will have 17% or fewer of your respondents saying they voted?

Use the same SE as in problem 10. The resulting Z should be -2.625, which translates into about .4%.
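Problem 11 reuses the same SE, so a small helper makes it easy to try other cutoffs as well. The function name `chance_at_or_below` is made up for this sketch, and it relies on the normal approximation.

```python
import math
from statistics import NormalDist

def chance_at_or_below(cutoff_pct, true_pct=0.20, n=1225):
    """Normal-approximation chance that a sample percentage is at or below cutoff_pct."""
    se_pct = math.sqrt(true_pct * (1 - true_pct)) / math.sqrt(n)
    z = (cutoff_pct - true_pct) / se_pct
    return z, NormalDist().cdf(z)

z_17, chance_17 = chance_at_or_below(0.17)
print(f"z = {z_17:.3f}, chance = {chance_17:.2%}")
```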

12. Amstar Corporation sued Domino's Pizza, Inc. claiming that Domino's use of the name "Domino" on its pizza infringed on the Domino Sugar trademark (owned by Amstar Corp.). Both sides presented survey evidence on whether Domino's Pizza, Inc.'s use of the name "Domino" tended to create confusion among consumers.

Amstar surveyed 525 people in 10 cities in the eastern United States (two of the cities had Domino's Pizza outlets). The persons interviewed were women reached at home during the day who identified themselves as the household member responsible for grocery buying. Shown a Domino's Pizza box, 44.2% of those interviewed indicated their belief that the company that made the pizza made other products. 72% of that group (31.6% of all respondents) believed that the pizza company also makes sugar.

Assume that a multistage cluster sampling method was used to draw the sample and that for our purposes, it produced a random sample of 525 people.

Now that you have taken Statistics 10, do you think this is a good study? Answer yes or no and please justify your response. Please be brief, a few sentences will be sufficient.

No calculations are required to answer this question.

 

ANSWER: No. This was taken from an actual case and it's an example of a bad study. I was looking for two basic ideas: 1. Selection bias: somehow only women who were at home during the day got into the sample. There is nothing wrong with the sampling method, which is pretty standard for a multistage cluster, but there seems to be something odd going on with the interview process. 2. Response bias: somehow, women who didn't have Domino's Pizza outlets in their cities (only 2 of the 10 cities had pizza outlets) were able to express beliefs about the pizza company.

13. A survey is carried out by a non-profit organization in Los Angeles to learn more about children's lives in Los Angeles (children are defined as people under age 18). They drew a simple random sample (SRS) of 1000 households. But after several phone calls, the interviewers find that only 710 of the sample households have children. Rather than face such a high non-response rate, the non-profit organization drew a second batch of households at random. They used the first 290 of them with children to bring the first sample up to 1000 households. They counted 2,807 children living in these 1000 households (the 710+290 households) and estimated the average number of children per household in Los Angeles to be about 2.81 or 2.8 kids.

Is this average likely to be (circle one):

(i) too high

(ii) too low

(iii) about right

Explain your choice. Be brief. No calculations are required to answer this question for full credit, but you can include them if you wish.

The average is too high. 1. By drawing a second batch and using the first 290 WITH CHILDREN, they have effectively contaminated what was probably a reasonable sample. There are families without children and they are simply ignoring them in their calculation of an average. 2. Those families which made it into the interview are probably larger families; there is a better chance of catching a larger family at home, other things being equal. Another way to think about this: it is tough to catch a person who lives alone at home, and much easier to catch someone at home if more than one person lives in that house. The result is that the average will be too high.

 

One year, there were about 3,000 institutions of higher learning in the U.S. (including junior colleges and community colleges). As part of a continuing study of higher education, the government took a simple random sample (SRS) of 400 of these institutions. The average enrollment in the 400 sample schools was 3,700 and the SD was 6,500. The government estimates the average enrollment at all 3,000 institutions to be around 3,700; they put a give or take number of 325 around this estimate. Say whether each of the following statements is true or false, and explain. If you need more information to decide, tell me what you need and why you need it.

 

14. An approximate 95% confidence interval for the average enrollment of all 3000 institutions runs from 3,050 to 4,350.

TRUE. The SE is 325 so, the sample average + or - (2 * 325) is exactly the confidence interval given.
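The interval in problem 14 is just the sample average plus or minus 2 SEs. A minimal sketch of that arithmetic, assuming (as the answer does) that the SE is the sample SD divided by the square root of the sample size with no finite-population correction:

```python
import math

sample_size = 400
sample_average = 3700
sample_sd = 6500

# SE for the sample average (no finite-population correction, as in the answer).
se_average = sample_sd / math.sqrt(sample_size)

# Approximate 95% confidence interval: average plus or minus 2 SEs.
ci_low = sample_average - 2 * se_average
ci_high = sample_average + 2 * se_average

print(f"SE = {se_average:.0f}, 95% CI = {ci_low:.0f} to {ci_high:.0f}")
```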

15. About 68% of the schools in the sample had enrollments in the range 3,700 ± 6,500.

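FALSE. You are being asked about the schools in the sample here. The SD of 6,500 is for the sample. You can't make this statement unless you know that the enrollments of the schools in the sample are normally distributed around 3,700. They cannot be: enrollment can't be negative, yet 3,700 - 6,500 is below zero, so the distribution must have a long right-hand tail.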

16. I guess you get 10 points just for attending. :) Have a happy winter break. Best wishes for the future.