Department of Psychology

Alliant International University - Los Angeles

FOR 6530
Statistical Design & Research

Spring Semester 2003

Name: ........................

Student ID#: ......................

Section/TA: ......................

Score: .................. / 100

Take-Home Final Exam, due Saturday, April 26, 2003, 9:00 AM

Instructor: Prof. Ivo Dinov

http://www.stat.ucla.edu/~dinov/

Instructions: Please write clearly and diligently. You may use the reverse pages if you need extra space, provided you make a comment on the front page. If this is still not enough, please attach extra paper. What you write must be your own work. You may discuss test problems with peers on a conceptual level; however, this should not evolve into exchanging solutions, printouts, emails, handwritten notes, or any other explicit form of solution sharing. This test has a total of 14 problems. You should calculate by hand any statistics that use 5 or fewer observations. Please use the online SOCR resource for all necessary calculations and for validation:
http://www.stat.ucla.edu/~dinov/courses_students.dir/Applets.dir/OnlineResources.html

1. We wish to compare the marriage counselors from two different clinics with regard to how often couples they have counseled seek a divorce. As many as possible of the couples counseled by each of the counselors are located. The following are the percentages for each counselor of client couples who obtained, or were seeking, a divorce one year after termination of counseling:

University Clinic	62	32	21
Mental Health Associates	43	32	27	35

2. Suppose the gasoline mileages of compact cars for LA driving are normally distributed with a mean of 27.1 mpg and a standard deviation of 2.3 mpg. What proportion of drivers get less than 23 mpg? Less than 30 mpg? Between 25 and 30 mpg? What is the mileage that the bottom 10% of the most economical vehicles get? Draw proper graphs/regions-of-interest for all cases.
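As a cross-check on the table lookups for Problem 2, the normal probabilities can be sketched in Python with the standard library. This is only an illustration, not a substitute for the SOCR applets; the variable names are hypothetical, and reading "bottom 10%" as the 10th percentile of mileage is an assumption about the intended question.

```python
from statistics import NormalDist

# Mileage model stated in Problem 2: Normal(mean = 27.1 mpg, sd = 2.3 mpg).
mileage = NormalDist(mu=27.1, sigma=2.3)

# Left-tail areas and the area between two cut-offs.
p_below_23 = mileage.cdf(23)
p_below_30 = mileage.cdf(30)
p_25_to_30 = mileage.cdf(30) - mileage.cdf(25)

# Assumption: "bottom 10%" is read as the 10th percentile of mileage.
tenth_percentile = mileage.inv_cdf(0.10)

print(f"P(X < 23)      = {p_below_23:.4f}")
print(f"P(X < 30)      = {p_below_30:.4f}")
print(f"P(25 < X < 30) = {p_25_to_30:.4f}")
print(f"10th percentile = {tenth_percentile:.2f} mpg")
```

Each probability corresponds to a shaded region under the normal curve, which is what the requested graphs should show.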

3. For each of the following sets of scores, find the mode, the median, and the mean. Draw box-and-whisker plots. Report the 5-number summaries. Would normal assumptions be valid for any of these data? Would these be considered equal-variance samples? Why is this important? Do the data histograms. Are these data skewed?

Data 1

Data 2

-2

100

Data 3

-1

4. Suppose I selected a random sample of 15 out of the 129 IQ scores of a group of juvenile delinquents at one juvenile correction facility in LA. These scores, listed below, are the only information you are given:

100

We need to use the sample data to construct a 95% confidence interval for the mean IQ score for this specific population. State the parameter of interest (using a symbol and in words). State the estimate of this parameter (using a symbol, in words, and as a number). Calculate the standard error (SE) of the estimate. Explain what it means. Clearly interpret your results. Does the confidence interval contain the true mean IQ score for this population? Why?

5. True or False. Explain!

A highly significant test result means that the size of the difference between the estimated value of the parameter and the hypothesized value of the parameter is significant in a practical sense.

A P-value of less than 0.01 is often referred to as weak evidence against the null hypothesis.

A non significant test result does not necessarily mean H_o is true.

A two-tail test of H_o: μ = μ_o is significant at the 5% significance level if and only if μ_o lies outside a 95% confidence interval for μ.

6. A quality control engineer at the Educational Testing Service (ETS) monitors the automatic exam grading for over 12 million multiple-choice "scantron" papers. Each month, to assess fairness and consistency in grading the exams, the engineer takes a random sample of papers, which are then manually graded, and the results are compared to the scores obtained by the scanners. For the 12 months of 2002 the following discrepancies between the manual and automated grading methods were observed:

Suppose the tolerable expected error rate is 1% and the actual number of manually graded papers in the i^th month is i*100, 1 ≤ i ≤ 12. Is there any evidence that the observed discrepancies are purely due to chance? Or is there a factor whose presence alters the amount of discrepancies? What are the observed and expected discrepancies? Show your work and validate it by using the SOCR resource. Interpret the results.

7. Fill in the blanks:

Testing at the 5% level of significance means that the ___________ hypothesis is rejected whenever a P-value __________ than 5% is obtained.

In general, Central Limit Effects can be detected for sample sizes that are ________ than ____(5, 10, 30, 50, 150)__.

For large sample sizes, the shape of the distribution of the sample mean is always __________ with parameters ______ and ______ regardless of the shape of the distribution of the random variable X.

8. Identify which statements are correct/incorrect. Explain!

In the calculation of the value of the correlation, R, it does not matter which one of the variables is designated as X and which one is designated as Y.

If the sample correlation coefficient equals 1, then there is a perfect linear association between the two variables for these observations.

When sampling, taking a small sample guarantees an accurate estimate of the parameter of interest.

9. We are screening drugs for possible use against cancer. We implant 11 laboratory mice with cancer cells. Five of them selected at random are treated with Theron-P. Two months later, detectable tumors are removed from all 11 animals and weighed. The following are the results (in grams):

Theron-P Group:	1.1	1.5	1.6	1.3	0.9
Control Group:	2.0	1.8	1.4	2.2	1.6	1.6

Set up your hypotheses. Perform the necessary test. Draw final conclusions.
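One way to sanity-check a hand calculation for Problem 9 is an exact permutation test, sketched below. This is a swapped-in technique, not necessarily the test the problem intends (a two-sample t-test or a rank-sum test are other candidates). It assumes a one-sided alternative that Theron-P reduces tumor weight, and all variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Tumor weights in grams, entered as exact fractions to avoid rounding ties.
theron = [Fraction(x) for x in ("1.1", "1.5", "1.6", "1.3", "0.9")]
control = [Fraction(x) for x in ("2.0", "1.8", "1.4", "2.2", "1.6", "1.6")]

pooled = theron + control
observed_sum = sum(theron)

# Under H_o every choice of 5 mice out of 11 is equally likely to be the
# treated group. Because the pooled total is fixed, ranking splits by the
# treated-group sum is equivalent to ranking them by the difference in means.
splits = list(combinations(pooled, len(theron)))

# Assumed one-sided alternative: treated tumors are lighter.
count = sum(1 for s in splits if sum(s) <= observed_sum)
p_one_sided = Fraction(count, len(splits))

print(f"{count} of {len(splits)} splits are at least as extreme")
print(f"one-sided p-value = {float(p_one_sided):.4f}")
```

With only 462 possible splits the enumeration is exact, so no distributional assumption is needed beyond the random assignment described in the problem.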

10. You wish to see if consumers differ in their preference for three different fast-food restaurants. You conduct a survey with 60 participants (discuss how you may want to do this!) and obtain the following frequencies for preference: restaurant A = 30, restaurant B = 18, and restaurant C = 12. Make your inference given this information. What are your conclusions?
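For Problem 10, a chi-square goodness-of-fit check can be sketched as follows. It assumes the null hypothesis of equal preference (20 expected votes per restaurant). The p-value uses the closed form exp(-x/2), which holds only for 2 degrees of freedom; variable names are hypothetical.

```python
import math

# Observed preference counts from the survey of 60 participants.
observed = {"A": 30, "B": 18, "C": 12}

n = sum(observed.values())
expected = n / len(observed)  # equal preference under H_o (assumption)

# Pearson chi-square statistic: sum of (O - E)^2 / E over the categories.
chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1

# For df = 2 the chi-square upper-tail probability is exactly exp(-x / 2).
assert df == 2
p_value = math.exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.4f}")
```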

11. Six chronic hospitalized schizophrenics are randomly assigned to three therapy programs (this, of course, is too small a sample for a practical study). At the end of 3 months each is rated on the normality of his or her behavior in a standardized interview situation. The results are shown below. Express your hypotheses in symbolic form. What is an appropriate statistical test for these data? Calculate your statistics and p-values. Interpret the results.

Behavior Ratings of Six Schizophrenics in Three Different Therapy Programs

Program L	3	7
Program M	4	10
Program N	6	12
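One candidate analysis for Problem 11 is a one-way ANOVA, sketched below to check the arithmetic. With two ratings per program the normality and equal-variance assumptions cannot really be verified, so a rank-based alternative such as Kruskal-Wallis may be equally defensible. The p-value uses a closed form that is valid only when the numerator has 2 degrees of freedom; names are hypothetical.

```python
# Behavior ratings for the three therapy programs.
programs = {"L": [3, 7], "M": [4, 10], "N": [6, 12]}

all_ratings = [r for group in programs.values() for r in group]
grand_mean = sum(all_ratings) / len(all_ratings)

def mean(values):
    return sum(values) / len(values)

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in programs.values())
ss_within = sum((r - mean(g)) ** 2 for g in programs.values() for r in g)

df_between = len(programs) - 1
df_within = len(all_ratings) - len(programs)

f_stat = (ss_between / df_between) / (ss_within / df_within)

# Closed form for P(F > f) when the numerator df equals 2:
# (1 + 2 f / df_within) ** (-df_within / 2).
assert df_between == 2
p_value = (1 + df_between * f_stat / df_within) ** (-df_within / 2)

print(f"SSB = {ss_between}, SSW = {ss_within}")
print(f"F({df_between}, {df_within}) = {f_stat:.4f}, p-value = {p_value:.3f}")
```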

12. Your casual observations suggest that sons, upon maturity, are in general taller than their fathers. A friend, on the other hand, argues that the opposite is true. In order to check this out, you measure the heights of a sample of nine father-son pairs. The following are the results (rounded to the nearest inch):

Pair Index	1	2	3	4	5	6	7	8	9
Son	73	68	66	70	74	68	65	72	69
Father	71	69	63	70	72	69	63	68	70

Apply three non-parametric tests (show work by hand for at least one of these) to test a reasonable hypothesis to address this question.
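As an illustration of one of the three tests for Problem 12, the sign test on the paired differences can be sketched as below. It assumes the one-sided alternative that sons are taller than their fathers and drops the tied pair, which is the usual convention; variable names are hypothetical.

```python
from math import comb

sons = [73, 68, 66, 70, 74, 68, 65, 72, 69]
fathers = [71, 69, 63, 70, 72, 69, 63, 68, 70]

# Paired differences (son minus father); ties carry no sign and are dropped.
diffs = [s - f for s, f in zip(sons, fathers)]
nonzero = [d for d in diffs if d != 0]

n = len(nonzero)
n_pos = sum(d > 0 for d in nonzero)

# Under H_o the number of positive signs is Binomial(n, 1/2).
# One-sided p-value: P(X >= n_pos).
p_one_sided = sum(comb(n, k) for k in range(n_pos, n + 1)) / 2 ** n

print(f"differences: {diffs}")
print(f"{n_pos} positive signs out of {n} non-tied pairs")
print(f"one-sided p-value = {p_one_sided:.4f}")
```

The sign test ignores the sizes of the differences; the Wilcoxon signed-rank test uses them through their ranks.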

13. You are at a banquet for 300 people. Door prizes are to be awarded as follows: one TV, three turkeys, and six symphonic records. Ten of the entrance tickets are randomly selected for these prizes. What is the probability that a particular guest will win the grand prize, the TV? A symphonic record? A door prize of some kind?
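The counting in Problem 13 can be checked with exact fractions. The sketch assumes each of the 300 guests holds exactly one ticket, the ten winning tickets are distinct, and each winning ticket receives exactly one prize; names are hypothetical.

```python
from fractions import Fraction

guests = 300
prizes = {"TV": 1, "turkey": 3, "record": 6}

# By symmetry, a particular guest is equally likely to hold any ticket,
# so the chance of a given prize type is (number of such prizes) / guests.
p_tv = Fraction(prizes["TV"], guests)
p_record = Fraction(prizes["record"], guests)
p_any = Fraction(sum(prizes.values()), guests)

print(f"P(TV) = {p_tv}, P(record) = {p_record}, P(any prize) = {p_any}")
```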

14. The results below represent measures of sociability and peer ratings of sociability, for 10 college students.

Sociability Scale	30	34	35	36	39	39	40	40	42	46
Peer Ratings of Sociability	72	70	76	80	73	79	76	83	85	81

Are these measures correlated? What is the best linear model for the relation between these two measures? Suppose Alex Sample is a student in this college and his sociability score is 32. What would you expect his peer rating of sociability to be? Is this an atypical score? Discuss and interpret all of your findings.
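A minimal sketch of the correlation and least-squares computations for Problem 14, written out from the defining sums so each step can be compared with hand work. It takes "best linear model" to mean the ordinary least-squares line, which is an assumption; variable names are hypothetical.

```python
import math

scale = [30, 34, 35, 36, 39, 39, 40, 40, 42, 46]  # sociability scale (x)
peer = [72, 70, 76, 80, 73, 79, 76, 83, 85, 81]   # peer ratings (y)

n = len(scale)
mean_x = sum(scale) / n
mean_y = sum(peer) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in scale)
syy = sum((y - mean_y) ** 2 for y in peer)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scale, peer))

# Pearson correlation and least-squares line y = intercept + slope * x.
r = sxy / math.sqrt(sxx * syy)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Predicted peer rating for a sociability score of 32.
predicted_32 = intercept + slope * 32

print(f"r = {r:.3f}")
print(f"peer rating = {intercept:.2f} + {slope:.3f} * scale")
print(f"predicted peer rating at 32: {predicted_32:.2f}")
```

A score of 32 lies inside the observed range of 30 to 46, so the prediction is an interpolation rather than an extrapolation.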