Review Materials for Final

 

Coverage: Required Chapters 1.1-1.3, 2.1-2.4, 3.1-3.4, 4.1-4.4, 5.1-5.2, 6.1-6.3

                 Optional Chapters for extra credit 10.1, 11.1

 

Suggested Extra Problems from your textbook:

 

Chapter 2.1-2.4

2.1, 2.3, 2.7, 2.21, 2.23, 2.25, 2.26, 2.33, 2.43, 2.47, 2.68

 

Chapter 10.1 (for extra credit only)

10.6, 10.11, 10.33, 10.37

 

Chapter 11 (for extra credit only)

11.1 (b-e), 11.9 (a, e)

 

Any previously assigned extra problems from the previous review sheets

 

Final Details

The final is worth 120 points spread over 7 questions.  The eighth question is 10 points of extra credit. The breakdown is approximate since I have not written the final yet:

Chapter 1.1-1.3  10 points

Chapter 2.1-2.4  20 points

Chapter 3.1-3.4  10 points

Chapter 4.1-4.2  10 points

Chapter 4.3-4.4  20 points

Chapter 5.1-5.2  20 points

Chapter 6.1  10 points

Chapter 6.2-6.3  20 points

Extra Credit for Chapter 10.1/11: 10 points

Bring a calculator, writing instruments, identification, and a formula sheet (both sides allowed).  You are not allowed to eat during the final; however, you may bring something to drink (non-alcoholic).

I am not allowed to reveal grades via e-mail or phone.  If you want to know yours as soon as possible, leave a grade card with me.  Otherwise, the grades will be posted in a timely manner.

What follows are 11 problems.  The final is not this long; this is just extra practice.  Good luck.

1.         A poll on women's issues interviewed 1,025 women and 472 men randomly selected from the United States. The poll found that 47% of the women said they do not get enough time for themselves.

(a)        Construct an exact 90% confidence interval for the percentage of women who say they do not get enough time for themselves.

 

 

 

(b)        Your friend is taking Statistics 11 next year. Unfortunately, they are enrolled at 8am with Professor Lew, and they ask you for help all of the time. Explain to your friend why we can't just say that 47% of all adult women in the U.S. do not get enough time for themselves.
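One way to check your work on 1(a) is sketched below. It assumes the usual large-sample (normal approximation) interval for a proportion, which may or may not be what "exact" is meant to signal in the question, and the variable names are only illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_hat = 0.47   # sample proportion of women from the poll
n = 1025       # number of women interviewed
conf = 0.90    # confidence level asked for in part (a)

# Critical value for a two-sided 90% interval (about 1.645)
z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)

# Standard error of the sample proportion (assumes an SRS and large n)
se = sqrt(p_hat * (1 - p_hat) / n)

margin = z_star * se
lower, upper = p_hat - margin, p_hat + margin
print(f"90% CI: ({lower:.4f}, {upper:.4f})")
```

The interval is for the women only; the 472 men in the poll play no role in this calculation.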

 

 

 

 

 


2. The following comes from a recent article in The Wall Street Journal:

Generation X-ers Aren't Relying On the Survival of Social Security

BY JOHN SIMONS

According to the most recent Wall Street Journal/NBC News poll, only 39% of X-ers believe that Social Security will still be able to provide benefits when they retire. That compares to recent surveys of all Americans which show that 45% think so.

Assume that 45% figure is a stable, long-run, historical fact about American beliefs about Social Security benefits. Also assume that the survey of Generation X-ers had 121 respondents.

A.        Test the hypothesis that the belief that Social Security will still be able to provide retirement benefits has decreased over time. State the null and alternative hypotheses, perform a test, and state a p-value. Please use a 5% level of significance as your decision rule. On the basis of your test results, do you think that Generation X-ers are like other Americans in their beliefs about Social Security or are they different?

 

 

 

 

 

 

 

 

 

B.         Suppose in a few years the Wall Street Journal decided to replicate this study (i.e., draw a new sample) on Generation Y-ers (that's you all, I think...). Let's assume that the 39% figure is now the stable, long-run fact about belief in Social Security benefits by Americans.

What is the chance that a sample of 64 will have at least 30% of the surveyed Generation Y-ers believing in Social Security?
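If you want to check your arithmetic on problem 2, here is a rough sketch. It assumes a one-sided, one-sample z test for a proportion in part A and a normal approximation to the sampling distribution of the sample proportion in part B; the variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal, mean 0 and SD 1

# Part A: H0: p = 0.45 versus Ha: p < 0.45, using the X-er sample
p0, p_hat, n = 0.45, 0.39, 121
se_null = sqrt(p0 * (1 - p0) / n)      # standard error computed under H0
z_a = (p_hat - p0) / se_null
p_value = std_normal.cdf(z_a)          # lower-tail area for a "decrease" alternative
reject_at_5_percent = p_value < 0.05

# Part B: P(sample proportion >= 0.30) when the true p is 0.39 and n = 64
p_true, n_b, cutoff = 0.39, 64, 0.30
se_b = sqrt(p_true * (1 - p_true) / n_b)
z_b = (cutoff - p_true) / se_b
chance_at_least_30 = 1 - std_normal.cdf(z_b)

print(f"A: z = {z_a:.3f}, p-value = {p_value:.4f}, reject H0: {reject_at_5_percent}")
print(f"B: z = {z_b:.3f}, chance = {chance_at_least_30:.4f}")
```

Note that part A uses the null value 0.45 in the standard error, while part B uses 0.39 because that is the value assumed to be true there.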

 

 

 

 

 

 

 

 

3.     A simple random sample of 100 stocks was drawn from the entire market. The average return was 13%, and the SD was 6%; furthermore, the distribution of percentage returns in the sample was close to normally distributed.

 

Based on these data, is it possible to construct a 95% confidence interval for the percentage of stocks in the market as a whole that had percentage returns greater than 20%?

 

Please answer yes or no.  If you answer yes, please construct a confidence interval and please explain why this is possible to do.  If you answer no, please explain why it is inappropriate to construct a confidence interval given this information.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

4. 1996 was a particularly good year for the stock market.  For 1996 as a whole, the mean return on all common stocks on the NYSE (New York Stock Exchange) was μ = 16.4%.  The standard deviation was about σ = 36%.  Assume the distribution of annual returns is roughly normal.

 

(a)    Suppose you selected 9 stocks at random from the NYSE stocks in 1996.  What are the expected value (mean) and standard deviation of the returns of randomly chosen portfolios of 9 stocks?

 

 

 

 

 

 

 

 

 

 

(b)   What percentage of such portfolios of 9 stocks will lose money (i.e. have returns of zero or less)?
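A quick sketch for checking problem 4 follows. It assumes a portfolio's return is the simple average of its 9 stock returns and that the 9 stocks behave like an independent random sample, so the sampling distribution of the mean applies; the names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu = 16.4      # mean return on all NYSE common stocks in 1996, in percent
sigma = 36.0   # standard deviation of returns, in percent
n = 9          # stocks per portfolio

# (a) Mean and standard deviation of the average return of 9 stocks
portfolio_mean = mu
portfolio_sd = sigma / sqrt(n)

# (b) Share of such portfolios with a return of zero or less
share_losing = NormalDist(portfolio_mean, portfolio_sd).cdf(0.0)

print(f"mean = {portfolio_mean}%, SD = {portfolio_sd}%")
print(f"share of portfolios losing money = {share_losing:.4f}")
```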

 

 

 

 

 

 

 

 

 

 

 

5.  Here are two statistics on all persons who consider themselves investment bankers in 1997:

 

$820,000 per year

$141,000 per year

 

Which one of these numbers is the mean salary from investment banking and which one is the median salary from investment banking in 1997?  Assume the samples were of good quality.

 

 

The mean is _______________________________

 

 

 

The median is ______________________________

 

Explain your choice in the space below.  Be brief.  This is not a long answer.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


6. You got a job at an insurance company.  The company advertises that it processes 90% of its claims on time, that is, within 5 working days of initial receipt. The average processing time of all claims is 2.7 days with a standard deviation of 0.6 days. You have been asked to select and audit a simple random sample (SRS) of 42 of the tens of thousands of claims received in the past month. The audit reveals that 37 of the 42 claims were processed within 5 working days with an average of 2.9 days and a standard deviation of 1.1 days.

 

(a)  What are the mean and standard deviation for the number of claims processed on time?

 

 

 

 

 

 

 

 

 

 

(b)  An angry client thinks the processing time of 2.7 days is a lie and in reality, the company is slower than it advertises.  Please test the hypothesis that the true processing time is 2.7 days against the client's alternative.  Is there sufficient evidence to reject the null at alpha=.05?

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


7. Here is some Stata output:

 

. regress  totgross  openday

 

  Source |       SS       df       MS                  Number of obs =     907
---------+------------------------------               F(  1,   905) =  278.78
   Model |  4.9206e+17     1  4.9206e+17               Prob > F      =  0.0000
Residual |  1.5974e+18   905  1.7651e+15               R-squared     =  0.2355
---------+------------------------------               Adj R-squared =  0.2347
   Total |  2.0895e+18   906  2.3062e+15               Root MSE      =  4.2e+07

------------------------------------------------------------------------------
totgross |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
 openday |   23219.24   1390.652     16.697   0.000       20489.96    25948.51
   _cons |   714478.1    2426230      0.294   0.768       -4047214     5476170
------------------------------------------------------------------------------

 

totgross is the total gross receipts for a movie

openday is the number of screens on which the movie was shown on its first day of viewing

 

a.              Write out the regression equation estimated by Stata.  Identify the slope and intercept clearly.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

b.              What is the interpretation of the coefficient for the variable openday?  Please state it in plain English and include how it relates to totgross.

 

 

 

 

 

 

 

 

 

 

 

 

c.              What is the correlation for totgross and openday?  Please interpret its value in terms of both strength and direction.

 

 

 

 

 

 

 

 

 

 

d.              A movie titled "How the Grinch Stole Christmas" opened on 3,127 screens across the United States on November 17, 2000.  From this information, can you give me an estimate of its total gross receipts?  Answer yes or no and then give me a numerical solution.

 

 

 

 

 

 

 

 

 

 

 

 

e.              Suppose I told you that a movie titled "Miss Congeniality" had total gross receipts of $103,271,534.  From this information, can you tell me how many screens it was being shown on its opening day of December 22, 2000?  Answer yes or no.  If you answer yes, please give a numerical solution and show your work.  If you answer no, please explain why this is not possible.
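For parts (a), (c), and (d), a small sketch like the one below may help you check your numbers. It just plugs the coefficients from the Stata output into a fitted line; the function and variable names are made up, and it says nothing about part (e), which asks a different kind of question.

```python
from math import sqrt

# Coefficients copied from the Stata output above
intercept = 714478.1   # _cons
slope = 23219.24       # coefficient on openday
r_squared = 0.2355     # R-squared from the output

def predict_totgross(screens):
    """Predicted total gross receipts for a movie opening on `screens` screens."""
    return intercept + slope * screens

# (c) In simple regression the correlation is the square root of R-squared,
# carrying the sign of the slope (positive here).
r = sqrt(r_squared) if slope > 0 else -sqrt(r_squared)

# (d) Predicted total gross for a movie that opened on 3,127 screens
grinch_estimate = predict_totgross(3127)

print(f"r = {r:.4f}")
print(f"predicted totgross = {grinch_estimate:,.2f}")
```

Keep in mind that with an R-squared of about 0.24, a point prediction like this comes with a lot of scatter around the line.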

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


8.  High Bias and High Variance are both considered undesirable features of certain sample statistics (such as a sample mean).  You are working with a team on a marketing study, and a sample of size 100 is drawn.  One of the variables you are interested in is the average time spent on the internet on any day.  You plan to construct confidence intervals and perform some unspecified hypothesis tests. Studies always have problems, and today you have your choice: High Bias or High Variance.  Which one would you rather deal with and why?

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


9. You work for an investment bank and you are evaluating two companies A and B.  Both of them are in biotechnology and each company has invested a significant amount of money in a number of projects.

 

Here is some information on the two companies:

 

Company A currently has 10 projects under development; each project cost the company $100,000.  Suppose that, when completed, each project has the following distribution: a 50% chance that the project will be worth $500,000, a 20% chance that it will be worth $100,000, and otherwise it will be worth nothing.

 

Company B currently has 20 projects under development; each project cost the company $200,000.  Suppose that, when completed, each project has the following distribution: a 60% chance that the project will be worth $400,000, and otherwise it will be worth nothing.

 

Assume the projects are independent.  Once the projects are completed, their total value, with the price of the initial investment factored in, will represent the total net worth of each company.

 

(a)   What are the expected net worths for Company A and Company B?

 

 

 

 

 

 

(b)  What are the Standard Deviations for Company A and for Company B?

 

 

 

 

 

 

 

(c)  Your bank is trying to determine whether a merger of the two companies is worth the time and expense.  They have called you in to make the call. Here is the information you need: the bank thinks the merger should occur if there is a greater than 25% chance that the combined net worth of the companies exceeds $5,000,000.  Answer “merge” or “do not merge” and show the work that led you to your decision.
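Here is one possible way to set up problem 9, offered as a sketch rather than the official solution. It assumes "net worth" means each project's completed value minus its cost, and for part (c) it assumes the sum over all 30 independent projects is roughly normal, which is only an approximation. The helper function and names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def net_mean_and_var(outcomes, cost):
    """Mean and variance of one project's net value (value minus cost).

    `outcomes` is a list of (value, probability) pairs.
    Subtracting the fixed cost shifts the mean but not the variance.
    """
    mean_value = sum(p * v for v, p in outcomes)
    var_value = sum(p * (v - mean_value) ** 2 for v, p in outcomes)
    return mean_value - cost, var_value

a_outcomes = [(500_000, 0.5), (100_000, 0.2), (0, 0.3)]
b_outcomes = [(400_000, 0.6), (0, 0.4)]

a_mean, a_var = net_mean_and_var(a_outcomes, cost=100_000)
b_mean, b_var = net_mean_and_var(b_outcomes, cost=200_000)

# (a), (b): independent projects, so means and variances both add
mean_a, sd_a = 10 * a_mean, sqrt(10 * a_var)
mean_b, sd_b = 20 * b_mean, sqrt(20 * b_var)

# (c): normal approximation for the combined net worth (an assumption)
combined = NormalDist(mean_a + mean_b, sqrt(sd_a**2 + sd_b**2))
p_exceeds = 1 - combined.cdf(5_000_000)
decision = "merge" if p_exceeds > 0.25 else "do not merge"

print(f"A: mean = {mean_a:,.0f}, SD = {sd_a:,.0f}")
print(f"B: mean = {mean_b:,.0f}, SD = {sd_b:,.0f}")
print(f"P(combined > $5,000,000) = {p_exceeds:.4f} -> {decision}")
```

If you read "net worth" differently (for example, ignoring the initial cost), the means change but the standard deviations do not.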

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


10.   Over his career, a basketball player has made 1,210 free throws and missed 214.  The basketball player is about to shoot the ball again. What is his estimated probability of making a free throw (i.e., getting the ball into the basket)?

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

11.   A manufacturer of “zip” floppies advertises that 95 out of 100 floppies will have no problems.  Suppose you buy a pack of 16 from a discounter and find that one fails to work.  If the manufacturer's claim is true, what is the probability that one or more floppies fail in a pack of 16?
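A short sketch for checking problem 11 is below. It assumes the 16 floppies fail independently of one another, each with the advertised 5% chance of a problem, so the complement rule applies; the names are illustrative.

```python
p_ok = 0.95      # advertised chance that a single floppy has no problems
pack_size = 16   # floppies in the pack

# P(no failures in the pack) under independence
p_all_ok = p_ok ** pack_size

# Complement rule: P(one or more failures) = 1 - P(no failures)
p_at_least_one_failure = 1 - p_all_ok

print(f"P(at least one failure) = {p_at_least_one_failure:.4f}")
```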