Economics 40/Statistics M11 Lab 5: Scatterplots & Regression

Due at the start of your final, December 17, 1999

Purpose: The purpose of this lab is to use Excel to generate a scatterplot, run a regression and test a hypothesis.

Data: The movies and the colleges data sets, available from the web page at:

http://www.sscnet.ucla.edu/99F/econm40-1/

or

http://www.stat.ucla.edu/~vlew/stat11/datasets.html

Please do not procrastinate, get the data as soon as possible just in case the web server is not working.

ASSIGNMENT

You are Excel professionals now, or at least better with this program than your Econ 40/Stat 11 professor. I will only give you minimal guidance here. I think you are at the point where you do not need my screen captures any more. You can get help from others, but I want to see what you can produce if this was a real work assignment from your supervisor and not a lab. I will personally grade this lab and keep the best ones for my records.

  1. There are two datasets, one for movie receipts and video sales of the movies. The second is a college and university dataset on SAT scores and costs of attendance. A complete assignment means you will only need to work with the movie receipts and video sales data. If you wish to guarantee yourself a "+" grade on this lab, you can do some extra work with the college and university data (it is not required to receive a "+", but will guarantee a "+" on this lab)
  2. To complete the assignment, you will need to create a scatterplot of the movies data set. You can do this using the CHART WIZARD and selecting the "XY (Scatter)" type of graph. Please calculate a correlation (you can use the function CORREL or look under TOOLS: DATA ANALYSIS for "CORRELATION") for the movie gross and videos sold and give a verbal description of the plot.
  3. You will also need the TOOLS:DATA ANALYSIS commands one last time, this time choose "REGRESSION". Fill in the appropriate information, you only need to give Excel the appropriate range for the Y variable and the X variable. For this assignment, you do not need to concern yourself with any of the other options (e.g. residuals, plots, just forget it)
  4. For the movies and video sales database, what follows is the background information. A company that has distribution rights to home video sales of previously released movies would like to be able to estimate the number of video units that it can be expected to sell. For the 30 movies in the database, please answer the following questions:
  5. (a) use the least squares regression method to find the slope and intercept of a line which regresses sales of video units on the original box office gross

    (b) write out the regression equation using the information from the data analysis

    (c) interpret the meaning of the slope and intercept in the context of these two variables

    (d) suppose a movie has a gross of 20 million dollars, what is the predicted video sales for that movie?

    (e) suppose a movie has a gross of 100 million, what is the predicted video sales for that movie (can this/should this be done)?

    (f) Perform a Z-test to test the hypothesis that the slope is equal to zero, what is your conclusion?

  6. For the colleges database, this is a multiple regression situation. Please regress the annual total cost of attending the college/university on the average total SAT score and room and board. Here you have two X variables and one Y variable. For the 80 schools in the database, please answer the following questions:

(a) use the least squares regression method to find the slopes and the intercept of a plane which regresses annual total cost of attending on the average total SAT score and room and board

(b) write out the multiple regression equation using the information from the data analysis

(c) interpret the meaning of the slopes and the intercept in the context of these the variables used

(d) suppose a school has an average total SAT score of 1000, what is the predicted annual total cost for a for that school (assume its room and board are right on the average living costs for the 80)?

(e) UCLA's total SAT average is about 1240 and it has an average room and board of $6,490, what is the predicted annual total cost for a for UCLA? How does that compare with what you or your family is paying on an annual basis? I don't want to know exactly what you're paying/costing, but is the predicted total cost estimate too low/too high/about right?

(f) Perform a Z-test to test the hypotheses that the slopes are equal to zero, what is your conclusion?

SUMMARY

A complete assignment requires the following:

  1. One scatterplot of the movies and video database.
  2. A verbal description of the shape, direction and strength of its scatterplot
  3. Write or print out the equation and answer the questions in Part 4 above.

To absolutely guarantee a "+" grade on this lab, do Part #5 above. No scatterplot is necessary, just answer the questions in Part 5 after performing a regression analysis.