Stats 120

Winter 2004



Instructor: Robert Gould
email: rgould@stat.ucla.edu
Phone: 310-206-3381 (x3381 from on-campus)
Office: MS 8945
Office Hours:  Monday 3-4 and by appointment

This course will focus on using linear regression.  This is an applications-oriented course, and we will not be concerned with exploring the theory of the Linear Model.  However, we are interested in examining how well the linear model fits real-world processes, and examining the differences between reality and the mathematical model will lead to discussions about the mathematical theory.  For the most part, though, our concern will be fitting models to real data and learning techniques and strategies for building linear models.

This is not a course in mathematical statistics, and so there will be few proofs, and you will not be expected to prove theorems mathematically.  However, you will need to have enough mathematics under your belt to be comfortable working with equations and abstract notation.  In particular we will do a fair amount of work with matrix notation.

Required Text: Data Analysis and Graphics Using R, John Maindonald and John Braun. 
Recommended Text: Applied Regression Including Computing and Graphics, Dennis Cook and Sanford Weisberg.

Required Software: The software we will use is called R. You can download it here.  I chose the textbook primarily because it provides a good introduction to R, including instructions on where and how to get your own copy.  R is installed in the Stats Department labs and, best of all, is free, so you can install it on your home computer.  Read Chapter 1 of the Maindonald text for details.  Note: R is a free implementation of the S language, the same language used by the rather expensive S-PLUS.  Most of what you read about S-PLUS will also be true for R.

FINAL EXAM
    DIRECTORY OF STUDENT DATA SETS
    ALCOHOL DATA
Midterm Solutions
Homework
Data Sets
Outline
Handouts

Datasets Project


Along with your first midterm, you are being asked to turn in three datasets.  The "rules" are posted here.  Many of you have asked for further clarification, so here is some (I hope).

For your final, you will be assigned someone else's data to analyze.  So the data you submit should
a) be analyzable -- this means having enough points that we can make sense of it
b) have a "story":  What can we learn? What patterns should we look for?
c) have a context:  How did you collect these observations?  How can we know they were independent? Random samples? Think of the assumptions needed to do a statistical analysis, and try to tell us enough about the data so that we can answer these questions.

After I get your datasets, I will return them to you with comments for changes, and we will need final versions two weeks later.