Statistics 202A
Background

Computing has always been an essential ingredient of statistical practice. While probability theory provides us with a mathematical foundation for describing data and studying statistical inference, computing technologies act as a medium through which analyses are actually realized. Our ability to manipulate data and to audition new methodologies depends on and is limited by our familiarity with computing technologies. To some extent, even our notion of what constitutes "data" is a product of our background in computing.

Through a series of group projects, we will study tools for "exploratory computing." We will emphasize programming and scripting languages over point-and-click interfaces. We hope to instill a problem-solving ability that lets you learn new languages on your own, whether by culling online documentation and tutorials or by finding books and manuals.

Last Week

In our first meeting, we formed quarter-long work groups. In lecture, we began with Unix and "pipe" basics. We also spelled out our first two homework assignments.

This Week

We will hold extra (voluntary) lab sessions on Tuesdays, led by Ryan Rosario. These will begin this week. In lecture we will cover more Unix basics and so-called regular expressions, a language for expressing patterns in text.

Instructor    Mark Hansen
8951 Mathematical Sciences Building
University of California, Los Angeles
cocteau|@|stat.ucla.edu
www.stat.ucla.edu/~cocteau

Meeting    Thursdays 2:00-5:30
120 LaKretz


Office Hours    Tuesdays and Fridays TBD
(or by appointment)
8951 Mathematical Sciences


Grading   
50% Group projects
30% Short programming and writing tasks
20% In-class participation

Texts    The following books are recommended rather than required, although they will probably prove to be extremely useful references long after the course is over.
  • Unix in a Nutshell, by Robbins

  • Learning Python, by Lutz and Ascher

  • Mastering Regular Expressions, by Friedl

  • Programming with Data, by Chambers

  • S Programming, by Venables and Ripley

  • Processing, site by Reas and Fry
Texts will be added to this list as the quarter progresses.

Resources       A list of computing resources and selected online articles is forming here.

Data    Datasets from lecture will be made available on an ongoing basis. Students are strongly encouraged to try some of the commands, programs, and ideas discussed in lecture using these datasets.

Lecture 1: 1950.txt

Lecture 8: execs.csv

Lectures    Lectures will be posted on an ongoing basis, with hard copies handed out before each lecture.

Lecture 1 (Introduction and Unix basics)
Lecture 2 (Regular expressions, shell scripting)
Lecture 3 (Python introduction)
Lecture 4 (Python II)
Lectures 5 and 6 (Statistical Computing, R)
Lecture 7 (R II)