Statistics 202A
Background

Computing has always been an essential ingredient of statistical practice. While probability theory provides us with a mathematical foundation for describing data and studying statistical inference, computing technologies act as a medium through which analyses are actually realized. Our ability to manipulate data and to audition new methodologies depends on and is limited by our familiarity with computing technologies. To some extent, even our notion of what constitutes "data" is a product of our background in computing.

Through a series of group projects, we will study tools for "exploratory computing." We will emphasize programming and scripting languages over point-and-click interfaces. We hope to instill a problem-solving ability so that you can learn new languages on your own, cull online documentation and tutorials, and find the books and manuals you need.

Upcoming Events

9/26    R Bootcamp, Department Orientation
9/30    Center for Statistical Computing Open House

This Week

In our first meeting, we will form quarter-long work groups; we will begin with Unix and "pipe" basics. We will hold extra (voluntary) lab sessions on Fridays for the first three weeks of the course to help students get acclimated to their computing environment.

Projects

Below we have a brief description of the first three projects for the course. A more complete listing of "deliverables" will be made as the course progresses (and the actual assignments are made). Each should take about two weeks, and each is a group project.

[ Map ] Exploring geographic data
[ Connect ] Wireless mobility on the Dartmouth campus
[ Sent ] Following Enron's e-mail traffic from boom to bust
[ Scan ] Consider the "embedding" of R on a mobile robot that explores the environment, taking measurements of various quantities (light, temperature, pressure, CO2)
[ Live ] 1 year, 100 students, 350,000 hours of continuous data

Instructor    Mark Hansen
8951 Mathematical Sciences Building
University of California, Los Angeles
cocteau|@|stat.ucla.edu
www.stat.ucla.edu/~cocteau

Meeting    MW 4:00-5:20
A25 Haines Hall


Office Hours    Tuesday and Thursday TBD, Friday 2-4
(or by appointment)
8951 Mathematical Sciences


Grading   
20% Class participation
80% Projects and in-class presentations

Syllabus    [ PDF | HTML ]

Texts    The following books are recommended rather than required, although they will probably prove to be extremely useful references long after the course is over.
  • Unix in a Nutshell, by Robbins

  • Programming Perl,
    by Wall, Christiansen, Orwant

  • Learning Perl,
    by Schwartz and Phoenix

  • Mastering Regular Expressions, by Friedl

  • Programming with Data, by Chambers

  • S Programming,
    by Venables and Ripley

  • Processing,
    site by Reas and Fry
Texts will be added to this list as the quarter progresses.

Resources       A list of computing resources and selected online articles is forming here.

Data    Datasets from lecture will be made available on an ongoing basis. Students are strongly encouraged to try some of the commands, programs, and ideas discussed in lecture using these datasets. Data for each project are available from that project's site.

Lectures    Lectures will be posted on an ongoing basis, with hard copies handed out before each lecture.

Lecture 1 (Introduction)
Lecture 2 (Unix basics -- corrected)
Lecture 3 (Regular expressions)
Guest Lecture: Neal Richman, Urban Planning, UCLA
Lecture 4 (Perl introduction -- corrected)
Lecture 5 (Perl and Judge Roberts -- corrected)
Lecture 6 (Perl wire tapping)
Lecture 7 (Dartmouth wireless project)
Lecture 8 (Wireless traces, Reference)
Lecture 9 (Final Perl lecture -- corrected)
Lecture 10 (A first look at R)
Lecture 11 (and a second look)
Lecture 12 (End of the thermal mapper)
Lecture 13 (More data manipulation in R)
Lecture 14 (Functions in R, the return of John Roberts)
Lecture 15 (Object oriented programming in R)
Lecture 16 (Debugging, packages, software licenses)
Lecture 17 (Introduction to databases)
Lecture 18 (Art and the relational model)
Lecture 19 (Processing)