This project is about characterizing change, about expressing
how data evolve over time. Working in groups, students will
collect data for approximately one month.
Each group will eventually design an automated collection
system that will "scrape" data of some form from the Web. Groups
are encouraged to be creative about their choice of data, with the
only restriction being that the data must relate to the upcoming
Presidential election. While data collection will initially proceed manually,
we will very quickly begin to provide students with the tools to
both collect and process data in an automated fashion.
Because groups will eventually automate their processing, I would
not encourage students to spend a lot of time on their manual
analyses. The goal of this exercise is not to make the (obvious) point that
a computer can conduct analysis faster than a human can, but to show that
computing can change what we view as data (perhaps equally obvious, but
certainly less appreciated).
It is also important that students get a sense of what it means
to "own" a data feed and to feel the daily pressure to keep it current.
At the end of the quarter, each group will present to the class the
data they chose and the systems they developed to collect, store and process the
data. They will also present an analysis of the data, indicating
how characteristics of the feed changed over time, and in response
to events like the scheduled debates or (unforeseen) incidents in
the U.S. and abroad. Grading will be based on the decisions the students
made while designing their system and on the analysis of the culled
data.
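
As a rough illustration of what "collect, store and process" might
look like for one very simple feed, the Python sketch below counts how
often a term appears in the visible text of a saved page, appends a
dated observation to a CSV file, and computes day-over-day changes.
Everything in it is an assumption made for illustration: the function
names (count_mentions, append_observation, daily_changes), the choice
of term counts as the data, and the example dates and numbers. Fetching
the page is left out, and groups are free to design something entirely
different.

```python
import csv
import io
from datetime import date
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of a page, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_mentions(html, term):
    """Collect: count case-insensitive occurrences of term in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return text.count(term.lower())


def append_observation(out, day, term, count):
    """Store: write one dated observation as a CSV row to an open file."""
    csv.writer(out).writerow([day.isoformat(), term, count])


def daily_changes(rows):
    """Process: given (day, count) pairs sorted by day, return the
    difference between each day's count and the previous day's."""
    return [(d2, c2 - c1) for (d1, c1), (d2, c2) in zip(rows, rows[1:])]


# Made-up snapshot standing in for a page a group might save each day.
snapshot = (
    "<html><body><p>Debate tonight. The debate starts at 9.</p>"
    "<script>var debate = 1;</script></body></html>"
)
mentions = count_mentions(snapshot, "debate")

# An in-memory buffer stands in for a CSV file opened in append mode.
log = io.StringIO()
append_observation(log, date(2024, 10, 1), "debate", mentions)

# Made-up counts for three consecutive days.
history = [("2024-10-01", 3), ("2024-10-02", 7), ("2024-10-03", 5)]
changes = daily_changes(history)
```

Storing one dated row per observation keeps the raw feed separate from
any later analysis, so the processing step can be rewritten without
collecting the data again.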