EDA in an expanded field of statistics
Announcing a new seminar in the non-professional practices
UCLA Center for Statistical Computing
As a result of our improved abilities to "observe," professional and research practices are becoming increasingly dependent on data and data processing, on drawing conclusions from or in some way adapting to rich flows of measurements taken from the physical or virtual worlds. These professional demands have given rise to a host of new analysis tools, new methodologies and new software for uncovering significant structures in data.

While many of these advances were initiated in industrial or academic settings, we are starting to see the (inevitable?) migration of these technologies from labs and specialized deployments into widespread usage by the general public. The most obvious case in point is the trajectory followed by Geographic Information Systems: powerful mapping and overlay tools are available on a variety of convenient platforms and have been quickly taken up by non-specialists and applied effectively for social, political and cultural ends. The same can be said for database technology, with new Web sites and services like Dabble DB and Swivel offering powerful, exceedingly user-friendly tools for storing, manipulating and, importantly, sharing data; or Many Eyes, a site that offers a kind of "social data analysis" by making relatively sophisticated graphical tools easily available and applying a social network model to encourage interaction around the displays.

We should emphasize that this migration is not purely a "server-side" phenomenon, one that affects only storage and analysis tools. Powerful observation technologies, data collection platforms, are already in the hands (and pockets) of millions of Americans. The mobile phone network represents a sensing system with billions of "nodes" globally, capable of capturing text, audio, images and video. Mobile phone manufacturers are busy extending the sensing capabilities of these devices.
In parallel, advances in academic sensor network research will soon provide the public with a range of affordable, easy-to-use, low-power observing systems.

An announcement

Just over 30 years ago, John Tukey literally wrote the book on Exploratory Data Analysis (EDA). In a way that Tukey could not have imagined, data collection and analysis technologies are moving quickly into the public realm, creating a new kind of statistics, one that has emerged without the obvious involvement of statisticians. As information technologies have brought "the network" into our homes and personal spaces, new kinds of non-professional data collection and analysis practices have developed, practices that invite participation and data sharing. In this expanded field of statistics, EDA is transformed.

In the 2008-2009 academic year, we will be sponsoring a seminar as well as a lecture series to examine how and why non-statisticians are grappling with the effects of large, complex data flows, and the implications (both technological and ethical) of their work. These events will build on our experiences with "Site-Specifics," a seminar offered in 2005-2006 and 2006-2007 that investigated the effects of data and data processing by studying specific places within Los Angeles. While Site-Specifics focused mainly on professional applications (healthcare, education, environmental management), the new lecture series will examine the collection, presentation and discussion of data in the public sphere. Our ultimate goal is a book that will describe the "best practices" of a new EDA.