Data Sets

Blood Pressure Data from Cox & Snell, Applied Statistics: Principles and Examples, Example E.
    I have taken the liberty of making slight variations of the systolic blood pressures.  These are fabricated data intended to illustrate some "what if" scenarios.
    Variation I
    Variation II

Cut Shoots: from Chatfield: Two groups of plants were measured for the concentration of some chemical.  One group (coded "1" in the data set) consists of cut shoots.  The other (coded "2") consists of rooted plants.  Is there a difference in the chemical concentration? Data are tab delimited.  The first column indicates whether the plant is a cut shoot (1) or rooted (2). The second column is the chemical concentration.
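A minimal sketch of reading a two-column, tab-delimited file like this one and computing the mean concentration in each group. The function name group_means and the sample values are my own inventions for illustration; the numbers are NOT the Chatfield data. Comparing the means is only a first look, not a formal test.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def group_means(tab_text):
    """Return {group_code: mean concentration} from two-column tab-delimited text."""
    values = defaultdict(list)
    for row in csv.reader(io.StringIO(tab_text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        try:
            group, concentration = int(row[0]), float(row[1])
        except ValueError:
            continue  # skip a header line, if the file has one
        values[group].append(concentration)
    return {group: mean(vals) for group, vals in values.items()}


# Made-up values for illustration only (1 = cut shoots, 2 = rooted plants).
sample = "1\t10.0\n1\t12.0\n2\t20.0\n2\t22.0\n"
means = group_means(sample)
print(means)
```

To use it on the real file, read the file's contents into a string and pass that string to group_means.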

Florida County Votes:  Returns (before the recount) county by county for each candidate.  Also available as a Stata file.

More Florida County Data: In the first (machine) recount, each county fed the ballots through the machines a second time.  For a given candidate, the result in each county is either +1 (the candidate's votes increased in the recount), 0 (no change), or -1 (votes went down).  Results are for Gore and Bush only.
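The +1/0/-1 coding described above is just the sign of the change in a candidate's vote count. A small sketch, assuming you have the original and recounted totals for a county (the function name recount_code and the vote counts below are hypothetical, not taken from the data):

```python
def recount_code(original_votes, recount_votes):
    """Return +1 if votes increased in the recount, 0 if unchanged, -1 if they went down."""
    diff = recount_votes - original_votes
    # (diff > 0) and (diff < 0) are booleans; their difference is the sign of diff.
    return (diff > 0) - (diff < 0)


# Hypothetical counts, for illustration only.
print(recount_code(1000, 1003))  # increase
print(recount_code(1000, 1000))  # no change
print(recount_code(1000, 998))   # decrease
```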

Average Brain Weights  and explanation file.

Ozone in LA and explanation file.

Twins data: Data from a study that compares twins to examine the relation between education and income.

Alcohol Data from NHANES first wave (1971).  Random sample from US.  Includes (1) Did you have an alcoholic drink in the past 12 months (y/n), (2) Age class and (3) Education Class (higher numbers represent more education).  Data are SPACE DELIMITED.

Lead Data:  Blood lead levels from a study of 33 children whose parents work around lead and 33 control children. Data are TAB DELIMITED.

Teaching Methods:  Reading scores for 45 students, each randomly assigned to one of 5 groups.  Groups A and B: normal teaching method.  Group C: praised.  Group D: reproved.  Group E: ignored.  DATA ARE TAB DELIMITED.   Are some methods better than others? Or are all about the same?
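One common way to approach the "are all about the same?" question is a one-way analysis of variance. The sketch below computes the F statistic by hand; it is one possible approach, not necessarily the analysis intended for this data set, and the function name one_way_f and the scores are made up for illustration. With 5 groups and 45 students, the degrees of freedom would be 4 and 40.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of lists of scores."""
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up scores for three small groups, for illustration only.
f_stat, df_b, df_w = one_way_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(f_stat, df_b, df_w)
```

A large F relative to the F distribution with those degrees of freedom suggests the group means are not all equal; it does not say which groups differ.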

Seed Germination:  To study the effect of moisture levels on seed germination, at each of 6 moisture levels (rated 1-11 on a non-linear scale), 8 boxes of 100 seeds were sown.  Half of the boxes had lids put on to hold in moisture. At the end of two weeks, the number of germinated seeds was recorded.  TAB DELIMITED.  Does the level of moisture make a difference?

Anesthetics: Four different anesthetics were tested (labelled A,B,C,D) for patients randomly assigned.  The time, in minutes, from reversal of anesthetic until the  eyes opened was recorded for each patient.   Are there differences among the anesthetics?

More Alcohol Data from NHANES first wave.  The first column is something called "qfi" which is an index of alcohol frequency. Roughly translated, the qfi represents the number of drinks per day, averaged out over about  a year. Don't ask me what the "q" stands for, although the "f" is frequency, and the "i" is index.
The next column is age (a categorical variable) and the final is education (the higher the number, the more years of education.)

Repeated Alcohol Data from the NHANES set.  The random sample of respondents in the first NHANES study (1971) was followed up in subsequent years ('82, '87, '92).  These data contain their qfi scores, if available, for each of these years.  We are interested in describing how alcohol consumption patterns change over time.  (The qfi is an "alcohol consumption index" that roughly corresponds to the number of drinks PER DAY.  I say "roughly" because at no point is anyone asked how many drinks they have per day. Instead, they are asked how often they have "a" drink, and then, on those occasions, how many drinks they have.   And this is done separately for beer, wine, and liquor.)   DATA ARE SPACE DELIMITED.
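To make the "roughly drinks per day" idea concrete, here is a sketch of how a frequency-times-quantity index could be put together from the kinds of questions described. This is my own guess at the arithmetic, NOT the actual NHANES formula for the qfi; the function name, the per-year frequency unit, and the example answers are all assumptions.

```python
def drinks_per_day(answers):
    """Rough drinks-per-day figure from frequency and quantity answers.

    answers maps a beverage name to (occasions_per_year, drinks_per_occasion).
    Both the units and the formula are assumptions made for illustration.
    """
    drinks_per_year = sum(freq * qty for freq, qty in answers.values())
    return drinks_per_year / 365.0


# Hypothetical respondent: one beer every day, five glasses of wine on
# each of 73 occasions a year, and no liquor.
example = {"beer": (365, 1.0), "wine": (73, 5.0), "liquor": (0, 0.0)}
print(drinks_per_day(example))
```

Note how two very different drinking patterns (daily beer versus occasional heavy wine drinking) contribute the same amount to the average, which is one reason the index only roughly describes consumption.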

Sea Slugs -- data are the property of Patrick Krug, UCLA, and cannot be published or used without his permission.  Students of UCLA may use the data for classwork, but not for publication.  For permission, contact Dr. Krug at pkrug@protos.lifesci.ucla.edu.  The link takes you to an explanation file (with pictures!) and from there you can access the data.
 

Rainfall:  Data from p.54, Statistical Sleuth, Ramsey and Schafer.
Seeded = 1 means the cloud was not seeded
Seeded = 2 means the cloud was seeded
Rainfall is measured in acre-feet in a day via radar

"Data were collected in southern Florida between 1968 and 1972 to test a
hypothesis that massive injection of silver iodide into cumulus clouds can
lead to increased rainfall."  (Data from J. Simpson, A. Olsen, and J. Eden,
"A Bayesian Analysis of a Multiplicative Treatment Effect in Weather
Modification", Technometrics 17 (1975): 161-166.)