The Data

 Data are at the heart of this case study.
 


 Before examining the data for information they may reveal, first think about what the data might
 look like and record your answers to these questions:

     1. Is the level of lead a continuous, discrete, or categorical variable?
 
 

     2. Describe what you think the histogram of lead levels for the Exposed group
        will look like (for example, symmetric, right-skewed, or left-skewed).

Now download the data.  Stata should start up automatically.  Answer these questions and provide supporting evidence (graphs, summaries, etc.) as needed.

1. Describe the sample distributions for the Exposed, Control, and Dif variables.
 Are there outliers or gaps?

 2. Would the mean or the median be a better choice for describing the central
 tendency of each of the distributions?  Why?

 3.  Does there appear to be a relation between lead levels in the Exposed and Control
 groups?

 4.  Which group has the higher levels of lead?  Which graphic do you think best
 displays this?

 5.  Typically, how much higher or lower would you say the Exposed group's blood
 lead level is than the Control group's?
 

6. What percentage of the observations in the Exposed group are higher than every
 observation in the Control group?

 
7. What percentage of the 33 pairs have higher lead levels in the Exposed group than in the
 Control group?

8.  What's a typical difference between the lead levels of the Exposed group and the
 Control group?

9.  Based on these numerical summaries and the graphs, what would you conclude
 about the effects of lead in the parents' workplace on the child's blood lead level?
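
The percentages and typical values asked for in questions 2 and 5 through 8 can also be checked outside Stata. The following is a minimal Python sketch, not the course's own method. It assumes the Exposed and Control columns have been read into two lists of equal length, one entry per matched pair, and that Dif is Exposed minus Control. All function names are hypothetical.

```python
from statistics import mean, median


def _check_paired(exposed, control):
    """Paired comparisons only make sense when both lists line up one-to-one."""
    if len(exposed) != len(control):
        raise ValueError("paired data need lists of equal length")


def pct_above_all_controls(exposed, control):
    """Percentage of Exposed values strictly greater than the largest Control value."""
    top_control = max(control)
    return 100 * sum(x > top_control for x in exposed) / len(exposed)


def pct_pairs_exposed_higher(exposed, control):
    """Percentage of matched pairs in which the Exposed value exceeds the Control value."""
    _check_paired(exposed, control)
    return 100 * sum(e > c for e, c in zip(exposed, control)) / len(exposed)


def paired_differences(exposed, control):
    """Exposed minus Control for each pair (assumed to match the Dif variable)."""
    _check_paired(exposed, control)
    return [e - c for e, c in zip(exposed, control)]


def center_summary(values):
    """Mean and median side by side, to help judge which describes the center better."""
    return {"mean": mean(values), "median": median(values)}
```

Once the columns are loaded, `center_summary(paired_differences(exposed, control))` gives two candidate answers for a typical difference. A large gap between the mean and the median suggests skewness or outliers, which bears on question 2.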
