Introduction to Statistical Methods for the Life and Health Sciences
Objective: This lab will allow you to explore the effects of randomization.
Download http://www.stat.ucla.edu/~rgould/datasets/m12s00.dta
This is a Stata dataset (a .dta file), so it should open in Stata automatically.
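If the file does not open on its own, the following sketch shows one way to load and inspect it from the Command window. It assumes the URL above is still reachable and that your computer is online; if you saved the file to disk instead, put its path inside the quotes.

```stata
* Load the dataset directly from the course URL
* (clear discards any data currently in memory)
use "http://www.stat.ucla.edu/~rgould/datasets/m12s00.dta", clear

* List the variable names, storage types, and labels
describe

* Look at the first five observations
list in 1/5
```

The names that describe reports are the ones to use wherever the commands in this lab say varname.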
Recall that this dataset includes data collected from Prof. Gould’s Stats M12 course in Spring 2000. Students were asked seven questions:
1) Gender (m/f)
2) Height (inches)
3) Weight (pounds)
4) Do you smoke? (yes = 1, no = 0)
5) Who do you want for President? Bush, Gore, other
6) Rate your math ability: (1,2,3,4,5) 1 is much below average, 3 is average, 5 is much above average
7) Rate your math anxiety: (1,2,3,4,5) 1 is much below average, 3 is average, 5 is much above average
In the previous lab, you examined the relationship between gender and math anxiety. What did you conclude?
Suppose we now intend to conduct an experimental study on the whole sample, in which some subjects will receive tutoring in statistics. The goal of the experiment is to see whether tutoring reduces math anxiety. Of course, people have different levels of math anxiety, so it would be best if the treatment and control groups began the study with similar levels of math anxiety. If tutoring does reduce math anxiety, we would expect the treatment group to have a lower level of math anxiety than the control group by the end of the study.
1. What would happen if we decided to assign all the males to the treatment group (tutoring) and all the females to the control group? How might that assignment affect our conclusions about the effect of tutoring on math anxiety? What if we assigned all the females to the treatment group and all the males to the control group? How might that assignment affect our conclusions about the effect of tutoring on math anxiety?
The better way to assign subjects to treatment and control groups is to randomize by using an impartial, chance procedure. So let’s do that and see how our groups compare on this set of variables.
Type
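set seed # [sets the seed; replace # with a number of several digits. Stata commands must be typed in lowercase]
generate id = uniform() [creates a new variable called id; for each subject, id is a random number drawn uniformly between 0 and 1. In recent versions of Stata this function is named runiform()]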
You’ve now randomized each subject into one of two groups (treatment and control). Let’s say that all subjects with id <=.5 are in the treatment group; and all subjects with id > .5 are in the control group. Let’s compare the two groups on all of the variables in the dataset. Do you expect the two groups to be similar on these variables or different?
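As a minimal sketch of the assignment just described, the commands below also store the group in a 0/1 variable. The seed 20000412 is an arbitrary choice and treat is a hypothetical variable name, not one that comes with the dataset. If you have already created id, skip the first two commands.

```stata
* Any number works as a seed; 20000412 is an arbitrary example
set seed 20000412

* One uniform random draw per subject
* (uniform() is the older name; recent Stata versions call it runiform())
generate id = uniform()

* treat is a hypothetical name: 1 = treatment (id <= .5), 0 = control (id > .5)
generate treat = (id <= .5)

* Show how many subjects landed in each group
tabulate treat
```

Because the seed fixes the sequence of random numbers, rerunning these commands on a freshly loaded dataset with the same seed reproduces the same groups.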
2. Write down the sample average of each quantitative variable for the treatment and control groups.
For quantitative variables, type
summarize varname if id <= .5 [mean of variable in “treatment group”]
summarize varname if id > .5 [mean of variable in “control group”]
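For example, assuming the height and weight variables are named height and weight (these names are guesses; use whatever describe reports for your copy of the data), summarize accepts several variables at once:

```stata
* height and weight are assumed variable names; substitute the real ones
* Treatment group (id <= .5)
summarize height weight if id <= .5

* Control group (id > .5)
summarize height weight if id > .5
```

The Mean column of each table gives the sample average asked for above, and the Obs column shows how many subjects contributed to it.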
3. Write down the percentages of each category for the two groups.
For categorical variables, type
tabulate varname if id <= .5 [percentages in “treatment group”]
tabulate varname if id > .5 [percentages in “control group”]
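As an illustration, assuming the gender and smoking variables are named gender and smoke (hypothetical names; check yours with describe), tab1 produces a separate one-way table for each variable listed:

```stata
* gender and smoke are assumed variable names; substitute the real ones
* One-way tables for the treatment group (id <= .5)
tab1 gender smoke if id <= .5

* One-way tables for the control group (id > .5)
tab1 gender smoke if id > .5
```

The Percent column of each table holds the percentages to record.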
4. How many subjects were assigned to your treatment group? If it wasn’t exactly 50%, explain why.
5. What percentage of your treatment group is female? How does that percentage compare to the percentage of females in the whole sample?
6. How does your treatment group compare to your control group on math anxiety at the beginning of the study? What about the other variables?
7. Compare your results with your neighbor’s results. Explain any similarities and differences between your results and your neighbor’s results. What can you conclude about randomization?
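If no neighbor is available, one way to see the same effect is to count your groups and then repeat the randomization with a different seed. This is only a sketch: id2 is a hypothetical variable name and 98765 is an arbitrary seed.

```stata
* Count the subjects in each of your original groups
count if id <= .5
count if id > .5

* Draw a second, independent randomization with a different seed
set seed 98765
generate id2 = uniform()

* Count the subjects in the new treatment and control groups
count if id2 <= .5
count if id2 > .5
```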
Commands
summarize x if varname == "level" [when varname is a categorical variable stored as text; use straight quotes]
summarize x if varname <= number [when varname is a quantitative variable]
tabulate x if varname == "level" [when varname is a categorical variable stored as text; use straight quotes]
tabulate x if varname > number [when varname is a quantitative variable]
set seed #
generate varname = uniform()