Introduction to Statistical Methods for the Life and Health Sciences
Objectives:
(1) Learn how to use a dataset that is in “raw” form
(2) Learn how to use a dataset that is already in STATA form
(3) Practice summarizing datasets
(4) Learn some new STATA commands
STATA commands that you will find helpful today and in the future:
(For many of these commands, you can type as many variables as you want. Feel free to experiment; nothing bad will happen if you make a mistake.)
describe
graph varname1 varname2
chist varname1
summarize varname1 varname2, detail
help command
infile varname1 varname2 using "filename.raw"
use filename.dta
open filename.dta
list
clear
To test the effectiveness of the drug captopril as a blood-pressure treatment, 15 patients were given the same dose of the drug. Their blood pressure was measured before and after taking the drug. Use the data from http://www.stat.ucla.edu/~rgould/datasets/bloodpressure.raw
To read this existing “raw” data file into STATA, type
infile before after using "http://www.stat.ucla.edu/~rgould/datasets/bloodpressure.raw"
There are two variables in this dataset. The first variable is the systolic blood pressure before the drug (varname = before); the second variable is the systolic blood pressure after the drug (varname = after).
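A quick way to confirm that the file loaded as expected is sketched below. It assumes a working internet connection and that the file holds 15 rows of two numeric columns, as the description above implies. Lines beginning with * are comments.

```stata
* Clear any data currently in memory, then read the raw file.
clear
infile before after using "http://www.stat.ucla.edu/~rgould/datasets/bloodpressure.raw"
* Check the variable names and the number of observations (15 are expected).
describe
* Show the first five patients.
list in 1/5
```

If describe reports a different number of observations, the file may not have been read correctly; type clear and try the infile command again.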
A. Type
summarize
for a summary of these two variables. Describe the summary. See what happens when you type
summarize before, detail
B. Make histograms of the before and after blood pressures. For each of the two variables, type
chist varname1
Compare the shapes of the two histograms. Can you conclude anything about the effectiveness of captopril? Take into account the study design and provide any evidence, numerical or graphical, that you think supports your point.
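Because each patient was measured twice, one possible approach is to look at the change within each patient rather than at the two variables separately. The lines below are a minimal sketch of that idea, not the only valid answer. The variable name change is made up for this illustration and is not part of the dataset.

```stata
* Reload the data so this example can be run on its own.
clear
infile before after using "http://www.stat.ucla.edu/~rgould/datasets/bloodpressure.raw"
* "change" is a hypothetical name; negative values mean the pressure dropped.
generate change = after - before
* Summarize the per-patient changes, with percentiles.
summarize change, detail
* Inspect each patient's before, after, and change side by side.
list before after change
```

Think about whether a summary of change tells you something that the two separate histograms do not.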
Old Faithful is a geyser in Yellowstone National Park that is a popular tourist attraction. This geyser earned its name from the regularity of its eruptions. This regularity, of course, also makes it ideal as a tourist attraction. If a geyser erupts infrequently, and with no discernible pattern, then how are tourists to know when to visit?
Use data from http://www.stat.ucla.edu/~rgould/datasets/oldfath.dta
Note that this file is already in STATA format (it has the “.dta” ending). To use this file in STATA, type
use "http://www.stat.ucla.edu/~rgould/datasets/oldfath.dta"
C. Type
describe
How many variables are in this dataset? What are the variable names? How many observations are in the dataset?
D. Type
list
What does this command do?
E. The data record the length of time between eruptions (length) and the length of the last eruption (duration). Tourists usually want to know how long they will have to wait until an eruption. Based on this data, what would you tell them? Are there any unusual features in the distribution of the times between eruptions?
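One way to start on this question is sketched below. It assumes the variable names length and duration given above; confirm them with describe before relying on them.

```stata
* Load the Old Faithful data, replacing anything currently in memory.
use "http://www.stat.ucla.edu/~rgould/datasets/oldfath.dta", clear
* Confirm the variable names and the number of observations.
describe
* Mean, median, and other percentiles of the time between eruptions.
summarize length, detail
```

The percentiles reported by the detail option give a range of typical waiting times. A histogram of length, made the same way as in part B, can show features that the summary numbers alone may hide.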