Homework
For all homework, please write your name and the homework number at the
top of the first page. Your homework MUST be stapled. You can buy
a stapler for as little as $1.88,
so there is no reason not to do so.
Previous HW Assignments
R-Tip of the week:
R tutorials: Intro to basic commands, Intro to writing functions, Official R Introduction
HW 7 Due Friday, March 4 (note that this is in two weeks)
A.
i) Using the mussel data (mussels.short), fit a linear model that uses food
level to predict the thickness of the mussel beds. Comment on how well the
data fit the model. No need to make adjustments; just check the diagnostics
and comment. NOTE: to load this data into R, you should type
whatevernameyouwant <- read.table("mussels.short", header=TRUE)
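As a rough sketch of what "fit and check the diagnostics" can look like in R, the block below fits a simple linear model and draws the standard diagnostic plots. It runs on simulated stand-in data so that it works without the file; the column names food and thickness are assumptions, so check names() on the real data before adapting it.

```r
# Simulated stand-in for the mussel data. With the real file you would use
# read.table("mussels.short", header = TRUE) instead.
# The column names 'food' and 'thickness' are assumptions.
set.seed(1)
mussels <- data.frame(food = runif(40, 0, 10))
mussels$thickness <- 2 + 0.5 * mussels$food + rnorm(40, sd = 1)

# Simple linear regression of thickness on food level
fit <- lm(thickness ~ food, data = mussels)
summary(fit)   # slope, R-squared, residual standard error

# Standard diagnostics: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)
```

Curvature or a funnel shape in the residuals-vs-fitted panel, or strong departures from the line in the Q-Q panel, are the usual things to comment on.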
ii) Use an added variable plot to determine which of these variables (temp,
waves, human.use) will contribute to predicting the thickness of the mussel
bed if we already know the food level.
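An added variable plot can be built by hand from two sets of residuals, which may make clearer what the plot shows. The sketch below uses simulated stand-in data; the helper added_variable() and the column names food and thickness are hypothetical, not part of the assignment files.

```r
# Simulated stand-in data; column names 'food' and 'thickness' are assumptions.
set.seed(2)
n <- 40
mussels <- data.frame(food = runif(n, 0, 10),
                      temp = rnorm(n, 15, 3),
                      waves = rnorm(n),
                      human.use = rnorm(n))
mussels$thickness <- with(mussels, 2 + 0.5 * food + 0.3 * temp + rnorm(n))

# Hypothetical helper: plot the part of 'response' not explained by 'known'
# against the part of 'candidate' not explained by 'known'.
added_variable <- function(response, candidate, known, data) {
  ry <- resid(lm(reformulate(known, response), data = data))
  rx <- resid(lm(reformulate(known, candidate), data = data))
  plot(rx, ry,
       xlab = paste(candidate, "adjusted for", known),
       ylab = paste(response, "adjusted for", known))
  abline(lm(ry ~ rx))
  invisible(data.frame(rx = rx, ry = ry))
}

av <- added_variable("thickness", "temp", "food", mussels)
```

A clear linear trend in the plot suggests the candidate variable adds information beyond food level; a flat, shapeless cloud suggests it does not. The slope of the fitted line equals the coefficient the candidate would receive if it were added to the model.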
B.
First, download the "fat"
dataset. This dataset consists of a series of R commands, and
so to download it, all you need to do is save the file in your
working directory, start up R, and type source("fat.html").
Next, type ls() and you'll see a data object named
fat. Type names(fat) and you'll see the variables.
This homework exercise is taken from an article called "Fitting Percentage
of Body Fat to Simple Body Measurements", Roger W. Johnson, Journal of
Statistics Education v.4, n.1 (1996). In a nutshell: there is a fairly
accurate way of measuring the percentage of body fat a person has, and it
involves immersing that person in a large tank of water. This is obviously
not practical for home use, or even the doctor's office, and so it is
expedient to have another method. This lab involves building a model that
uses easy-to-measure aspects of a person to predict their percentage of
body fat.
From the article:
Abstract
Percentage of body fat, age, weight, height, and ten body circumference
measurements
(e.g., abdomen) are recorded for 252 men. Body fat, one measure of
health,
has been accurately estimated by an underwater weighing technique.
Fitting
body fat to the other measurements using multiple regression
provides
a convenient way of estimating body fat for men using only a scale and
a
measuring tape. This dataset can be used to show students the utility
of
multiple regression and to provide practice in model building.
1. Introduction
1 A variety of popular health books suggest that readers assess their
health,
at least in part, by estimating their percentage of body fat.
Bailey
(1994, pp. 179-186), for instance, presents tables of estimates based
on
age, gender, and various skinfold measurements obtained using a
caliper.
Bailey (1991, p. 18) suggests that "15 percent fat for men and 22
percent
fat for women are maximums for good health." Behnke and Wilmore
(1974,
pp. 66-67), Wilmore (1976, p. 247), Katch and McArdle (1977, pp.
120-132),
and Abdel-Malek, et al. (1985) are other sources of predictive
equations
for body fat. These predictive equations use skinfold
measurements,
body circumference measurements (e.g., abdominal circumference), and,
in
the Abdel-Malek article, simply height and weight. Gardner and
Poehlman
(1993, 1994) supplement these body measurements with a measure of
physical
activity to predict body density from which, as we shall see below,
body
fat can be estimated.
2 Such predictive equations for the determination of body fat can
be
determined through multiple regression. A group of subjects is
gathered,
and various body measurements and an accurate estimate of the
percentage
of body fat are recorded for each. Then body fat can be fit to the
other
measurements using multiple regression, giving, we hope, a useful
predictive
equation for people similar to the subjects. The various measurements
other
than body fat recorded on the subjects are, implicitly, ones that are
easy
to obtain and serve as proxies for body fat, which is not so easily
obtained.
3 In the dataset provided by Dr. A. Garth Fisher (personal
communication,
October 5, 1994), age, weight, height, and 10 body circumference
measurements
are recorded for 252 men. Each man's percentage of body fat was
accurately
estimated by an underwater weighing technique discussed below. A
complete
listing of the variables in the dataset appears in the Appendix.
2. Determination of the Percentage of Body Fat from Underwater
Weighing
4 The percentage of body fat for an individual can be estimated from
body
density. As an approximation, assume that the body consists of two
components
-- lean tissue and fat tissue. Letting
D = body density,
W = body weight,
A = proportion of lean tissue,
B = proportion of fat tissue (so A + B = 1),
a = density of lean tissue, and
b = density of fat tissue,
we have
D = weight/volume
  = W/[lean tissue volume + fat tissue volume]
  = W/[A*W/a + B*W/b]
  = 1/[(A/a) + (B/b)].
Solving for B we find
B = (1/D) * [ab/(a - b)] - [b/(a - b)].
5 Using the estimates a = 1.10 gm/cm^3 and b = 0.90 gm/cm^3 (see Katch
and
McArdle 1977, p. 111, or Wilmore 1976, p. 123), we come up with "Siri's
equation"
(Siri 1956):
Percentage of body fat (i.e., 100 * B) = 495/D - 450,
where D is in units of gm/cm^3. The dataset provided also gives a
second
estimate of body fat due to Brozek, Grande, Anderson, and Keys (1963,
p.
137):
Percentage of body fat = 457/D - 414.2,
which is considered accurate for "individuals in whom the body weight
has
been free from large, recent fluctuations." There does not seem to be
uniform
agreement in the literature as to which of these two methods is best.
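To see where the constants 495 and 450 come from, the following R sketch codes the general two-component formula for B next to the Siri and Brozek equations. The function names fat_fraction, siri, and brozek are made up for illustration; the density 1.05 is just an example value.

```r
# Proportion of fat tissue from body density D (gm/cm^3), given the
# densities a (lean tissue) and b (fat tissue):
#   B = (1/D) * [ab/(a - b)] - [b/(a - b)]
fat_fraction <- function(D, a = 1.10, b = 0.90) {
  (1 / D) * (a * b / (a - b)) - b / (a - b)
}

# Siri's equation: percentage of body fat
siri <- function(D) 495 / D - 450

# Equation of Brozek et al.: percentage of body fat
brozek <- function(D) 457 / D - 414.2

# With a = 1.10 and b = 0.90 the general formula reproduces Siri's constants
100 * 1.10 * 0.90 / (1.10 - 0.90)   # multiplier of 1/D
100 * 0.90 / (1.10 - 0.90)          # constant that is subtracted

D <- 1.05                           # example density
c(general = 100 * fat_fraction(D), siri = siri(D), brozek = brozek(D))
```

The general formula and Siri's equation agree because Siri's equation is the same expression with a and b filled in; the Brozek estimate differs slightly at the same density.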
6 Volume, and hence the body density D, can be accurately measured in a
variety
of ways. The technique of underwater weighing "computes body volume as
the
difference between body weight measured in air and weight measured
during
water submersion. In other words, body volume is equal to the
loss
of weight in water with the appropriate temperature correction for the
water's
density" (Katch and McArdle 1977, p. 113). Using this technique,
Body density = W/[(W - WW)/c.f. - LV],
where
W = weight in air (kg)
WW = weight in water (kg)
c.f. = water correction factor (equal to 1 at 39.2 degrees F because one gram
of water occupies exactly one cm^3 at this temperature, equal to .997 at
76-78 degrees F)
LV = residual lung volume (liters)
(Katch and McArdle 1977, p. 115). The dataset provided here contains
the
weights of the subjects, but not the values of the three other
quantities.
Other methods of determining body volume are given in Behnke and
Wilmore
(1974, p. 22 ff.).
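The underwater weighing formula above can be written as a short R function. This is only a sketch: the name body_density and the example measurements are invented for illustration, since the dataset itself does not include WW, c.f., or LV.

```r
# Body density from underwater weighing:
#   D = W / [(W - WW)/c.f. - LV]
# W  = weight in air (kg), WW = weight in water (kg),
# cf = water correction factor, LV = residual lung volume (liters)
body_density <- function(W, WW, cf, LV) {
  volume <- (W - WW) / cf - LV   # body volume in liters
  W / volume                     # kg per liter, numerically equal to gm/cm^3
}

# Made-up measurements for one subject, with water at 76-78 degrees F
D <- body_density(W = 80, WW = 4, cf = 0.997, LV = 1.3)
D

# Percentage of body fat from Siri's equation
495 / D - 450
```

Note that a larger residual lung volume LV reduces the computed body volume and therefore raises the estimated density.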
Questions for problem B.
1. Examine the data and note any unusual cases. What should be done
about the unusual cases? (The "about" file mentions one case in which
it is not hard to figure out how to correct the unusual value.)
2. Choose one of the two percentage of body fat estimates (Brozek or
Siri).
Fit the percentage of body fat to some subset of the provided
variables,
but do not use density (which is too hard to measure). You need not
describe
every idea you try, but should justify your final choice using
appropriate
diagnostic tools and the like.
3. Articles in the September 14, 1995 issue of The New England Journal of
Medicine link high values of the adiposity index (weight/height^2),
sometimes called the body mass index, to increased risk of premature death.
See if this variable is useful in your model. Also try
weight^1.2/height^3.3 as suggested in Abdel-Malek, et al. (1985).
4. Comment on the predictive accuracy of your model. For
example,
how far off is an individual likely to be?
5. Estimate the percentage of US men whose body fat is less than 15% (which
some experts say is the maximum for good health). What assumptions
must you make?
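The mechanics behind questions 3 through 5 can be sketched in R as follows. This is one possible outline, not a model answer: it runs on simulated stand-in data, and the data frame fat_sim, its column names, and its units (weight in kg, height in m, abdom in cm, siri in percent) are all assumptions. With the real data, check names(fat) and the units first.

```r
# Simulated stand-in for the fat data; every name and unit is an assumption.
set.seed(3)
n <- 252
fat_sim <- data.frame(weight = rnorm(n, 81, 12),
                      height = rnorm(n, 1.78, 0.07))
fat_sim$abdom <- 40 + 0.65 * fat_sim$weight + rnorm(n, sd = 5)
fat_sim$siri  <- -35 + 0.6 * fat_sim$abdom + rnorm(n, sd = 4)

# Question 3: derived predictors
fat_sim$bmi <- fat_sim$weight / fat_sim$height^2        # adiposity index
fat_sim$am  <- fat_sim$weight^1.2 / fat_sim$height^3.3  # Abdel-Malek form

fit <- lm(siri ~ abdom + bmi, data = fat_sim)
summary(fit)

# Question 4: typical prediction error and a 95% prediction interval
sigma(fit)                                   # residual standard error
new_man <- data.frame(abdom = 95, bmi = 26)  # hypothetical individual
pred <- predict(fit, newdata = new_man, interval = "prediction", level = 0.95)
pred

# Question 5: two rough estimates of the share below 15 percent
mean(fat_sim$siri < 15)                                      # empirical
pnorm(15, mean = mean(fat_sim$siri), sd = sd(fat_sim$siri))  # normal approx.
```

The prediction interval, not the confidence interval for the mean, is the relevant one for "how far off is an individual likely to be". Both estimates for question 5 assume that the sample is representative of US men, and the second also assumes that body fat is roughly normally distributed.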