Homework


For all homework, please write your name and the homework number at the top of the first page.  Your homework MUST be stapled. You can buy a stapler for as little as $1.88, so there is no reason not to do so.


Previous HW Assignments

R-Tip of the week:

R tutorials:
Intro to basic commands
Intro to writing functions
Official R Introduction

HW 7 Due Friday, March 4 (note that this is in two weeks)

A.  
i) Using the mussel data (mussels.short), fit a linear model that uses food level to predict the thickness of the mussel beds.  Comment on how well the data fit the model.  No need to make adjustments; just check the diagnostics and comment. NOTE: to load this data into R, type
whatevernameyouwant <- read.table("mussels.short", header=TRUE)

ii) Use added variable plots to determine which of these variables (temp, waves, human.use) contribute to predicting the thickness of the mussel bed once the food level is already known.
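
As a reminder of how an added variable plot is put together, here is a minimal sketch on simulated data. The data frame and its column names (thickness, food, temp) are made up for illustration and are not necessarily the names in mussels.short; substitute your own.

```r
# Simulated stand-in data; the column names are hypothetical and are
# NOT taken from mussels.short.
set.seed(1)
n <- 60
food <- runif(n, 0, 10)
temp <- 0.5 * food + rnorm(n)                       # correlated with food
thickness <- 2 + 1.5 * food + 0.8 * temp + rnorm(n)
dat <- data.frame(thickness, food, temp)

# Step 1: the part of the response that food does not explain.
res_y <- resid(lm(thickness ~ food, data = dat))

# Step 2: the part of the candidate predictor that food does not explain.
res_x <- resid(lm(temp ~ food, data = dat))

# Step 3: plot one set of residuals against the other and add the
# least-squares line.
plot(res_x, res_y,
     xlab = "temp | food", ylab = "thickness | food",
     main = "Added variable plot for temp")
av_fit <- lm(res_y ~ res_x)
abline(av_fit)

# The slope of that line matches the coefficient of temp in the model
# that contains both predictors.
full_fit <- lm(thickness ~ food + temp, data = dat)
c(av_slope  = unname(coef(av_fit)[2]),
  full_coef = unname(coef(full_fit)["temp"]))
```

A clear linear trend in the plot suggests the candidate variable adds information beyond food level; a shapeless cloud suggests it does not. If the car package is installed, its avPlots function draws these plots directly from a fitted model.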

B.
First, download the "fat" dataset.  This dataset consists of a series of R commands, so to load it, all you need to do is save the file in your working directory, start up R, and type source("fat.html").  Next, type ls() and you'll see a data object named fat.  Type names(fat) and you'll see the variables.

This homework exercise is taken from an article called "Fitting Percentage of Body Fat to Simple Body Measurements", Roger W. Johnson, Journal of Statistics Education v.4, n.1 (1996).  In a nutshell: there is a fairly accurate way of measuring the percentage of body fat a person has, and it involves immersing that person in a large tank of water. This is obviously not practical for home use, or even the doctor's office, and so it is expedient to have another method. This lab involves building a model that uses easy-to-measure aspects of a person to predict their percentage of body fat.

From the article:

Abstract


Percentage of body fat, age, weight, height, and ten body circumference measurements (e.g., abdomen) are recorded for 252 men. Body fat, one measure of health, has been accurately estimated by an underwater weighing technique. Fitting body fat to the other measurements using multiple regression  provides a convenient way of estimating body fat for men using only a scale and a measuring tape. This dataset can be used to show students the utility of multiple regression and to provide  practice in model building.

1. Introduction


1 A variety of popular health books suggest that readers assess their health, at least in part, by estimating their percentage of body fat.  Bailey (1994, pp. 179-186), for instance, presents tables of estimates based on age, gender, and various skinfold measurements obtained using a caliper.  Bailey (1991, p. 18) suggests that "15 percent fat for men and 22 percent fat for women are maximums for good health."  Behnke and Wilmore (1974, pp. 66-67), Wilmore (1976, p. 247), Katch and McArdle (1977, pp. 120-132), and Abdel-Malek, et al. (1985) are other sources of predictive equations for body fat. These predictive equations use  skinfold measurements, body circumference measurements (e.g., abdominal circumference), and, in the Abdel-Malek article, simply height and weight.  Gardner and Poehlman (1993, 1994) supplement these body measurements with a measure of physical activity to predict body density from which, as we shall see below, body fat can be estimated.

2 Such predictive equations for the determination of body fat  can be determined through multiple regression. A group of subjects is gathered, and various body measurements and an accurate estimate of the percentage of body fat are recorded for each. Then body fat can be fit to the other measurements using multiple regression, giving, we hope, a useful predictive equation for people similar to the subjects. The various measurements other than body fat recorded on the subjects are, implicitly, ones that are easy to obtain and serve as proxies for body fat, which is not so easily obtained.

3 In the dataset provided by Dr. A. Garth Fisher (personal communication, October 5, 1994), age, weight, height, and 10 body circumference measurements are recorded for 252 men. Each man's percentage of body fat was accurately estimated by an underwater weighing technique discussed below. A complete listing of the variables in the dataset appears in the Appendix.

2. Determination of the Percentage of Body Fat from Underwater Weighing


4 The percentage of body fat for an individual can be estimated from body density. As an approximation, assume that the body consists of two components -- lean tissue and fat tissue. Letting
     D = body density,
     W = body weight,
     A = proportion of lean tissue,
     B = proportion of fat tissue (so A + B = 1),
     a = density of lean tissue, and
     b = density of fat tissue,

 we have
     D = weight/volume
       = W/[lean tissue volume + fat tissue volume]
       = W/[A*W/a + B*W/b]
       = 1/[(A/a) + (B/b)]. 

Solving for B we find
     B = (1/D) * [ab/(a - b)] - [b/(a - b)].

5 Using the estimates a = 1.10 gm/cm^3 and b = 0.90 gm/cm^3 (see Katch and McArdle 1977, p. 111, or Wilmore 1976, p. 123), we come up with "Siri's equation" (Siri 1956):
Percentage of body fat (i.e., 100 * B) = 495/D - 450,

where D is in units of gm/cm^3. The dataset provided also gives a second estimate of body fat due to Brozek, Grande, Anderson, and Keys (1963, p. 137):
Percentage of body fat = 457/D - 414.2,

which is considered accurate for "individuals in whom the body weight has been free from large, recent fluctuations." There does not seem to be uniform agreement in the literature as to which of these two methods is best.
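
The two conversions above are easy to try out in R. The sketch below is only an illustration: the function names and the example densities are made up, and the last column checks numerically that Siri's equation agrees with the general two-component formula for B when a = 1.10 and b = 0.90.

```r
# Fraction of fat tissue from the two-component model:
#   B = (1/D) * ab/(a - b) - b/(a - b), densities in gm/cm^3.
fat_fraction <- function(D, a = 1.10, b = 0.90) {
  (1 / D) * (a * b) / (a - b) - b / (a - b)
}

# Siri's equation: percentage of body fat from density D.
siri <- function(D) 495 / D - 450

# Brozek et al. equation: percentage of body fat from density D.
brozek <- function(D) 457 / D - 414.2

# Hypothetical densities (gm/cm^3), chosen only for illustration.
D <- c(1.03, 1.05, 1.07, 1.09)
data.frame(D,
           siri   = siri(D),
           brozek = brozek(D),
           check  = 100 * fat_fraction(D))   # should match the siri column
```

At the two extremes the formula behaves as expected: a density equal to that of lean tissue (1.10) gives 0 percent fat, and a density equal to that of fat tissue (0.90) gives 100 percent.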

6 Volume, and hence the body density D, can be accurately measured in a variety of ways. The technique of underwater weighing "computes body volume as the difference between body weight measured in air and weight measured during water submersion.  In other words, body volume is equal to the loss of weight in water with the appropriate temperature correction for the water's density" (Katch and McArdle 1977, p. 113). Using this technique,
Body density = W/[(W - WW)/c.f. - LV],

     where

     W = weight in air (kg)
     WW = weight in water (kg)
     c.f. = water correction factor
            (equal to 1 at 39.2 degrees F because one gram of
            water occupies exactly one cm^3 at this temperature,
            equal to .997 at 76-78 degrees F)
     LV = residual lung volume (liters)

(Katch and McArdle 1977, p. 115). The dataset provided here contains the weights of the subjects, but not the values of the three other quantities. Other methods of determining body volume are given in Behnke and Wilmore (1974, p. 22 ff.).
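
To see how the underwater weighing formula works, here is a small R sketch. The function name and the example measurements are hypothetical (the dataset does not include WW, c.f., or LV), so this only illustrates the arithmetic.

```r
# Body density from underwater weighing:
#   D = W / [(W - WW)/c.f. - LV]
# W and WW in kg, LV in liters; cf is the water correction factor.
body_density <- function(W, WW, LV, cf = 0.997) {
  volume <- (W - WW) / cf - LV   # body volume in liters
  W / volume                     # kg per liter, numerically equal to gm/cm^3
}

# Hypothetical measurements, for illustration only.
D <- body_density(W = 80, WW = 3.5, LV = 1.3)
D
495 / D - 450   # Siri's percentage of body fat for this density
```

Note that a larger residual lung volume LV reduces the computed body volume and therefore raises the computed density.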


Questions for problem B.

1. Examine the data and note any unusual cases.  What should be done about the unusual cases?  (The "about" file mentions one unusual case for which it is not hard to figure out the correction.)

2. Choose one of the two percentage of body fat estimates (Brozek or Siri).  Fit the percentage of body fat to some subset of the provided variables, but do not use density (which is too hard to measure). You need not describe every idea you try, but should justify your final choice using appropriate diagnostic tools and the like.

3. September 14, 1995 articles in The New England Journal of Medicine link high values of the adiposity index (weight/height^2), sometimes called the body mass index, to increased risk of premature death. See if this variable is useful in your model. Also try weight^1.2/height^3.3 as suggested in Abdel-Malek, et al. (1985).

4.  Comment on the predictive accuracy of your model.  For example, how far off is an individual likely to be?

5. Estimate the percentage of US men whose body fat is less than 15% (which some experts say is the maximum for good health).  What assumptions must you make?
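
Two of the computations in the questions above can be sketched in R. Everything in this sketch is a stand-in: the data are simulated, the column names (weight, height, bodyfat) and the units (pounds, inches) are assumptions, and the normal approximation at the end is just one possible approach. Check names(fat) and the "about" file before adapting it.

```r
# Simulated stand-in data; column names and units are assumptions.
set.seed(2)
toy <- data.frame(weight  = rnorm(100, 180, 25),   # pounds (assumed)
                  height  = rnorm(100, 70, 2.5),   # inches (assumed)
                  bodyfat = rnorm(100, 19, 8))     # percent

# Question 3: adiposity index (weight/height^2) in kg/m^2, and the
# weight^1.2/height^3.3 ratio computed in the original units.
kg <- toy$weight * 0.45359237    # pounds to kilograms
m  <- toy$height * 0.0254        # inches to meters
toy$bmi <- kg / m^2
toy$am  <- toy$weight^1.2 / toy$height^3.3

# Question 5: two simple estimates of the proportion below 15 percent.
p_empirical <- mean(toy$bodyfat < 15)              # sample proportion
p_normal    <- pnorm(15,                           # normal approximation
                     mean = mean(toy$bodyfat),
                     sd   = sd(toy$bodyfat))
c(empirical = p_empirical, normal = p_normal)
```

Changing the units of weight or height only rescales either ratio by a constant, so the fit of a linear model that includes it is unaffected; only the size of its coefficient changes.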