Stat 210B: Homework Assignment 5

Due: Wednesday, March 12

I.  The data cpus contains a relative performance measure and characteristics of 200 CPUs.  The variables are

     syct: cycle time in nanoseconds
     mmin: minimum main memory in kilobytes
     mmax: maximum main memory in kilobytes
     cach: cache size in kilobytes
     chmin:  minimum number of channels
     chmax: maximum number of channels
     perf:  published performance on a benchmark mix relative to an IBM 370/158-3

Build a regression tree to predict performance, using log(perf) as the response and the other variables as the predictors. Then use the tree to predict the performance of a CPU with
   syct=23, mmin=16000, mmax=32000, cach=64, chmin=16, chmax=32
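
The prediction step can be sketched as follows. This is only an illustration under assumptions: it uses the full cpus data shipped with the MASS package as a stand-in for the course copy, default rpart settings, and the hypothetical object names fit, new_cpu and log_pred.

```r
library(rpart)
library(MASS)  # assumption: MASS::cpus stands in for the course's cpus data

# Regression tree on the log scale (a numeric response makes rpart use method = "anova")
fit <- rpart(log(perf) ~ syct + mmin + mmax + cach + chmin + chmax, data = cpus)

# The new CPU as a one-row data frame; column names must match the predictors
new_cpu <- data.frame(syct = 23, mmin = 16000, mmax = 32000,
                      cach = 64, chmin = 16, chmax = 32)

log_pred <- predict(fit, newdata = new_cpu)  # prediction of log(perf)
exp(log_pred)                                # back-transformed to the perf scale
```

Because the tree is fit to log(perf), the raw prediction is on the log scale; exp() returns it to the original units.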

II. The data fgl has 138 rows and 10 columns. It contains measurements on fragments of glass collected in forensic work.  The variables are

    RI: refractive index
    Na: sodium
    Mg: magnesium
    Al: aluminum
    Si: silicon
    K: potassium
    Ca: calcium
    Ba: barium
    Fe: iron
    type:  WinF, Veh, Con,  Tabl or Head.

Build a classification tree to predict the type of glass. Then use the tree to predict the type of a fragment with
RI=-1.69, Na=13.34, Mg=3.57, Al=1.57, Si=72.87, K=0.61, Ca=7.89, Ba=0, Fe=0

Hint: Your tree model should have at least one terminal node for each type.
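
One way to check the hint is to list the class assigned to each terminal node. The sketch below is an illustration under assumptions: it takes fgl from the MASS package and drops the WinNF rows so that the types match the five listed above, and the object names glass, fit, leaves and leaf_classes are hypothetical. It uses default rpart settings, so the tree it grows may not yet satisfy the hint.

```r
library(rpart)
library(MASS)  # assumption: MASS::fgl with WinNF removed stands in for the course data

glass <- droplevels(subset(fgl, type != "WinNF"))

fit <- rpart(type ~ ., data = glass, method = "class")

# Rows of the frame marked "<leaf>" are the terminal nodes;
# for a classification tree, yval indexes the class levels
leaves <- fit$frame[fit$frame$var == "<leaf>", ]
leaf_classes <- attr(fit, "ylevels")[leaves$yval]

# Number of terminal nodes per type; a zero means that type is never predicted
table(factor(leaf_classes, levels = levels(glass$type)))
```

If some type shows a count of zero, the tree needs to be grown larger before it can predict that type.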


Note:  To grow a larger tree, you can lower the options "cp" and/or "minsplit" in "rpart".  See help(rpart.control).
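
As an illustration of the note, the sketch below compares a tree grown with the default controls (cp = 0.01, minsplit = 20) to one grown with smaller values. The specific values 0.001 and 5 are arbitrary choices for demonstration, the MASS copy of cpus is an assumed stand-in for the course data, and n_leaves is a hypothetical helper.

```r
library(rpart)
library(MASS)  # assumption: MASS::cpus stands in for the course's cpus data

form <- log(perf) ~ syct + mmin + mmax + cach + chmin + chmax

# Default controls
fit_default <- rpart(form, data = cpus)

# Smaller cp and minsplit let rpart keep splitting (values chosen only for illustration)
fit_large <- rpart(form, data = cpus,
                   control = rpart.control(cp = 0.001, minsplit = 5))

# Hypothetical helper: count terminal nodes
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")

n_leaves(fit_default)
n_leaves(fit_large)

# Cross-validated error for each subtree size, useful when pruning back
printcp(fit_large)
```

A larger tree is not necessarily a better one; the table from printcp() can guide pruning with prune().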