R commands for Pulp Experiment in Chapter 2 for STAT c151/c225. NOTE: only type (or copy and paste) the R command after > Read data into R > data<-read.table("http://www2.isye.gatech.edu/%7Ejeffwu/book/data/pulp.dat", h=T) > data A B C D 1 59.8 59.8 60.7 61.0 2 60.0 60.2 60.7 60.8 3 60.8 60.4 60.5 60.6 4 60.8 59.9 60.9 60.5 5 59.8 60.0 60.3 60.5 The grand mean > mean(data) [1] 60.4 The sample mean for each treatment > apply(data, 2, mean) A B C D 60.24 60.06 60.62 60.68 Arrange the response by row > y<-c(t(data)) > y [1] 59.8 59.8 60.7 61.0 60.0 60.2 60.7 60.8 60.8 60.4 60.5 60.6 60.8 59.9 60.9 [16] 60.5 59.8 60.0 60.3 60.5 The design matrix for one-way layout (i.e., one factor of 4 levels) > d<-as.factor(rep(LETTERS[1:4], 5)) > d [1] A B C D A B C D A B C D A B C D A B C D Levels: A B C D We can use R to do regression and anova with suitable contrast coding for treatment factors. The zero-sum contraint is known as "contr.sum" in R. > options(contrasts=c("contr.sum", "contr.poly")) > contrasts(d) [,1] [,2] [,3] A 1 0 0 B 0 1 0 C 0 0 1 D -1 -1 -1 The linear model and estimates > g<-lm(y ~d) > g Call: lm(formula = y ~ d) Coefficients: (Intercept) d1 d2 d3 60.40 -0.16 -0.34 0.22 The model matrix > model.matrix(g) (Intercept) d1 d2 d3 1 1 1 0 0 2 1 0 1 0 3 1 0 0 1 4 1 -1 -1 -1 5 1 1 0 0 6 1 0 1 0 7 1 0 0 1 8 1 -1 -1 -1 9 1 1 0 0 10 1 0 1 0 11 1 0 0 1 12 1 -1 -1 -1 13 1 1 0 0 14 1 0 1 0 15 1 0 0 1 16 1 -1 -1 -1 17 1 1 0 0 18 1 0 1 0 19 1 0 0 1 20 1 -1 -1 -1 attr(,"assign") [1] 0 1 1 1 attr(,"contrasts") attr(,"contrasts")$d [1] "contr.sum" The ANOVA table > anova(g) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) d 3 1.34000 0.44667 4.2039 0.02261 * Residuals 16 1.70000 0.10625 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 The baseline constraint is known as "contr.treatment" in R. > options(contrasts=c("contr.treatment", "contr.poly")) > contrasts(d) B C D A 0 0 0 B 1 0 0 C 0 1 0 D 0 0 1 The linear model and estimates > g<-lm(y ~ d) > g Call: lm(formula = y ~ d) Coefficients: (Intercept) dB dC dD 60.24 -0.18 0.38 0.44 > model.matrix(g) (Intercept) dB dC dD 1 1 0 0 0 2 1 1 0 0 3 1 0 1 0 4 1 0 0 1 5 1 0 0 0 6 1 1 0 0 7 1 0 1 0 8 1 0 0 1 9 1 0 0 0 10 1 1 0 0 11 1 0 1 0 12 1 0 0 1 13 1 0 0 0 14 1 1 0 0 15 1 0 1 0 16 1 0 0 1 17 1 0 0 0 18 1 1 0 0 19 1 0 1 0 20 1 0 0 1 attr(,"assign") [1] 0 1 1 1 attr(,"contrasts") attr(,"contrasts")$d [1] "contr.treatment" > anova(g) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) d 3 1.34000 0.44667 4.2039 0.02261 * Residuals 16 1.70000 0.10625 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Note that the linear models and model matrices are different for the two constraints; however, the fitted values, residuals and the anova table are the same. R commands for plotting Figures 2.2, 2.4 and 2.7. > plot(g$fit, g$res) > plot(as.numeric(d), g$res) > qqnorm(g$res) Check this plot > plot(d, g$res)