R commands for Composite Experiment in Chapter 2 for STAT c151/c225. NOTE: only type (or copy and paste) the R command after > Read data into R > data<-read.table("http://www2.isye.gatech.edu/%7Ejeffwu/book/data/composite.dat", h=T) > data X40 X50 X60 1 25.66 29.15 35.73 2 28.00 35.09 39.56 3 20.65 29.79 35.66 Arrange responses by row > y <- c(t(data)) > y [1] 25.66 29.15 35.73 28.00 35.09 39.56 20.65 29.79 35.66 One factor of 3 levels > d <- rep(c(40,50,60), 3) > d [1] 40 50 60 40 50 60 40 50 60 The ANOVA Table > g<-lm(y ~ as.factor(d)) > anova(g) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) as.factor(d) 2 224.184 112.092 11.318 0.009198 ** Residuals 6 59.422 9.904 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 The linear regression and model matrix (with baseline constraints) > summary(g) Call: lm(formula = y ~ as.factor(d)) Residuals: Min 1Q Median 3Q Max -4.120 -1.553 -1.253 2.577 3.747 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 24.770 1.817 13.633 9.67e-06 *** as.factor(d)50 6.573 2.570 2.558 0.04301 * as.factor(d)60 12.213 2.570 4.753 0.00315 ** --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Residual standard error: 3.147 on 6 degrees of freedom Multiple R-Squared: 0.7905, Adjusted R-squared: 0.7206 F-statistic: 11.32 on 2 and 6 DF, p-value: 0.009198 > model.matrix(g) (Intercept) as.factor(d)50 as.factor(d)60 1 1 0 0 2 1 1 0 3 1 0 1 4 1 0 0 5 1 1 0 6 1 0 1 7 1 0 0 8 1 1 0 9 1 0 1 attr(,"assign") [1] 0 1 1 attr(,"contrasts") attr(,"contrasts")$"as.factor(d)" [1] "contr.treatment" Scaled linear constrast > linear <-rep(c(-1,0,1),3)/sqrt(2) Scaled quadratic contrast > quadratic <-rep(c(1,-2,1),3)/sqrt(6) The linear-quadratic model > g<-lm(y ~ linear + quadratic) > summary(g) Call: lm(formula = y ~ linear + quadratic) Residuals: Min 1Q Median 3Q Max -4.120 -1.553 -1.253 2.577 3.747 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 31.032 1.049 29.583 9.9e-08 *** linear 8.636 1.817 4.753 0.00315 ** quadratic -0.381 1.817 -0.210 0.84083 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Residual standard error: 3.147 on 6 degrees of freedom Multiple R-Squared: 0.7905, Adjusted R-squared: 0.7206 F-statistic: 11.32 on 2 and 6 DF, p-value: 0.009198 > anova(g) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) linear 1 223.748 223.748 22.593 0.003148 ** quadratic 1 0.436 0.436 0.044 0.840831 Residuals 6 59.422 9.904 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Or more directly, treat d is an ordered factor. The linear-quadratic contrasts (by forcing d as an ordered factor). > g<-lm(y ~as.ordered(d)) > summary(g) Call: lm(formula = y ~ as.ordered(d)) Residuals: Min 1Q Median 3Q Max -4.120 -1.553 -1.253 2.577 3.747 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 31.032 1.049 29.583 9.9e-08 *** as.ordered(d).L 8.636 1.817 4.753 0.00315 ** as.ordered(d).Q -0.381 1.817 -0.210 0.84083 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Residual standard error: 3.147 on 6 degrees of freedom Multiple R-Squared: 0.7905, Adjusted R-squared: 0.7206 F-statistic: 11.32 on 2 and 6 DF, p-value: 0.009198 > x<-model.matrix(g) > round( cbind(x[,1], x[,2]*sqrt(2), x[,3]*sqrt(6)), 4) # clearer [,1] [,2] [,3] 1 1 -1 1 2 1 0 -2 3 1 1 1 4 1 -1 1 5 1 0 -2 6 1 1 1 7 1 -1 1 8 1 0 -2 9 1 1 1 The same ANOVA table > anova(g) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) as.ordered(d) 2 224.184 112.092 11.318 0.009198 ** Residuals 6 59.422 9.904 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Since d is quantitative, we can fit a quadratic model directly > g<-lm(y ~ d + I(d^2)) > summary(g) Call: lm(formula = y ~ d + I(d^2)) Residuals: Min 1Q Median 3Q Max -4.120 -1.553 -1.253 2.577 3.747 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -10.856667 54.537949 -0.199 0.849 d 1.077333 2.228972 0.483 0.646 I(d^2) -0.004667 0.022253 -0.210 0.841 Residual standard error: 3.147 on 6 degrees of freedom Multiple R-Squared: 0.7905, Adjusted R-squared: 0.7206 F-statistic: 11.32 on 2 and 6 DF, p-value: 0.009198 NOTE: Both d and d^2 are not significant if we look at their p-values from the summary. However, we shall use ANOVA as follows since the estimates are correlated. We see d^2 is not significant but d is. > anova(g) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) d 1 223.748 223.748 22.593 0.003148 ** I(d^2) 1 0.436 0.436 0.044 0.840831 Residuals 6 59.422 9.904 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Prediction at d=55 > predict(g, data.frame(d=55)) [1] 34.28 Since d^2 is not sign't, we shall drop it and fit a model with d only > g<-lm(y ~ d) > summary(g) Call: lm(formula = y ~ d) Residuals: Min 1Q Median 3Q Max -4.276 -1.479 -1.242 2.421 4.058 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.4989 6.0481 0.082 0.93657 d 0.6107 0.1194 5.115 0.00138 ** --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Residual standard error: 2.924 on 7 degrees of freedom Multiple R-Squared: 0.7889, Adjusted R-squared: 0.7588 F-statistic: 26.17 on 1 and 7 DF, p-value: 0.001376 The predicted value at d=55. > predict(g, data.frame(d=55)) [1] 34.08556