1. Hw.
2. R Cookbook ch 1-3.
3. pi.r.

1. Hw1 is due Tue Oct 9 by email. See the course website, http://www.stat.ucla.edu/~frederic/202a/F12 .
Read ch. 5-6 for next class.

2. R Cookbook.

The top of p27 gives a simple example of a function.

f <- function(n,p) sqrt(p*(1-p)/n)
f(4, .5)

You can also make functions have default parameter values.

f = function(n,p=.5) sqrt(p*(1-p)/n)
f(4,.5)
f(4)
f(4,p=.5)

All 3 of the above are equivalent.

pp31-32, he notes that mean(dframe) and sd(dframe) go column by column, which is usually what you want. (Newer versions of R no longer do this for data frames; colMeans(dframe) and sapply(dframe, sd) give the column-by-column results.) var() gives the covariances between columns. Same with cov().

x = matrix(c(1,2,3,1,4,7,1,5,1),ncol=3) ## by default, it inputs them column by column.
cov(x)
var(x)

p33, seq() and : are useful. So is rep().

10:50
x = seq(from=1,to=50,by=2)
x
length(x)
x = seq(from=1,to=50,length=25) ## to be sure to hit 50 exactly.
x
length(x)
rep(1,5)
rep(c(1,3),5)
rep(c(1,3,5),c(1,2,3))
rep(c(1,3,5),c(1,0,3))
rep(c(1,3,5),times = c(1,0,3))

p34, a == pi evaluates whether a and pi are equal. If they're vectors, the result is a vector of trues and falses. != means not equal. <= means less than or equal to, but <- means assignment.

x = 1:5
x = 3
x
x = 1:5
x == 3
x[x==3]
x
x[x!=3]

You can also drop elements of x by position, using -.

x = seq(1,51,by=10)
x
x[-2]
x[-c(2,5)]
x[-c(1:6)[x > 40]]

p35, any() and all(). I didn't know about those. Kind of interesting. There's also which().

x = c(3,1,4,3,5)
any(x == 3)
all(x == 3)
which(x==3)

p36, note that you can select not just one element, like fib[5], but also a collection, like fib[c(1,2,4,8)]. This is extremely useful. You can say fib[c(1,2,4,8)] = rep(0,4) to change these values to 0.

fib = c(0,1,1,2,3,5,8,13,21,34)
fib
fib[c(1,2,4,8)] = rep(0,4)
fib
fib - 1

To change NA's to -1's you might say

y = sqrt(fib-1)
y
y[is.na(y)] = rep(-1,sum(is.na(y)))
y

x = c(-1,3,5,8,-2)
y = sqrt(x)
y
is.na(y)
y[is.na(y)] = rep(0,length(is.na(y))) ## note the error here.
## length(is.na(y)) is the same as the length of y, not the number of NA's.
y[is.na(y)] = rep(0,sum(is.na(y)))
y

Or, you can use the minus sign, as on p36.

x = c(-1,3,5,8,-2)
y = sqrt(x)
c(1:5)[is.na(y)] ## or which(is.na(y))
z = y[-c(1:5)[is.na(y)]] ## or z = y[-which(is.na(y))] ## or z = y[which(!is.na(y))]
z

Look at c(1:5)[is.na(y)] again. If we just did

1:5[is.na(y)] ## this does the wrong thing and gives you a warning.
## [ ] binds more tightly than :, so R reads this as 1:(5[is.na(y)]).

Even easier, you can just use the ! sign.

x = c(-1,3,5,8,-2)
y = sqrt(x)
z = y[!is.na(y)]
z

p38, note that arithmetic on vectors goes element by element.

w = (1:5)*10
w
w + 2
(w+2) * 10

p40 gives the order of operations in R. You might wonder what unary minus and plus are, and why they take precedence over multiplication or division. Unary means they act on a single operand. For instance, the minus in the number -3 just acts on the 3.

3 * - 4
3 * + 4

Anyway, the most important example is the one he shows on p41, 0:n-1, when n = 10. This does 0:n first, so it creates the vector from 0 to 10, and then subtracts 1 from each element, so it's from -1 to 9. This is a VERY common mistake.

n = 10
0:n
0:n-1
0:(n-1)
(0:(n-1)) ## it doesn't hurt to put parentheses around : expressions.

p41, %in% is interesting.

c(1:3) %in% c(14:24)
c(1:3) %in% c(14:24,2)

p42, functions are really important. They return the LAST expression, or you can specify return(x). Variables in a function are local. x in a function is different from x outside the function.

x = 3
f2 = function(n){
x = 4
n*x
}
f2(5)
x

f2 = function(n){
x <<- 4
n*x
}
f2(5)
x

p43, x <<- 3 makes it a global variable.

ca = function(b){x <<- 3; x*b}
x = 2
x
ca(4)
x

p44 tells you how to open a new editor window. This does not work on all platforms. I like to write all commands in alpha, my text editor of choice, and then copy and paste them into R.

The common mistakes on p46-49 are a must read. A common mistake is using =, which is for assignment, instead of the logical operator ==.

x
x[x=3]

p47, the problem

total = 1 + 2
+ 3 + 4 + 5

is really common.
This happens often when text editors wrap your code strangely, and then you cut and paste it. R treats the first line as a complete expression, so total is 3, and the second line is evaluated on its own. Note that if you do

total = 1 + 2 +
3 + 4 + 5

then the problem goes away. R is smart about that: a line ending in + is incomplete, so R keeps reading. A related problem is this, where a long comment gets wrapped and its second line is run as code:

total = 1 + 2 + 3 ## this will be the total of all my
elements in my dataset

p48, two common problems. aList[i] and aList[[i]], and & and &&.

aList = list(w = rep(0,4), x = 1:10, y = rep(3,12), z = 1:5)
aList[[2]] ## the second item in the list
aList[2] ## a list containing the second item, and usually not what you want.
x = aList[[2]]
y = aList[2]
x+1
y+1 ## fails: you can't do arithmetic on a list.
mode(x)
mode(y)

mode() is kinda nice. mode(3.1415) is "numeric", mode("foo") is "character".

help("&")

"& and && indicate logical AND and | and || indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector."

p48, &, |, && and || are for logical arguments.

x = c(1:10)
(x<6)
(x>3)
((x < 6) & (x > 3)) ## what you expect
((x < 6) && (x > 3)) ## evaluates only the first element. (In R 4.3.0 and later this is an error.)
y = x^2
y
if((x < 5) & (y < 5)) cat("good") ## Teetor says avoid this. (An error since R 4.2.0, because the condition has length > 1.)
if((x < 5) && (y < 5)) cat("good")

| means "or". Similar issues as with &.

End of chapter 2.
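
A quick sketch pulling together the two p48 pitfalls with any() and all() from p35. This is not from the book. The names both, all_small, any_small, item and sub are made up for illustration. The idea is to use & when you want one TRUE/FALSE per element, and to collapse that vector with all() or any() before handing it to if(), which wants a single TRUE/FALSE.

```r
x <- 1:10
y <- x^2

# Element-wise AND: one TRUE/FALSE per element of x.
both <- (x < 6) & (x > 3)
x[both]

# if() needs a single TRUE/FALSE, so collapse the logical vector first.
all_small <- all((x < 5) & (y < 5))   # TRUE only if every element passes
any_small <- any((x < 5) & (y < 5))   # TRUE if at least one element passes
if (any_small) cat("good\n")

# [[ ]] extracts the item itself; [ ] returns a list containing it.
aList <- list(w = rep(0, 4), x = 1:10, y = rep(3, 12), z = 1:5)
item <- aList[[2]]
sub <- aList[2]
item + 1      # fine: item is a numeric vector
mode(item)
mode(sub)
```

Writing all() or any() explicitly also says which one you meant, which the && version never did.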