1. Hw.
2. R Cookbook ch 1-3.
3. pi.r.

1. Hw1 is due Thur Oct 6 by email. See the course website, http://www.stat.ucla.edu/~frederic/202a/F11 . Read ch. 5-6 for next class.

2. R Cookbook.

The top of p27 gives a simple example of a function.
f <- function(n,p) sqrt(p*(1-p)/n)

ls() [p27] is useful. When you load a package, though, the functions don't appear in ls().

mode() [p51] is kind of nice. mode(3.1415) is "numeric", mode("foo") is "character".

pp31-32, he notes that mean(dframe) and sd(dframe) go column by column, which is usually what you want. In newer versions of R these no longer work on data frames; use colMeans(dframe) and sapply(dframe, sd) instead. var() gives the covariances between columns. Same with cov().

p33, seq() is useful. seq(from=1,to=5,by=2) for instance. rep(1,5) is really useful too.

p34, a == pi evaluates whether a and pi are equal. If they're vectors, the result is a vector of trues and falses. != means not equal. <= means less than or equal to, but <- means assignment.
x = c(1:5)
x = 3 ## assignment: x is now just 3
x
x = 1:5
x == 3 ## comparison: a vector of trues and falses
x[x==3]
x
x[x=3] ## x=3 is not a comparison here. Use == when you mean equality.

p35, any() and all(). I didn't know about those. Kind of interesting. There's also which().
x = c(3,1,4,3,5)
which(x==3)

p36, note that you can select not just one element, like fib[5], but also a collection, like fib[c(1,2,4,8)]. This is extremely useful. You can say
fib[c(1,2,4,8)] = rep(0,4)
to change these values to 0. For instance, to change NA's to 0's you might say
y[is.na(y)] = rep(0,sum(is.na(y)))

x = c(-1,3,5,8,-2)
y = sqrt(x)
y
is.na(y)
y[is.na(y)] = rep(0,length(is.na(y))) ## note the mistake here. length(is.na(y)) is the same as the length of y.
y[is.na(y)] = rep(0,sum(is.na(y)))
y

Or, you can use the minus sign, as on p36.
x = c(-1,3,5,8,-2)
y = sqrt(x)
c(1:5)[is.na(y)] ## or which(is.na(y))
z = y[-c(1:5)[is.na(y)]]
## or z = y[-which(is.na(y))]
## or z = y[which(!is.na(y))]
z

Look at c(1:5)[is.na(y)] again. If we just did
1:5[is.na(y)]
## this does the wrong thing and gives you a warning. The brackets are applied before the colon, so R computes 5[is.na(y)] first and then tries to do 1: of that.

Even easier, you can just use the ! sign.
x = c(-1,3,5,8,-2)
y = sqrt(x)
z = y[!is.na(y)]
z

p38, note that arithmetic on vectors goes element by element.

p40, you might wonder what unary minus and plus are, and why they take precedence over multiplication or division. Unary means they act on just one operand. For instance, the minus in the number -3 just acts on the 3.
3 * - 4
3 * + 4
Anyway, the most important example is the one he shows on p41, 0:n-1, when n = 10. This does 0:n first, so it creates the vector from 0 to 10, and then subtracts 1 from each element, so the result goes from -1 to 9.
n = 10
0:n
0:n-1
0:(n-1)
(0:(n-1)) ## it doesn't hurt to put parentheses around : expressions.

p41, %in% is interesting.
c(1:3) %in% c(14:24)
c(1:3) %in% c(14:24,2)

p42, functions are really important. They return the last expression, or return(x). Variables in a function are local. x in a function is different from x outside the function.
x = 3
f2 = function(n){
  x = 4
  n*x
}
f2(5)
x ## still 3
f2 = function(n){
  x <<- 4
  n*x
}
f2(5)
x ## now 4, because <<- assigns to the x outside the function

p43, opening a new editor window is not an option on all platforms. I like to write all commands in alpha, my text editor of choice, and then copy and paste them into R.

p47, the problem
total = 1 + 2 + 3
+ 4 + 5
is really common. This happens often when text editors wrap your code strangely, and then you cut and paste it. R treats the first line as a complete expression, so total is 6. Note that if you do
total = 1 + 2 + 3 +
4 + 5
then the problem goes away. R is smart about that: the + at the end of the line tells it the expression continues. A related problem is this:
total = 1 + 2 + 3 ## this will be the total of all my
elements in my dataset

p48, two common problems. aList[i] and aList[[i]], and & and &&.
aList = list(w = rep(0,4), x = 1:10, y = rep(3,12), z = 1:5)
aList[[2]] ## the second item in the list
aList[2] ## a list containing the second item, and usually not what you want.
x = aList[[2]]
y = aList[2]
x+1
y+1 ## error: y is a list, not a numeric vector
mode(x)
mode(y)
x = c(1:10)
((x < 6) & (x > 3)) ## right
((x < 6) && (x > 3)) ## wrong: && is meant for single TRUE/FALSE values, not vectors
x = 3
y = 7
if((x < 5) & (y > 5)) cat("good") ## seems to work, but Teetor says avoid this
if((x < 5) && (y > 5)) cat("good")
| means "or". The same issues come up with | and ||.

p51, setting the working directory is really important. You need this to read files or write to files and know where they go.

p54, search() lists all packages currently loaded [not just installed but loaded into the current session].
search()
library(MASS)
search()
detach(package:MASS)
search()

p57, datasets.
head(pressure) ## or just pressure or pressure[1,]
data() ## lists all the preloaded datasets.
Instead of data(Cars93, package="MASS") you could just do
library(MASS)
data(Cars93)

p58, library() lists all installed (but not necessarily loaded) packages. install.packages() is useful to install a new one.

p63, source() is sometimes useful, though it can have problems, especially if your comments go over lines, as in the example with
## this will be the total of all the elements in my dataset.
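
Here is a rough sketch of how the line-wrapping problems from p47 show up when you use source(). It is just an illustration, not something from the book. The temporary files and the names total_bad, total_good, env and result are made up for this example, and the line breaks are my guess at how an editor might wrap the code.

```r
## Sketch only: file names and variable names here are hypothetical.
bad_file = tempfile(fileext = ".r")
good_file = tempfile(fileext = ".r")
wrapped_file = tempfile(fileext = ".r")

## "+" at the start of the second line: R sees two separate expressions.
writeLines(c("total_bad = 1 + 2 + 3",
             "+ 4 + 5"), bad_file)

## "+" at the end of the first line: R knows the expression continues.
writeLines(c("total_good = 1 + 2 + 3 +",
             "4 + 5"), good_file)

## A comment that wraps onto the next line is no longer a comment there.
writeLines(c("total = 1 + 2 + 3 ## this will be the total of all my",
             "elements in my dataset"), wrapped_file)

## Run the files in their own environment so nothing in the workspace changes.
env = new.env()
source(bad_file, local = env)
source(good_file, local = env)

env$total_bad   ## 6
env$total_good  ## 15

## The wrapped comment cannot be parsed, so source() stops with an error.
result = try(source(wrapped_file, local = env), silent = TRUE)
inherits(result, "try-error")  ## TRUE

unlink(c(bad_file, good_file, wrapped_file))
```

The first file runs without any complaint, which is what makes that mistake hard to spot. The wrapped comment at least fails loudly.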