1. Hw3, Nov 3, final projects. 2. XCode 3. Compiling C and calling C from R. 4. Hello world. 5. Approximating pi in C. 6. dnorm in C. 7. sum of squared differences between observations. 8. sum of squared differences from observations for each gridpoint. 1. Hw3 and final projects. -- Hw3 is on the course website http://www.stat.ucla.edu/~frederic/202a/F11 and is due Tue, Nov 8, 1030am. Late homeworks will not be accepted! -- No class or office hours Thur Nov 3, due to the mandatory Statistics faculty retreat. -- Final project teams and oral report order. Tues, Nov 29 1 MENON, TULASI. 2 MONROE, SCOTT. 3 LIU, XUENING. ERLIKHMAN, GENNADY. DABAGH, SAFAA. 4 BEYOR, ALEXA. BANSAL, RAHUL. JOHNSTON, RACHEL. 5 MATTHEWS, JERRID. YING, VICTOR. GARAND, JOSEPH. 6 CHEN, YU-CHING. WANG, ZHAO. LEE, JONGYOON. 7 BAI, HAN. BRUMBAUGH, STEPHEN. DEL VALLE, ARTURO. 8 LY, DIANA. DALININA, RUSLANA. JIANG, FEIFEI. 9 GONYEA, TIANQING. TONG, XIAOXIAO. CHEN, LICHAO. Thur, Dec 1 10 MOLYNEUX, JAMES. YANG, HO-SHUN. YIN, KEVIN. 11 XIE, XIAOXI. PAVLOVSKAIA, MARIA. MORALES, EMMANUEL. 12 AGRAWAL, PALASH. CHEN, MENGNA. TSIOKOS, CHRISTOS. 13 CHANG, BO. 14 MCLAUGHLIN, KATHERINE. NG, LUNG. CRAWFORD, TIMOTHY. 15 CHEN, YUE. MCCARTHY, PATRICK. LAM, HO YEUNG. 16 LU, WEIPENG. WU, YI. LI, SHANSHAN. 17 WRIGHT, JENNIFER. CHEN, CHEN. SONG, XI. 18 EMBREE, JOSHUA. 19 GARCIA, EDUARDO. GORDON, JOSHUA. WONG, ALBERT. 2. XCode, for Mac OSX, includes a C and C++ compiler, among other tools. You can get XCode from http://wildfire.stat.ucla.edu/rick/xcode . Click on the one choice, xcode_3.2.6_and_ios, to get it. I will only leave it up there for a week or so. It takes a couple hours to download. If you have a PC, you will need a compiler. There are many free ones. See for instance http://www.compilers.net/dir/free/compilers/ccpp.htm . 3. Compiling C and calling C from R. The first step to writing C code is opening a text editor. Write your C code, call it something.c, then compile it, to create an object file called something.so. Then you can load that into R. It seems to primarily work with R from terminal, but you can run C in R in terminal, do save.image(), and then start R, use the same working directory, and then do load(".RData") . 4. Hello world. Create a C function to print "Hello world!" and call this function n times in R. 4a. In a text editor, create a C file. Say it's called hello.c The file looks like: #include #include void hello (int *n) { int i; for(i = 0; i < *n; i++) { Rprintf("Hello world!", "I'm here\n"); } } 4b. In UNIX, in same directory where hello.c is, type R CMD SHLIB hello.c Note: these are all "oh"s, not zeros, in this line above. or, in R, do system("R CMD SHLIB hello.c") 4c. In R, in the same directory, do dyn.load("hello.so") hello2 <- function(n){ .C("hello",as.integer(n)) } y = hello2(10) 5. Approximate pi in C. In mypi.c, #include #include void pi2 (int *n, double *y){ int i; double x[*n]; x[0] = 1.0; y[0] = sqrt(6.0); for(i = 1; i < *n; i++) { x[i] = x[i-1] + 1.0 / ((i+1.0)*(i+1.0)); /* or x[i] = x[i-1] + 1.0 / pow(i+1.0,2.0); */ y[i] = sqrt(6.0 * x[i]); } } In R, system("R CMD SHLIB mypi.c") dyn.load("mypi.so") pi3 = function(n){ .C("pi2",as.integer(n), y = double(n)) } b = pi3(1000000) b$y[1000000] Note that you have to be incredibly careful in C when doing arithmetic between integers and non-integers. If instead of 1.0 / ((i+1.0)*(i+1.0)); you do 1 / ((i+1.0)*(i+1.0)); or 1.0 / (i+1.0)^2; crazy stuff happens. ^ is a bitwise operator meaning "XOR", i.e. X^Y = 1 if X=1 or Y=1 but not both. 6. dnorm. You can access C versions of many basic R functions, including for instance dnorm(), rnorm(), etc. A nice reference is http://www.biostat.jhsph.edu/~bcaffo/statcomp/files/cprog3_ho.pdf . The syntax in C of dnorm is double dnorm(double x, double mu, double sigma, int give_log) in mydn.c, #include #include void norm2 (int *n, double *upper, double *bw, double *y){ int i; double x, inc; x = -1.0 * *upper; inc = 2.0 * *upper / *n; for(i = 0; i < *n; i++) { y[i] = dnorm(x / *bw, 0,1,0); x += inc; } } In R, system("R CMD SHLIB mydn.c") dyn.load("mydn.so") norm3 = function(n, u, b){ d = .C("norm2", as.integer(n), as.double(u), as.double(b), y = double(n)) d$y } b = .100 n = 100 u = 5*b a = norm3(n,u,b) title2 = paste("normal density with sd ", as.character(b)) plot(seq(-u,u,length=n), a, type="l", main=title2,xlab="x", ylab="f(x)") 7. Sum of squared differences between observations. In sumsq.c, #include #include void ss2 (double *x, int *n, double *y) /* x will be the vector of data of length n, and y will be a vector of squared differences from obs i to the other n-1 observations. */ { int i,j; double a; for(i = 0; i < *n; i++){ a = 0.0; for(j=0; j < *n; j++){ a += pow(x[i] - x[j], 2); } y[i] = a; } } in R, system("R CMD SHLIB sumsq.c") dyn.load("sumsq.so") sum3 = function(data2){ n = length(data2) a = .C("ss2", as.double(data2), as.integer(n), y=double(n)) a$y ## or equivalently a[[3]] } b = c(1,3,4) sum3(b) n = c(100, 1000, 2000, 3000, 5000, 7000, 8000, 10000) t2 = rep(0,8) for(i in 1:8){ b = runif(n[i]) timea = Sys.time() d = sum3(b) timeb = Sys.time() t2[i] = timeb-timea cat(n[i]," ") } plot(n,t2,ylab="time (sec)") *************** I stopped here. ******************* ## Now try the same thing in R, without C. sum4 = function(data2){ n = length(data2) x = rep(0,n) for(i in 1:n){ for(j in 1:n){ x[i] = x[i] + (data2[i] - data2[j])^2 } } x } b = c(1,3,4) sum4(b) n = c(1:8)*100 t3 = rep(0,8) for(i in 1:8){ b = runif(n[i]) timea = Sys.time() d = sum4(b) timeb = Sys.time() t3[i] = timeb-timea cat(n[i]," ") } plot(n,t3,ylab="time (sec)") 8. sum of squared differences from observations for each gridpoint. In sumsgrid.c, #include #include void ssg2 (double *x, int *n, double *g, int *m, double *y) /* x will be the vector of data of length n, g will be the vector of m gridpoints, and y will be a vector of length m, of squared differences from gridpoint i to the n observations. */ { int i,j; double a; for(i = 0; i < *m; i++){ a = 0.0; for(j=0; j < *n; j++){ a += pow(g[i] - x[j], 2); } y[i] = a; } } in R, system("R CMD SHLIB sumsgrid.c") dyn.load("sumsgrid.so") sum4 = function(data2, grid2){ n = length(data2) m = length(grid2) a = .C("ssg2", as.double(data2), as.integer(n), as.double(grid2), as.integer(m), y=double(m)) a$y ## or equivalently a[[5]] } b = c(1,3,4) gr = c(0,1,2,3,4,5) sum4(b,gr) This function ssg2 will be similar to your kernel density code, except that you will also have a bandwidth as an additional input, and you will replace the line a += pow(g[i] - x[j], 2); with the kernel density estimate. Coming up. Hawkes models. Kernel regression. Newton-Raphson optimization. MLE for Hawkes models.