1. Notes on Projects, and Python in R.
2. Newton-Raphson optimization for the MLE using optim().
3. Building R packages.

1. Final projects.

There have been a couple of changes.

Thu, Nov 30
1. "FROELICH, AUSTIN"
2. "AGRAWAL, VANI"
3. "LIANG, SIYU"
4. "ARJON, ALEJANDRA"
5. "MEHTA, YASHVI MAYUR"
6. "QI, SHIRLEY"
7. "VO, DAVIS MINH"
8. "GOH, HAN GYUL"
9. "TANG, BRADLEY TIANYOU"
10. "CHEN, BAITING"
11. "CHEN, HAOTIAN"
12. "CHEN, GORDON"
13. "ASKARI, MOHAMMAD"

Tue, Dec 5
14. "VIDAL, JAY"
15. "ANTONY, KEVIN J"
16. "CHAE, JE HOON"
17. "ZHANG, YUELONG"
18. "PARK, JENNIFER Y"
19. "JIANG, FLORA"
20. "CHIANG, JENNIFER"
21. "XU, LI"
22. "KANG, ANDREA"
23. "WANG, CHUNHE"
24. "TANG, CHENCHEN"
25. "BATENI, BORNA"
26. "GOELTNER, KARL LIN DE HAN"

Thu, Dec 7
27. "SMITH, DANIEL"
28. "NANDY, SIDDHARTH"
29. "LI, KATHERINE"
30. "XU, JIAMIN"
31. "ILANGO, DHAKSHINA PRIYA RAJESWARI"
32. "CHEN, ALEX LICHY"
33. "TROXELL, DAVID JOSEPH"
34. "KHATIB, RAMZEY"
35. "BOLOURANI, ANAHITA"
36. "DSOUZA, REBECCA AURELIA"
37. "LI, HANNAH"
38. "BHAGWAT, VAIDEHI PRASHANT"
39. "FENG, JUSTIN ADAM"
40. "LIU, SILVIA"
41. "HE, HENGZHI"
42. "Park, Jennifer"

-- Attendance is mandatory on your own day plus at least 1 of the 2 other days.
-- Email me your slides as pdf or ppt the night before your presentation day.

Regarding calling Python in R, there seems to be an R package called "reticulate" to do this. However, I cannot seem to install it on my Mac.

install.packages("reticulate")
library(reticulate)

It seems to depend on RcppTOML, and I can't seem to install RcppTOML; it fails whenever I try. I tried all the suggestions I could find online, but none of them worked for me.

2. Newton-Raphson optimization.

With Newton-Raphson methods, or quasi-Newton methods, you start with an initial guess x_0 and try to find the minimum of a smooth function f(x), moving from x_0 to x_1, then x_2, and so on. If f is smooth, then at the minimum x, f'(x) = 0. The Taylor expansion of f around a point x is

f(x + delta) = f(x) + f'(x) delta + f''(x) delta^2 / 2! + ... .

It is common to approximate this using just the terms up through the quadratic,

f(x + delta) ~ f(x) + f'(x) delta + f''(x) delta^2 / 2.

Setting the derivative of this approximation with respect to delta equal to 0 gives f'(x) + f''(x) delta ~ 0, and therefore delta ~ -f'(x) / f''(x). So, what quasi-Newton methods do is start with some guess x_0, and then for i = 0, 1, 2, ..., let

x_{i+1} = x_i + delta = x_i - f'(x_i) / f''(x_i).

This extends readily to higher-dimensional functions, with the gradient in place of f' and the Hessian matrix in place of f''. The function optim() in R already does fast quasi-Newton optimization, because it calls compiled (Fortran/C) code.
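To make the update rule above concrete, here is a minimal sketch of the one-dimensional Newton-Raphson iteration in R, using simple finite differences for the derivatives. The function name newton1d and the arguments h, tol, and maxit are made up for illustration; optim() does something more sophisticated, and in practice you would just use optim().

## Minimal sketch of 1-D Newton-Raphson with numerical derivatives.
## newton1d, h, tol, and maxit are made-up names, for illustration only.
newton1d = function(f, x0, h = 1e-5, tol = 1e-8, maxit = 100){
    x = x0
    for(i in 1:maxit){
        f1 = (f(x + h) - f(x - h)) / (2 * h)          ## approximate f'(x)
        f2 = (f(x + h) - 2 * f(x) + f(x - h)) / h^2   ## approximate f''(x)
        delta = -f1 / f2
        x = x + delta
        if(abs(delta) < tol) break                    ## stop once the steps are tiny
    }
    x
}
newton1d(function(x) (x - 2.1)^2, x0 = 10)   ## should return roughly 2.1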
So instead of coding the Newton-Raphson iterations yourself, what one could do is this.

a) First, write a C function called neglog to compute the negative loglikelihood. This function could take, as inputs, x, y, t, mag, n, xmin, xmax, ymin, ymax, tmin, tmax, mag0, theta, and res, where the first 4 are vectors of length n indicating the longitudes, latitudes, times, and magnitudes of the data, n is an integer indicating the number of observations, the next 6 entries are double-precision numbers indicating the range of the observed space, mag0 is a double-precision number which is the cutoff minimum magnitude of the data, theta = {mu, K, c, p, a, d, q} is a vector of 7 parameters, and res is the resulting negative loglikelihood. Have your function calculate the negative loglikelihood; it is probably easiest to approximate the integral term numerically, using some large number of space-time locations around each point.

Note that the integral term, ∫ lambda(t,x,y) dt dx dy, equals mu [tmax-tmin] [xmax-xmin] [ymax-ymin] + ∫ (Σ g) dt dx dy, and ∫ (Σ g) dt dx dy = Σ ∫ g dt dx dy, so you can just sum, over all n points, ∫ g dt dx dy. The integral ∫ lambda(t,x,y) dt dx dy should be approximately n when theta gets close to the true parameter vector, because E[∫ lambda(t,x,y) dt dx dy] is the expected number of points in the space. So, it's a good idea to include in the C function a line like

Rprintf("The integral is %f .\n", int2);

or replace int2 with whatever you are calling your approximation of the integral term.

Also, note that the parameters in theta must be non-negative, and negative values can do weird things to your negative loglikelihood function. So, your code should start with a bunch of if() statements, so that if any of the parameters is negative you simply return some large positive number, like 99999, for the negative loglikelihood.

Then create a function in R, called neglog2, that simply calls neglog in C. Ideally, neglog2 should not take very long to run.

b) To run optim() in R to minimize neglog2, you need an initial guess at theta. What I'd do is simply place a large number of points theta = {mu, K, c, p, a, d, q}, maybe 100,000 if that doesn't take too long, at random, uniformly, in [0,100] x [0,100] x [0,10] x [0,10] x [0,100] x [0,100] x [0,10]. Then evaluate neglog2 at each of these thetas, and store the value of theta that minimizes neglog2. Call this value pstart. (A rough sketch of neglog2 and this random search is given at the end of these notes.)

c) Then take a point process dataset and, using the initial guess pstart from b) and the function neglog2 from a), run optim() and record the final parameter estimates. You can also get approximate standard errors from optim(). For instance, if you do

fit2 = optim(pstart, neglog2, control=list(maxit=200), hessian=T)
pend = fit2$par ## these are the parameter estimates.
b3 = sqrt(diag(solve(fit2$hess))) ## these are the standard errors.

optim() can be quite slow when each evaluation of f is slow. Let's try out optim() a bit more.

f = function(theta){
    (theta[1]-2.1)^2 + (2.0+(theta[2]-2.2)^2)*(3.0+(theta[3]-2.3)^2) +
    (theta[4]-2.4)^2 + (theta[5]-2.5)^2 +
    (3.0+(theta[6]-2.6)^4)*(4.0+(theta[7]-2.7)^4) + (theta[8]-2.8)^2
}
guess1 = rep(10,8)
timea = Sys.time()
g = optim(guess1,f,method="BFGS",control=list(maxit=200),hessian=T)
Sys.time() - timea
g$par
sqrt(diag(solve(g$hess)))

So it's fast, but if f is slow, then optim() slows down a lot.

num = 0
f = function(theta){
    a = runif(100000)
    b = sort(a)      ## just to slow f down
    num <<- num + 1
    cat(num," ")
    (theta[1]-2.1)^2 + (2.0+(theta[2]-2.2)^2)*(3.0+(theta[3]-2.3)^2) +
    (theta[4]-2.4)^2 + (theta[5]-2.5)^2 +
    (3.0+(theta[6]-2.6)^4)*(4.0+(theta[7]-2.7)^4) + (theta[8]-2.8)^2
}
timea = Sys.time()
f(guess1)
Sys.time() - timea

timea = Sys.time()
g = optim(guess1,f,method="BFGS",control=list(maxit=200),hessian=T)
Sys.time() - timea
g$par
g$counts

Notice the large number of times f is called. optim() calls f not only at every iteration, but also to approximate the derivatives and second derivatives numerically, so it ends up calling f roughly 10 times per iteration.

3. Building R packages.

See separate pdf.
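Finally, here is the rough sketch of neglog2 and the random-start search mentioned in part b) above. This is only a sketch under several assumptions: it assumes the compiled C code has already been loaded with dyn.load(), that the data vectors x, y, t, and mag and the scalars n, xmin, xmax, ymin, ymax, tmin, tmax, and mag0 already exist in your R workspace, and that the arguments of neglog are in exactly the order described in part a). The file name "neglog.so" and the variable names m, thetas, and vals are made up for illustration.

## neglog2: R wrapper that simply calls the C function neglog via .C().
## Assumes something like dyn.load("neglog.so") has already been run, and that
## x, y, t, mag, n, xmin, xmax, ymin, ymax, tmin, tmax, and mag0 are in the workspace.
neglog2 = function(theta){
    .C("neglog", as.double(x), as.double(y), as.double(t), as.double(mag),
       as.integer(n), as.double(xmin), as.double(xmax), as.double(ymin),
       as.double(ymax), as.double(tmin), as.double(tmax), as.double(mag0),
       as.double(theta), res = as.double(0))$res
}

## b) random search for a starting value pstart.
m = 100000    ## use fewer points if neglog2 is slow.
thetas = cbind(runif(m,0,100), runif(m,0,100), runif(m,0,10), runif(m,0,10),
               runif(m,0,100), runif(m,0,100), runif(m,0,10))
vals = apply(thetas, 1, neglog2)
pstart = thetas[which.min(vals),]
## c) then run optim(pstart, neglog2, ...) as in part c) above.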