1. Python and MySQL references.
2. Calling R from C.
3. Calling C++ from R.

1. Python and MySQL references.

For Python, a nice introduction is http://www.rexx.com/~dkuhlman/python_101/python_101.html . There are many nearly equivalent online references. Another nice one is http://wiki.python.org/moin/BeginnersGuide .

For MySQL, a good reference is
King, T., Reese, G., Yarger, R., and Williams, H.E. (2002). Managing and Using MySQL: Open Source SQL Databases for Managing Information and Web Sites, 2nd ed. O'Reilly and Associates, Inc., Sebastopol, CA.
Another is Sams Teach Yourself SQL in 10 Minutes, http://www.amazon.com/dp/0672325675 . Websites that you might like are
http://w3schools.com/php/php_mysql_intro.asp
http://www.tizag.com/mysqlTutorial
http://www.keithjbrown.co.uk/vworks/mysql

2. Calling R from C.

Notes on this come from Phil Spector's lecture notes at UC Berkeley.

The composite version of Simpson's rule is a rough way to approximate an integral of some function, f. It goes back to English mathematician Thomas Simpson (1710-1761), who approximated the integral from a to b of f(x) dx by
    [(b-a)/6] [f(a) + 4f((a+b)/2) + f(b)].
This is the approximation you'd get by dividing the interval (a,b) into 6 pieces of equal width (b-a)/6, and using the value of f at the left endpoint, a, for the first piece, at the right endpoint, b, for the last piece, and at the midpoint, (a+b)/2, for the other 4 pieces.

For the composite Simpson's rule, you divide the interval into n subintervals of width h = (b-a)/n, where n is even, and apply Simpson's rule to each consecutive pair of subintervals. This gives
    integral from a to b of f(x) dx ~ h/3 [f(a) + 4f(a+h) + 2f(a+2h) + 4f(a+3h) + 2f(a+4h) + ... + 4f(b-h) + f(b)].
Note that the coefficients 4 and 2 alternate. The error in the composite Simpson's rule is bounded in absolute value by
    h^4 (b-a) max[|f''''(x)|; x in (a,b)] / 180,
where f'''' is the fourth derivative of f.

Suppose we want to write this function f in R, but run the composite Simpson's rule in C, and call the C function from R. The key component is a C function called call_R.
call_R takes as one of its arguments a pointer to a function, which is passed into the C environment as a list. The number and type of the arguments to f are also passed to call_R. As usual when calling C from R, the results of the C function are passed back to R not as a returned value, but through the argument list.

It is typically easiest to divide the C program into three parts.
1) simp is the algorithm that implements the composite Simpson's rule, but it doesn't deal with issues pertaining to the interface between C and R.
2) sfunc is just a shell that uses call_R to have R evaluate the function f.
3) dosimp is just the interface, which is called by R and which calls simp.

In simp.c,

    static char *sfunction;

    double simp(double(*func)(),double start,double stop,long n)
    {
        double mult,x,t,t1,inc;
        long i;
        inc = (stop - start) / (double)n;
        x = start;
        t = func(x);
        mult = 4.0;
        for(i=1; i
    [...]

3. Calling C++ from R.

In mypi.c,

    #include <R.h>
    #include <math.h>
    void pi2 (int *n, double *y){
        int i;
        double x[*n];
        x[0] = 1.0;
        y[0] = sqrt(6.0);
        for(i = 1; i < *n; i++) {
            x[i] = x[i-1] + 1.0 / ((i+1.0)*(i+1.0));
            /* or x[i] = x[i-1] + 1.0 / pow(i+1.0,2.0); */
            y[i] = sqrt(6.0 * x[i]);
        }
    }

In R,

    system("R CMD SHLIB mypi.c")
    dyn.load("mypi.so")
    b = .C("pi2",as.integer(1000000), y = double(1000000))
    b$y[1000000]

I took the same file and called it mypi5.cpp.

    system("R CMD SHLIB mypi5.cpp")
    dyn.load("mypi5.so")
    b = .C("pi2",as.integer(1000000), y = double(1000000))
    b$y[1000000]

So, C code works in C++. Note that a C++ compiler normally mangles function names, so if .C reports that the symbol pi2 is not in the load table, declare the function inside extern "C" { }.

Now let's change some things in mypi5.cpp.

    #include <R.h>
    #include <math.h>
    void pi2 (int *n, double *y){
        // int i; // I removed this.
        double x[*n];
        x[0] = 1.0;
        y[0] = sqrt(6.0);
        int i; // and added it here.
        for(i = 1; i < *n; i++) {
            x[i] = x[i-1] + 1.0 / ((i+1.0)*(i+1.0));
            /* or x[i] = x[i-1] + 1.0 / pow(i+1.0,2.0); */
            y[i] = sqrt(6.0 * x[i]);
        }
    }

    system("R CMD SHLIB mypi5.cpp")
    dyn.load("mypi5.so")
    b = .C("pi2",as.integer(1000000), y = double(1000000))
    b$y[1000000]

If we tried this in C under the older C89 standard, we'd get an error, because C89 requires all declarations to come at the start of a block. C99 and later, like C++, allow a declaration after statements.
Next we will discuss fitting models by maximum likelihood, using optim() in R, and C for the likelihood function.