1. Hw and final projects. 2. 2-d kernel smoothing. 3. Simulating from a density by rejection sampling. 4. Maps. 5. The rest of R Cookbook ch. 10. 1. Hw2 and final projects. -- Hw2 is on the course website http://www.stat.ucla.edu/~frederic/202a/F11 . Hw2 is due Tue, Oct 25, 1030am. Late homeworks will not be accepted! I will probably assign hw3 next class. -- For final projects, I will randomly assign people to groups next class. If you want Option B, you MUST email frederic@stat.ucla.edu before 8pm tomorrow, Oct 19, If I do not hear from you by 8pm on Oct 19, I will assume you will do Option A. See 202alec6.txt for the options. -- Read ch11, ch13.1 and ch13.2 in R Cookbook for next class. After that, we will switch to C. 2. 2-d kernel smoothing, using kernel2d() in splancs. install.packages("splancs") library(splancs) ## First, input 23 points using the mouse. n = 23 plot(c(0,1),c(0,1),type="n",xlab="longitude",ylab="latitude",main="locations") x1 = rep(0,n) y1 = rep(0,n) for(i in 1:n){ z1 = locator(1) x1[i] = z1$x y1[i] = z1$y points(x1[i],y1[i]) } ## 2-d kernel smoothing bdw = sqrt(bw.nrd0(x1)^2+bw.nrd0(y1)^2) ## default bandwidth b1 = as.points(x1,y1) bdry = matrix(c(0,0,1,0,1,1,0,1,0,0),ncol=2,byrow=T) z = kernel2d(b1,bdry,bdw) par(mfrow=c(1,2)) image(z,col=gray((64:20)/64),xlab="km E of origin",ylab="km N of origin") points(b1) x4 = (0:100)/100*(max(z$z)-min(z$z))+min(z$z) plot(c(0,10),c(.8*min(x4),1.2*max(x4)),type="n",axes=F,xlab="",ylab="") image(c(-1:1),x4,matrix(rep(x4,2),ncol=101,byrow=T),add=T,col=gray((64:20)/64)) text(2,min(x4),as.character(signif(min(x4),2)),cex=1) text(2,(max(x4)+min(x4))/2,as.character(signif((max(x4)+min(x4))/2,2)),cex=1) text(2,max(x4),as.character(signif(max(x4),2)),cex=1) mtext(s=3,l=-3,at=1,"density (pts/km^2)") 3. Simulating from a density by rejection sampling. To do simulation by rejection sampling, you need the density f(x) you're interested in to have the property f(x) < b g(x), where b is some constant and g(x) is some density that's easy to sample from. For instance, if f(x) is known to have some known maximum c, and finite support of length d, then you can use g(x) is the uniform distribution, and let b = cd. Basically, the way it works is you simulate x0 from g(x), keep it with probability f(x0)/[cg(x0)] and delete it otherwise, and repeat. First, for simplicity, suppose the density has a nice closed form, like the triangular density on x in (0,2), f(x) = 1 - |x-1|, Here c = 1, and the support is (0,2), so d = 2, and b = cd = 2. ## Plot the density x = seq(-1,3,length=101) y = rep(0,101) y[(x>0) & (x<2)] = 1 - abs(x[(x>0) & (x<2)] - 1) plot(x,y,type="l",ylab="f(x)",lty=2) To simulate from this density using rejection sampling a) Simulate a draw x0 from g(x) = uniform on (0,2) = 1/2 on (0,2). b) Keep x0 with probability f(x0) / [bg(x0)] . c) Repeat. ## 10000 draws from the triangular f(x) b = 2; g = 0.5 n = 10000 x = c() ## these will contain the results. i = 0 while(i < n){ x0 = runif(1)*2 fx0 = 0.5 * (1 - abs(x0 - 1)) if(runif(1) < fx0 / (b*g)){ ## keep x0 i = i+1 x[i] = x0 if(i/100 == floor(i/100)) cat(i, " ") } } hist(x,nclass=40,prob=T,add=T) What if you want to sample from a more complicated density f(x), such as a kernel smoothing? Again, let b = the maximum of f(x), which you can find by looking at a plot of f(x). The support might not be finite, technically, but you can often find a finite region that effectively contains almost all the support of f(x). Let's use the example from last lecture. x1 = c(1,2.8,3) grid1 = seq(-2,6,length=101) bw1 = bw.nrd(x1) y = density2(x1,grid1,bw1) plot(grid1,y,type="l",xlab="x",ylab="density") Suppose we want to simulate 30,000 draws from this density f(x). The density clearly does not go higher than 0.5 anywhere, and (-2,6) is plenty wide to contain almost all the support of f(x). So, c = 0.5, d = 8, b = cd = 4. The only hard part is that f(x0) now needs to be computed for each x0! b = 4; g = 1/8 n = 30000 x = c() ## these will contain the results. i = 0 while(i < n){ x0 = runif(1)*8 - 2 ## this simulates the uniform g(x) on (-2,6) fx0 = density2(x1,x0,bw1) ## this computes f(x0) if(runif(1) < fx0 / (b*g)){ ## keep x0 i = i+1 x[i] = x0 if(i/100 == floor(i/100)) cat(i, " ") } } hist(x,nclass=40,prob=T,add=T,density=10) 4. Maps. Map of California counties. install.packages("maps") library(maps) map('county', 'California') plot(seq(-118,-116,length=10),seq(33,35,length=10)) map('county', 'California', add=T) ## Hard to make it look good. 5. p252 on. p253, qqnorm(), qqline() x = rt(100,df=10) qqnorm(x) qqline(x) Other quantile plots, pp 254-255. p260, par(mfrow=c(2,2)) we've seen. That's important sometimes. Watch out when changing par(), p265. Read ch. 11. Note that there are functions to calculate the "influence" of each point in regression, p297. You won't use this for hw2.