{\rtf1\mac\ansicpg10000\cocoartf102 {\fonttbl\f0\fswiss\fcharset77 Helvetica;} {\colortbl;\red255\green255\blue255;} \margl1440\margr1440\vieww9000\viewh8400\viewkind0 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural \f0\fs24 \cf0 1. The midterm is Wednesday, May 12, 3:00pm to 3:50pm, in Boelter 2444.\ 2. HW due Fri May 7 BY EMAIL to the TA, seanwang@ucla.edu, is problem 4, parts a and b, from the handout.\ 3. GLM.\ \ Nelder and Wedderburn (1972) observed that many common models fit into one general form.\ Exponential family. f(y | theta, phi) = exp[(y theta - b(theta))/a(phi) + c(y, phi)], where theta is one of the parameters (the center or mean, or a transformation of it), and phi is the other parameter (the dispersion or variance, or a transformation of it).\ \ Examples. Normal. f(y) = 1/sqrt(2 pi sigma^2) exp[-(y - mu)^2/(2 sigma^2)]. The two parameters are mu and sigma. mu is the mean, so theta is going to be mu or a transformation of mu, like log(mu) or mu^2 or something. sigma is the sd, so phi is going to be some function of sigma. Expanding the square, f(y) = exp[log(1/sqrt(2 pi sigma^2)) - y^2/(2 sigma^2) + 2 y mu/(2 sigma^2) - mu^2/(2 sigma^2)] = exp[-(1/2) log(2 pi sigma^2) - y^2/(2 sigma^2) + y mu/sigma^2 - mu^2/(2 sigma^2)]. Let theta = mu. Let phi = sigma. Let a(phi) = sigma^2 = phi^2. Now we have the exp[y theta/a(phi)] term. b(theta) cannot depend on y or sigma: b(theta) = mu^2/2 = theta^2/2. c(y, phi) = -(1/2) log(2 pi sigma^2) - y^2/(2 sigma^2) = -(1/2) log(2 pi phi^2) - y^2/(2 phi^2). You could also take phi = sigma^2, as the Faraway book does on p.115.\ \ Poisson. f(y) = e^(-mu) mu^y/y! = exp(-mu + log(mu^y) - log(y!)) = exp(-mu + y log(mu) - log(y!)). Note the y log(mu) term. Let theta = log(mu). Let a(phi) = 1. Then we have f(y) = exp[(y theta - b(theta))/a(phi) + c(y, phi)], where b(theta) = mu = exp(theta) and c(y, phi) = -log(y!). So it works out without phi. This is because the mean, mu, alone specifies the Poisson distribution. We can just let phi = 1.\ \ Uniform? 
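The Normal and Poisson reductions above can be checked numerically: evaluate the ordinary density and the exponential-family form exp[(y theta - b(theta))/a(phi) + c(y, phi)] at the same point and confirm they agree. A minimal sketch in Python (the helper names here are illustrative, not from the notes):

```python
import math

# Exponential-family density: f(y) = exp[(y*theta - b(theta))/a(phi) + c(y, phi)]
def expfam(y, theta, b, a_phi, c):
    return math.exp((y * theta - b(theta)) / a_phi + c)

# Normal: theta = mu, a(phi) = sigma^2, b(theta) = theta^2/2,
# c(y, phi) = -(1/2) log(2 pi sigma^2) - y^2/(2 sigma^2)
def normal_check(y, mu, sigma):
    direct = math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)
    c = -0.5 * math.log(2 * math.pi * sigma ** 2) - y ** 2 / (2 * sigma ** 2)
    via_expfam = expfam(y, mu, lambda t: t ** 2 / 2, sigma ** 2, c)
    return direct, via_expfam

# Poisson: theta = log(mu), a(phi) = 1, b(theta) = exp(theta), c(y, phi) = -log(y!)
def poisson_check(y, mu):
    direct = math.exp(-mu) * mu ** y / math.factorial(y)
    via_expfam = expfam(y, math.log(mu), math.exp, 1.0, -math.log(math.factorial(y)))
    return direct, via_expfam
```

For any test values (say normal_check(1.3, 2.0, 0.7) or poisson_check(3, 2.5)), the two returned numbers match up to floating-point rounding, confirming the algebra in each reduction.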
Let theta = center and phi = spread to one side: f(y) = 1/(2 phi) 1(|y - theta| <= phi). The indicator cannot be absorbed into exp[(y theta - b(theta))/a(phi) + c(y, phi)], so there is no way: the uniform is not in the exponential family.\ \ Try it yourself for the binomial or Gamma distribution. We will discuss these next class.\ \ For a member of the exponential family, EY = mu = b'(theta), and V(Y) = b''(theta) a(phi). b'' is called the variance function.\ The link function g satisfies eta = g(mu), where eta = x^T beta is the linear predictor.\ The canonical link is the g such that eta = g(mu) = theta. That is, g(b'(theta)) = theta. For the normal, for instance, b(theta) = theta^2/2, so b'(theta) = theta. So g is the identity.\ For the Poisson, b(theta) = exp(theta), so b'(theta) = exp(theta), so if g(b'(theta)) = theta, then g is the log function. So the canonical link is the log link.\ pp.120-121 of Faraway give general formulas for deviance for GLMs, and residuals are described on p.123. Pearson residuals are r_p = (y - mu^)/sqrt(V(mu^)), where V(mu) = b''(theta). The sum of the r_p^2 is the Pearson chi^2 statistic, which is usually asymptotically chi^2-distributed with n - p degrees of freedom as n -> infinity.\ On p.126, Faraway suggests plotting X = eta^ against Y = Pearson or deviance residuals. You should look for nonlinearity and heteroskedasticity. You can also look at partial residuals (see pp.128-129), and look for outliers (see p.129).\ }