1. HW3 is on the course website. It is problems 4.6, 4.9, and 4.19.

2. Project order for Wed Jun 7 in class.

    x = c("Tim","Nishath","Yoonho","Andrew","Angel", "Danny")
    sample(x)        # prints one random ordering
    n = length(x)
    y = sample(x)    # the random ordering used for the list below
    for(i in 1:n){
        cat(i,y[i],".\n")
    }

    1 Tim .
    2 Angel .
    3 Danny .
    4 Andrew .
    5 Nishath .
    6 Yoonho .

3. Note about projects.

For your projects, you will find some time series data of your own, and will analyze the data using the methods we have talked about in class, including the methods used for the homeworks and also other methods we have discussed in class. You will write up your analysis in a written report, and will also make an oral presentation. The oral presentations will be 10-15 minutes each in total. You will receive one overall grade based on the oral report and written report, combined.

Your dataset, which you will find yourselves, on the web, can be anything you choose, but it should be:
a) univariate time series data. You should have at the very least n=100 observations.
b) something of genuine interest to you, and where you have more knowledge than an average person.

Analyze the data using the methods we have talked about in class.
a. Show the data. Separate it into training and testing.
b. Estimate the sample acf and pacf of the raw data. Include confidence bands.
c. Estimate the spectrum of the raw data using the periodogram and either smoothed periodogram or AR spectrum. Comment on any clear cycles and the overall distribution of the variance by frequency, and its relationship to the smoothness of your time series.
d. Remove the trend, either by fitting a line, or some other curve, or doing kernel smoothing, or splines. Comment on the fitted trend.
e. Remove obvious cycles. For instance, if you have daily data and want to remove a weekly cycle, calculate the mean on Mon, the mean on Tue, etc., and remove these means from each corresponding datapoint. Plot the trend and cycles you are removing, and show the residuals after removing them.
f. Estimate the sample acf and pacf of the residuals.
g. Fit ARMAs to the residuals. Select one. Comment on the fitted parameters.
h. Estimate the spectrum of the residuals of your ARMA model.
i. Use your ARMA model to make forecasts on the testing portion, and evaluate the performance, especially via the root mean squared error.
j. Make forecasts going beyond your testing data, and speculate about them.

Your final project should be submitted to me in pdf by Jun 14, 11:59pm, by email to frederic@stat.ucla.edu. All written reports are due on the same date, regardless of when your oral presentation is.

Final project and oral report tips.

For the oral reports:
Rule 1: Do not look at me when you are talking.
Rule 2: 10-15 minutes per oral report, plus questions at the end. I will cut you off if you go over 15 min. You can have someone in the audience help you with the time.
Rule 3: Everyone must be respectful and quiet during other people's talks. You can ask clarifying questions but keep deep questions until the end.
Rule 4: Send me a pdf version of your slides by 6pm the night before your talk, to frederic@stat.ucla.edu. That way, I can set up the talks in order ahead of time and we won't have to waste time in class waiting for each person to connect their laptop to the projector. About 8 slides seems right, though it's fine with me if you want fewer or more.
Rule 5: Speak very slowly in the beginning. Give us a sense of your data. Assume that the listener knows what the statistical methods you are using are, but knows nothing about the subject matter. Tell us what the methods say about your data. Emphasize the results more than the methods. Make sure to go slowly and clearly in the start so that the listener really understands what your data are.
Rule 6: Speculate and generalize but use careful language. Say "It seems" or "appears" rather than "is" when it comes to speculative statements or models.
For example, you might say "The data appear to be uncorrelated" or "an AR model seems to fit well" but not "The data are white noise" or "The data come from an AR model".
Rule 7: Start with an introduction explaining what your data are, how you got them, and why they are interesting (roughly 2-3 minutes), then show your results as clearly as possible, with figures preferred (roughly 8 minutes), and then conclude (roughly 2 minutes). In your conclusion, feel free to mention the limitations of your analysis and speculate about what might make a future analysis better, if you had infinite time. This might include collecting more data, or getting data on more variables, as well as more sophisticated statistical methods.

For your written reports, rules 5-7 apply again. The text of your report should be around 4-5 pages. Have just the text in the beginning, and then the figures afterwards, and include your code at the end. Save it as a pdf and email your pdf document to me, at frederic@stat.ucla.edu.

4. Note about causal and invertible ARMAs.

Here are a couple of examples.

X_t − X_{t−1} = W_t − 1/2 W_{t−1} − 1/2 W_{t−2}. Identify the ARMA(p,q) model, and say if it is causal or invertible.

The AR polynomial is φ(z) = 1 − z, which has root 1. The MA polynomial is θ(z) = 1 − z/2 − z^2/2, so a = −1/2, b = −1/2, c = 1, and the roots are
[−b ± √(b^2 − 4ac)]/(2a) = [1/2 ± √(1/4 + 2)]/(−1) = −1/2 ± 3/2, or −2 and 1.
Since these polynomials share the common root 1, they have the common factor 1 − z; indeed θ(z) = (1 − z)(1 + z/2). Factoring it out, the reduced representation has AR polynomial φ(z) = 1, which has no roots, so it is causal, and MA polynomial θ(z) = 1 + z/2, which has root −2, outside the unit circle, so it is invertible. This is a causal and invertible ARMA(0, 1) process. The reduced process is X_t = W_t + 1/2 W_{t−1}. As a check,
X_t − X_{t−1} = W_t + 1/2 W_{t−1} − W_{t−1} − 1/2 W_{t−2} = W_t − 1/2 W_{t−1} − 1/2 W_{t−2}.

X_t − 2X_{t−1} + 2X_{t−2} = W_t − 8/9 W_{t−1}.

The AR polynomial is φ(z) = 1 − 2z + 2z^2, which has roots [2 ± √(4 − 8)]/4, or 1/2 ± i/2.
These roots are inside the unit circle because their squared modulus is (1/2)^2 + (1/2)^2 = 1/2 < 1. The MA polynomial is θ(z) = 1 − 8z/9, which has root 9/8, outside the unit circle. So this is an ARMA(2, 1) process which is invertible but not causal, because the AR polynomial has roots inside the unit circle.
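If you want to check root calculations like the two above numerically, here is a minimal sketch in R. It uses base R's polyroot(), which takes the coefficients in increasing powers of z. The helper all_outside() and its tolerance are just illustrative names and choices made here, not something from the text.

```r
# Coefficients are listed in increasing powers of z, as polyroot() expects.

# First example: phi(z) = 1 - z, theta(z) = 1 - z/2 - z^2/2.
ar1 <- polyroot(c(1, -1))
ma1 <- polyroot(c(1, -1/2, -1/2))

# Second example: phi(z) = 1 - 2z + 2z^2, theta(z) = 1 - 8z/9.
ar2 <- polyroot(c(1, -2, 2))
ma2 <- polyroot(c(1, -8/9))

# Illustrative helper (hypothetical name): TRUE if every root lies strictly
# outside the unit circle. The small tolerance keeps a root sitting on the
# circle from passing because of rounding error.
all_outside <- function(roots, tol = 1e-8) all(Mod(roots) > 1 + tol)

# First example, unreduced: the AR root 1 is on the unit circle, and the
# MA polynomial has the same root, which signals the common factor 1 - z.
print(ar1)
print(ma1)

# Check the factorization theta(z) = (1 - z)(1 + z/2) at a few points.
z <- c(-1, 0.3, 2)
factor_ok <- isTRUE(all.equal((1 - z) * (1 + z/2), 1 - z/2 - z^2/2))

# First example, reduced: phi(z) = 1 has no roots, theta(z) = 1 + z/2.
ar1_reduced <- complex(0)          # empty: the constant polynomial has no roots
ma1_reduced <- polyroot(c(1, 1/2))
cat("Example 1 reduced: causal =", all_outside(ar1_reduced),
    ", invertible =", all_outside(ma1_reduced), "\n")

# Second example: AR roots have modulus sqrt(1/2), about 0.707.
print(Mod(ar2))
cat("Example 2: causal =", all_outside(ar2),
    ", invertible =", all_outside(ma2), "\n")
```

Note that all() of an empty logical vector is TRUE in R, which is why the reduced first example, with no AR roots at all, comes out causal.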