(These are real questions asked by other students just like you
--only the names have been changed to protect the innocent
--look here for answers to common questions.)
Student's Question | Professor's Response |
Prof. Cochran, I am trying to avoid having to buy the book and wanted to use the old 1991 edition. What are your thoughts on that? Are the problems in the 1991 version about the same as the 1998 version? How about the overheads book? A Student |
Dear Student, My thought is that you won't know what homework problems to do. The books are very similar. But I haven't checked the two to see where they are the same or different. The homework sections are different. You are responsible for the 3rd edition material. The lectures are not exactly the same for both quarters--it's up to you if saving $7 is worth the hassle. Dr. C. |
Dear Professor, On the homework questions, do we do the Exercise Set questions or Review Questions at the end of each chapter? Please reply to me, or tell us in class tomorrow. Thanks for your help. |
Dear Student, That information is in the syllabus. Dr. C. |
Professor Cochran, Today in lecture you posed the question, "If you went to Murphy Hall to file a petition would it be better to do it at lunch time or at 9:00 am?" I was wondering if this was the proper way to phrase the question because it is very hard to define "better". For example, a particular student may think that getting an hour of extra sleep or taking their time waking up has more value than waiting in line an additional hour during the day. For this person, wouldn't it be better to go during lunch? Isn't this the reason why there is a difference in the length of the line during the day? People are not willing to get up early and get in line. This makes the line longer when they feel that waiting in line has more value than the next best thing they could do with their time. Thank you for your time, |
Dear Student, Yes, you have the point...we have to operationalize the word "better"...automatically most people think better means shorter lines, a point I intended to make in lecture but didn't quite do. So here you have defined other ways of operationalizing 'better' that are also just as valid. This is always an important issue in designing instruments. We have to translate our language into precise concepts. Dr. C. |
Page 41, SET C, # 1: Book answer of "15%" appears to be wrong: because: Areas shown in histogram are: (10 x 1) + (20 x 1) + (5 x 1) = 35 total area we know is 100% area in class interval of $200-$500 = 100-35=65 height = 65/3 = 21.67 Am I right? please reply. |
No, unfortunately you are not right: 1*10 + 1*20 + 5*5 = 55, so the remaining area is 45, and 45 divided by 3 is 15. The block of height 5 is 5 blocks wide...take another look at it.... |
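The area bookkeeping in that reply can be checked with a short sketch. The block widths and heights below are read off the professor's arithmetic (1*10 + 1*20 + 5*5), not off the book's figure, and the width of 3 for the $200-$500 interval assumes the horizontal axis is in $100 units; treat both as assumptions.

```python
# Each known block is (width, height). Values are inferred from the reply
# above, not copied from the book's histogram.
known_blocks = [(1, 10), (1, 20), (5, 5)]

# Area of a histogram block = width * height, in percent.
known_area = sum(width * height for width, height in known_blocks)

# The total area under a histogram is 100%.
remaining_area = 100 - known_area

# Assumed width of the $200-$500 interval, in $100 units.
missing_width = 3
missing_height = remaining_area / missing_width

print(known_area, remaining_area, missing_height)
```

The student's version counted the wide block as 5 x 1 instead of 5 x 5, which is where the two answers part ways.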
Professor Cochran, Hi! Is the r.m.s. value the same as the standard deviation value? If not, what does the r.m.s. value tell and what is it used for? Thanks. |
Ms. Student, r.m.s. is a mathematical technique (like calculating a mean, which is: sum the elements, whatever they are, and divide by the number of elements). r.m.s. is: square the elements, sum the squares, divide by the number of elements, and take the square root. When the elements are the deviations from the mean, then you are calculating a standard deviation. The technique of root mean square is used to calculate many things; in this class we will use it for s.d. and also for residuals in regression. Dr. C. |
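A minimal sketch of that recipe, assuming a small made-up list of numbers. The function names `rms` and `sd` are illustrative only, and the SD here divides by the number of elements, as the reply describes.

```python
import math

def rms(values):
    """Root mean square: square the elements, average the squares, take the root."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def sd(values):
    """Standard deviation as the r.m.s. of the deviations from the mean."""
    mean = sum(values) / len(values)
    deviations = [v - mean for v in values]
    return rms(deviations)

data = [1, 3, 4, 5, 7]  # hypothetical list with mean 4

print(rms(data))  # r.m.s. of the raw values
print(sd(data))   # r.m.s. of the deviations from the mean
```

The two numbers differ because the first measures the size of the values themselves and the second measures the size of their distances from the mean.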
Hi Professor, I have a question on #12 of chapter 2. I don't know exactly how to explain the question they are asking for. Is it that these two findings should use different wordings when asking their question? Maybe it's not just exercising that causes spontaneous birth, but it's "rigorous" exercising that associates with spontaneous birth? I am not sure what they're looking for in this question, so could you please clear up this question for me. Thanks, |
Mr. , Think about it this way...In an observational study, people assign themselves to conditions. So it's the women who choose to exercise or not, right? So maybe 1> healthier women in general are more likely to exercise (and they would have a lower rate of spontaneous abortions anyway). Or, maybe 2> exercise acutely is associated with spontaneous abortion, but experienced, chronic exercisers (regular exercisers) don't have the same physical response to exercise as those who rarely exercise. So I can see at least two reasons why there might be an observed relationship. There might be others too. That's one of the limitations of an observational study. What the author is looking for here is for you to start to develop thoughtfulness about findings you read: the ability to generate hypotheses to explain observed associations between variables. Dr. C. |
Dear Dr. Cochran, I had a few last minute questions. 1. Is it imperative that I know how to do S.D. on my calculator? I have not been able to figure it out on a TI-82. Would you know how? Can I get along for the midterm without knowing how to do this? 2. What is the difference between a z-distribution and a normal curve? I thought they were the same but why are they listed as two different things on the list of things to study? 3. What is the difference between spread and distribution? I would appreciate it if you would answer my questions as soon as possible. Thank you. |
1. You need to be able to calculate a SD...people did this long before calculators were made that did it automatically... 2. z is distributed as a normal curve, with a mean of 0 and a SD of 1...the normal curve is a probability density function...the distinctions between the two await you in a more advanced course...for now you can think of them as the same--the list of topics is conceptual, as in know what the normal distribution is and know what z is and how we use it. 3. distribution is the list of elements in our set (whether that is a population or a sample)...spread is the variability of these elements from each other |
Prof. Cochran, I'm having trouble understanding the SD and calculating it. I don't understand how, where, and when it is used. I've been trying to do the sample problems in the book but when I check my answers they are wrong. What is the difference between a mean and median? Thank You, |
Ms. Student, Sounds like you are lost and in need of more than an email. Try to get in to an office hour on Monday, or go to any discussion group. Backtrack through the questions you have tried. Take an example in the book and go backwards from the answer to see the calculation. As for knowing when an SD is used...it's a statement about the average spread away from the mean in a distribution. So, if you have an SD of 2 then you can say that about 68% of values lie within plus or minus 2 of the mean. If the mean is 8, then about 2/3's (68% more or less) of the values will lie between 6 and 10. How can I say 68%? I got that from the normal distribution (bell shaped curve). A mean is the sum of all the values divided by the number of values that contributed to the sum. A median is the value below which 50% of the values in the sample lie. But, come to office hours. Work with either a TA or myself to explain how you see it so that we can figure out where you are taking the wrong turn. Ok? Dr. C. |
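The 68% figure and the mean/median distinction can both be checked with Python's standard `statistics` module. This is only a sketch: the mean of 8 and SD of 2 come from the reply above, while the short list of values is made up for illustration.

```python
from statistics import NormalDist, mean, median

# Normal curve with the mean and SD used in the reply above.
dist = NormalDist(mu=8, sigma=2)

# Share of the distribution between 6 and 10 (one SD on either side of 8).
share_within_one_sd = dist.cdf(10) - dist.cdf(6)
print(round(share_within_one_sd, 3))

# Hypothetical data: one large value pulls the mean up but not the median.
values = [2, 4, 6, 8, 20]
data_mean = mean(values)      # sum of the values divided by how many there are
data_median = median(values)  # half the values lie below it
print(data_mean, data_median)
```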
Dear Professor Cochran, I'm in your stats 10 class and I'm having trouble getting onto the web page. I've gotten on it before, but it doesn't seem to be connecting. Can you please check it to make sure it's working? Thanks so much. |
It's still there but the
server (the computer the web page is stored on) may have
been down (not working). try again. I don't have control
over whether it works or not. good luck. Dr. C. |
Dear Prof Cochran, When reading through your lecture outlines on the net, I did not understand the following two points you made in lecture #3: 2. When a variable has an underlying continuous distribution, we can step down in the hierarchy and treat the values we measure as discrete, but we can't go the other way. 3. The scaling of any variable can always be rescaled to a lower level in the hierarchy of scaling. That is, ratio scaled variables can be measured intervally, ordinally, or nominally. But we cannot go the other direction. Please explain. Sincerely, |
Mr. , think of this variable: the number of calories you ate for dinner. It is continuous, has a true 0, and so is ratio scaled. We can reasonably calculate the ratio between two values--100 calories is twice (100/50) the number of calories as 50. But maybe we wanted to measure it less precisely--using an interval scale where we assign values to a set of ranges: 1 = 101-300, 2 = 301-500, 3 = 501-700. You ate 420 calories so we code you a "2". The codes (1 2 3) are an interval scale. The intervals are the same size but we can no longer form a ratio that makes sense. Or maybe we wanted to be more crude in our measurement: 1 = a little, 2 = some, 3 = a lot. We can see order here but don't know how wide the intervals are. Now, I went from ratio down the hierarchy--you give it a try going the other way. Let's start with how many miles you traveled last year and start with ordinal scaling: a little....some.....a lot. Let's say you answer "some"--where would you put that on a ratio scale (number of miles)? See, if I knew your miles I could recode it less precisely, but I just can't go the other way because the information is not there. see ya in class Dr. C. |
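The one-way street in that reply can be sketched in a few lines. The function name `to_range_code` is made up for illustration; the ranges are the ones in the professor's example.

```python
def to_range_code(calories):
    """Recode a calorie count into the range codes 1, 2, 3 from the example."""
    if 101 <= calories <= 300:
        return 1
    if 301 <= calories <= 500:
        return 2
    if 501 <= calories <= 700:
        return 3
    return None  # outside the ranges shown in the example

# Going down the hierarchy is easy: 420 calories becomes code 2.
code_for_420 = to_range_code(420)
print(code_for_420)

# Going back up is not possible: very different dinners share one code,
# so the code alone cannot tell you the calories.
print(to_range_code(310), to_range_code(499))
```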
Hello Professor, This is a student from your stats 10 course. I was the one who said that you couldn't find your intelligence from the SAT because it was impossible to find the SAT Standard Deviation. I still don't understand-how can you find the SD? In order to do so, you would need to know everyone's scores. Will the College Board release anyone's score upon request? I don't believe so...So wouldn't it be impossible to find the SD unless you knew others' scores? |
Mr. , Well...generally you don't 'find' an sd in a known distribution...it's given (if you have a problem to solve or if you are doing research)...like we know that IQ has a mean of 100 and an sd of 16. SATs originally (it's not true now) had a mean of 500 and an sd of 100. Scores can range from 200-800 (that's plus or minus 3 sd on either side, which as you can see in your table includes 99.7% of all possible scores theoretically). If we take those numbers as real, let's say you got a 600 on your SAT. That's roughly one SD up. So if we translate to IQ then your IQ would be 100+16 or 116. Except that for reasons we said in class this is an underestimate (it's biased down) because smarter people tend to take the SAT. You're right: if it's not given then you have to calculate it, and to calculate it you need to know all the scores. But you can often figure it out. The blurb they sent you probably told you what the mean score was (or you can find it in the newspaper sometimes). You have your percentile. z's involve 4 concepts (your score, your percentile which you can translate into a z-value, the mean, and the sd). So since you have 3, that leaves one unknown (the sd) and you can solve for that. Dr. C. |
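Both halves of that reply can be worked through as a sketch. The SAT and IQ means and SDs are the ones quoted above. The percentile of 0.84 in the second half is an assumed value for illustration, not something from the original exchange.

```python
from statistics import NormalDist

sat_mean, sat_sd = 500, 100  # historical values quoted in the reply
iq_mean, iq_sd = 100, 16

# Part 1: translate an SAT score of 600 to the IQ scale via its z-value.
z = (600 - sat_mean) / sat_sd
iq_equivalent = iq_mean + z * iq_sd
print(z, iq_equivalent)

# Part 2: score, percentile, and mean are known; solve z = (X - mean)/SD for SD.
score = 600
percentile = 0.84  # assumed for illustration
z_from_percentile = NormalDist().inv_cdf(percentile)
solved_sd = (score - sat_mean) / z_from_percentile
print(round(solved_sd, 1))
```

With a percentile near 84 the solved SD lands close to 100, which is consistent with a score of 600 being about one SD above the mean.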
dear dr. cochran, i was wondering if there are answer keys to the unassigned review problems because i wanted to do extra problems as practice for the midterm. thank you, |
Ms. , Unfortunately there are not. But if you have trouble doing one and think your answer might be wrong just ask your TA or myself during office hours to take a look at your work. Or another strategy is to ask your TA to go through the problem with the class in discussion. You'll see also that my prior final questions in the overheads book have no correct answers given either--there's actually method to the madness. Sometimes knowing the answer prevents the struggle to figure out the answer. But both the TA's and myself are here to help. Dr. C. |
Dr. Cochran, I was wondering if we had to "memorize" in depth, the long definitions of qualitative variables (nominal, categorical, etc.) and quantitative variables or is basic knowledge of the difference between the two enough? |
Mr. Student, You are responsible for everything that was presented in class and in the book. You don't need to memorize the formulas because those will be given to you on a cheat sheet similar to what is in the overheads book. Are you really asking a professor to tell you whether you have to learn something well or will superficial be enough? Hmmm, let me think about that... Dr. C. |
Dear Professor Cochran: I am having a great deal of difficulty with question #11 in the SR exercises in Chapter 6. I can not figure out how you determined that the 75th percentile is about .67 SDs above average and that the 90th percentile is 1.28 SDs above average. Your help would be greatly appreciated. Thank you. |
Ms. Student, Here's the trick. Think about what you know: the 25th percentile is 62.2 and the 75th percentile is 65.8. Think about what you want to know: what are the inches associated with the 90th percentile? Think about what you need in order to say what you want to know. Well, to match inches to a percentile it would help to know the mean and SD of the distribution, because then you can use the normal table to find the z value of the percentile you want. Now you're ready. What is the mean? Well, if you assume that the distribution is normal, then the mean = median, and the median is the 50th percentile, which is halfway between the 25th and 75th. So the mean would be (65.8 - 62.2)/2 = 1.8 plus the value for the 25th percentile, or 62.2 + 1.8 = 64 inches. What is the SD? This is a little harder to figure, but you now know the mean and two other values where you know the percentiles. Pick one and start: z = (X - mean)/SD. The 25th and 75th percentiles together cut off the middle 50% of the distribution, and in the table a center area of 50% goes with a z of about 0.67. So for the 25th percentile, -.67 = (62.2 - 64.0)/SD (REMEMBER the z for percentiles under 50 is negative!). Solve for this (SD = 1.8/.67, or about 2.7) and you now have both the mean and SD for the distribution. Now, what is the 90th percentile? Take the bell curve. Draw a picture of it. To the left of the mean or median or midpoint is 50%, right? To get to the 90th percentile, you have to go an additional 40% of the distribution to the right. The table in the book can give this to you, but you have to remember that the area described in the table is the center of the distribution going both left and right of the mean. So to go 40% to the right you have to also take the 40% to the left and look at the z value for 80%, or 1.28, thereabouts. Now you have all the pieces to assemble the pie: 1.28 = (X - 64)/2.7, so X = 64 + 1.28(2.7), which comes to about 67.4 inches (67.5 if you round the SD to 2.7 first). Now, to really learn this, change the question a bit and do it again. Pick a different percentile to look for. Start with knowing the value of the 20th and 70th percentiles....That'll cement it in yer brain. Good luck studying, Dr. C. |
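The same chain of steps can be sketched with Python's `statistics.NormalDist`, which stands in for the book's normal table here. The two percentile values are the ones given in the reply; everything else is computed from them, assuming the distribution is normal.

```python
from statistics import NormalDist

p25, p75 = 62.2, 65.8      # inches, from the problem
std_normal = NormalDist()  # mean 0, SD 1: plays the role of the normal table

# Step 1: for a normal curve the mean is the median, halfway between p25 and p75.
mean = (p25 + p75) / 2

# Step 2: z for the 75th percentile (about 0.67), then solve z = (X - mean)/SD.
z75 = std_normal.inv_cdf(0.75)
sd = (p75 - mean) / z75

# Step 3: z for the 90th percentile (about 1.28), then convert back to inches.
z90 = std_normal.inv_cdf(0.90)
p90 = mean + z90 * sd

print(round(mean, 1), round(sd, 2), round(p90, 1))
```

To repeat the exercise with the 20th and 70th percentiles, note that the mean is no longer halfway between the two values, so two z equations have to be solved together.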
Hi Professor, Hope that you had a wonderful weekend. I have a question regarding chapter 4, #11. I don't quite understand what they are trying to aim at. Does the loss have to do with probability, which is something that we haven't gone over? There are also two SD's and 2 avg.'s that were given. Which one do we use and how? Thank you professor. |
Thanks. Hope you had a
good one yourself. If you look at question 10, the idea
is related to a single observation using the normal
curve. So you would expect about a third of the time to
be more than 8 points away from what the student actually
scored by choosing the mean. (Can you see that?...that
the mean is equivalent to a z of 0, and 8 points (or one
sd) is equivalent to a z of -1 or +1 on either side.
There's one score but you have a 1 in 3 chance of being
off by more than 1 SD in either direction, because you choose
once, it'll be either higher or lower with about 16% in
each tail of the z distribution) Well, question 11 builds
on that. Your loss for each student in a series of
students is a deviation (as in: the student score - your
guess of the mean, or X - mean of X). So if you use as a
best guess for every student in the series the mean, then
about 1 in 3 times your guess will be off by more than 1
SD (in this instance an SD is the same as a r.m.s of your
losses because the standard deviation is by
definition the r.m.s. of the deviations), so another way
of saying it is that the r.m.s. of your losses will be
around $8. Now that's not the same as saying, the total
amount you will lose, or on average at the end of doing
this you will have lost $8 cumulatively. Rather, on
average each time you guess the r.m.s. size of your loss
will be around $8. (I know I'm repeating myself here
trying to find the way that will make sense to you--the
r.m.s. or sd is a way of saying, gee, around how much
will I be wrong in my guess on average and that's
generally set at plus or minus a single s.d., because we
know that plus or minus a single s.d. captures about 68%
of the distribution, so people aren't much more precise
than that--like if I told you to buy cokes for a party
and I said there are 100 people coming and people drink
around 2 cokes each + or - 1, it'll give you a feel for
how many to buy...a 100 people 2/3's of whom drink 1-3
cokes each, so, I don't know, buy 200 but a lot of people
won't be happy, buy 300 and you'll come close to
satisfying everyone, those who drink none will balance
those who drink more, mostly, not perfectly). See ya in class
tomorrow. Dr. C. |
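A rough sketch of the two claims in that reply: that about 1 guess in 3 misses by more than one SD, and that the r.m.s. of the losses equals the SD when the guess is always the mean. The list of scores is made up for illustration and is not the data from the book's problem.

```python
import math
from statistics import NormalDist, fmean, pstdev

# Claim 1: under a normal curve, the share more than 1 SD from the mean
# is the two tails beyond z = -1 and z = +1 (about 16% each).
std_normal = NormalDist()
share_off_by_more_than_one_sd = 2 * (1 - std_normal.cdf(1))
print(round(share_off_by_more_than_one_sd, 3))

# Claim 2: guess the mean for every student; each loss is (score - guess).
scores = [52, 60, 68, 60, 70, 50]  # hypothetical scores
guess = fmean(scores)
losses = [s - guess for s in scores]
rms_loss = math.sqrt(fmean([loss * loss for loss in losses]))

# The r.m.s. of the losses matches the SD of the scores (dividing by n).
print(rms_loss, pstdev(scores))
```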
On problem 7 I don't understand how to find the new standard deviation. Please explain. I found the average 630 by doing the following calculations, are they correct? I also had a problem with number 11 in finding the standard deviation. I read the answer that said the 75th percentile is .67 SDs away but how did you get that? Thanks so much, |
(650(600) + 600(400)) / 1000 = 630, absolutely, for the mean. Now for the SD. First draw two normal distributions (bell curves) for men and for women, centering the men on 650 and the women on 600. Notice how, if you think about putting the two curves together, the result has a wider spread than each of the 2 curves individually? The spread is related to the SD, so if it's wider, the SD has gotta be bigger (it covers more distance so the number is larger). Now for number 11. What do you know? 1> the values associated with the 25th and 75th percentiles exactly. 2> the curve is normal. You could figure the mean, but how do you work with the 75th percentile? First draw a normal curve like in the book's normal table. Label the 25th and 75th percentiles. Black ink the space between the two. Now that matches the table in the book. By definition, the 75th percentile cuts off the distribution in a way where 25% are higher, or in that right tail's white space. Because the curve is symmetric, that means 25% of the distribution is also in the left tail white space. If 50% (25% + 25%) are in the tails, that means the blackened space contains an area of 50%. In the normal table, an area of 50% corresponds to a z of about .67, that is, .67 SDs from the mean (1st column in the third block of rows shown). yer welcome Dr. C. |
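The weighted mean, and the point that the combined SD comes out bigger than either group's SD, can be sketched as follows. The group sizes and means are the ones in the calculation above. The SD of 125 for each group is a made-up value, since the problem's actual SDs are not quoted here, and the combining formula is the standard identity for pooling groups, not necessarily the book's method.

```python
import math

# Each group is (count, mean, sd). The SDs are hypothetical.
groups = [
    (600, 650, 125),  # men
    (400, 600, 125),  # women
]

total = sum(n for n, _, _ in groups)

# Weighted mean: each group's mean counts in proportion to its size.
combined_mean = sum(n * m for n, m, _ in groups) / total

# Combined variance = average of (within-group variance
# + squared distance of the group mean from the combined mean).
combined_var = sum(n * (s ** 2 + (m - combined_mean) ** 2) for n, m, s in groups) / total
combined_sd = math.sqrt(combined_var)

print(combined_mean, round(combined_sd, 1))
```

The extra term for the distance between group means is what widens the combined curve.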
Hello Professor Cochran,
I have a question concerning the Special Review exercise.
Problem #3: I understand how the average and the blanks
were determined, however, for the last blank of the first
row, I do understand that using z=-1.4, SD=15, and
Avg.=52 would give the answer:31, but if that is the
case, I don't understand how 31 added to the rest of the
values and divided by 5 (to determine the Average) would
not give 52 as an average but 59.6. If 31 is part of the
data then shouldn't the average be the same in either way
we compute it? Thank you for your time. |
Ms. Student, Ah, but the 5 scores are not all that is used to calculate the mean, right? There's a group of scores, of which 31 is one element and the other four scores given are 4 of the elements. The leap that the book is asking you to make is to go from being given all information right in front of you to using a distribution where you only see some piece of it and are still trying to come to a decision without having to see all of it. Good luck with your studying. Dr. C. |
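For reference, the conversion the student describes, from a z-value back to a score, is a one-line rearrangement of z = (X - mean)/SD. The numbers are the ones quoted in the question; the variable names are illustrative.

```python
mean, sd, z = 52, 15, -1.4  # values quoted in the student's question

# z = (X - mean) / SD, so X = mean + z * SD
missing_score = mean + z * sd
print(round(missing_score, 1))
```

The mean of 52 belongs to the whole distribution, so averaging only the five listed scores is not expected to reproduce it.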
Prof. Cochran, I am sorry to be e-mailing you so late but I discovered another question during my studies for the midterm. It concerns the last problem on the practice problem page in the overhead book (the one w/ the histogram drawn). To find an ave in that problem all you have to do is find the percent for each class interval and then take the ave from the percent, is that correct? Thanks for your time |
Ms. Student, I don't know if I quite understand what you are trying to do. But, giving it a guess...the histogram shows a gross, inexact view of the distribution and I guess you could convert using the midpoint of each rectangle into some version of what the sum is in the numerator. You know the denominator and so you could give it a crude guess of sort of where the mean would be....but I hope you are not really talking about means here. With histograms you assume that the rectangle has a uniform density (though obviously it really doesn't) and so if it is ten units across the bottom and 3 high representing 30% and you want only half of that then you would draw the line at 5 (halfway out). Good luck tomorrow. Dr. C. |
Dear Professor Cochran, You talked about finding jobs as data analysts after we graduate. What should I do to prepare for work in that area? Also, how can I find out where job openings are? The work is pretty easy right now. The only question that I really had was why the root-mean square was important and what it meant in relation to the standard deviation, but you answered that in lecture. Thanks for giving interesting lectures; it makes understanding the material a lot easier. Sincerely, |
Mr. Student, You asked how do you get experience at data analysis. UCLA students have several options. You can do 199, SRP, or 194. In each of these you work one-on-one with faculty or graduate students learning research skills. Here, you would want to do it with someone who does a lot of data processing, so it might be someone in poli sci, econ, psychology, sociology, public health, public policy...areas like that. There is an SRP office in Murphy that lists faculty and their interests and the office staff can explain to you what is involved in doing SRP. You can also acquire some of the skills through coursework...like an upper division stat class or some classes in the social and life sciences. Another path is to work as a 'clerk' for faculty doing research (take a job)...those opportunities you would find through the placement center. Anyway, if you want to do it, the path is there. Dr. C. |
I'm not sure if you're the person to ask, but here goes---I'm getting tripped up on problem 9 of chapter 13. please help | Here is the strategy:
1> First ask am I doing something once or more than
once, 2> if doing it more than once is it with or
without replacement, 3> if more than once are they
independent, 4> the prior two answers have implication
for how you figure out the multiplication of the
probabilities, 5> what are the total possible outcomes
in the first time, 6> of these total possible
outcomes, what meets my criterion--that is what satisfies
the outcome i am desiring to predict. So, answer me this... what are the answers to the 6 strategies above? Remember that the P(1st time and 2nd time and 3rd time)=P(1st time)*P(2nd time given first time)*P(3rd time given first and second time) Dr. C. |
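The multiplication rule at the end of that reply can be sketched with exact fractions. The card-drawing setup below is a hypothetical example chosen for illustration; it is not the book's problem 9.

```python
from fractions import Fraction

# Hypothetical question: draw 3 cards from a 52-card deck.
# What is the chance that all 3 are aces?

# Without replacement the draws are dependent, so each factor is conditional
# on what came before: P(1st) * P(2nd given 1st) * P(3rd given 1st and 2nd).
without_replacement = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)

# With replacement the draws are independent, so the same probability repeats.
with_replacement = Fraction(4, 52) ** 3

print(without_replacement, with_replacement)
```

In each factor the denominator is the total number of possible outcomes at that draw (strategy 5) and the numerator is the number that meet the criterion (strategy 6).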
Prof Cochran, Just wanted to know how to calculate the SD using a scientific calculator...I have not been able to pull that one off. I am guessing (and hoping) that I won't really need a calculator for the first midterm...is that an accurate assessment? Thanks for your help, |
Dear Mr. Student, It's easy enough to find a calculator that does SD (it will say on it 's' or the Greek sign for sigma...these calculators also will store X and give you the mean of X, at a minimum). My accurate assessment? It's up to you--if you've gotten through the homework just fine without it, then what's the deal...but remember, I don't take the test (I probably would use the calculator to do it both ways, automatic and manually, if I were taking the test, but then that's just me), you do. The other thing to think about is here is the chance for you to learn a bit of technology (like how to use a calculator to automate some tasks for you) and that's what going to school is for--are you really asking a Prof to tell you it's ok not to learn something? You really expect me to say "Yes"? :) hey, do well on the exam, Dr. C. |
Hi. I have a question on standard deviation. I did it with my calculator, but the number is bigger than you had in your reader because the calculator doesn't round off. For example, 35/6 is 5.83 in your reader, but my calculator saves all the digits as 5.833333333, and I get the standard deviation right from that. I think this is more accurate. I can't do it separately with the calculator other than plugging into the long formula to get the rounded off number. Is it going to be ok on the exam to have an unrounded mean and get the standard deviation from it? Thank you for your time. | Dear Student, On the exam you'll round to two digits. I think it won't matter. Dr. C. |
Hi, Professor Cochran.
Maybe you think that this is a little late asking you
questions, but I really don't understand something.
First, on problem 8 of p. 53 of the book, why is it true
for part a and false for part b and c? i thought that
part a should be false and part b should be true because
the percentages should be similar if the income is spread
fairly evenly, right? Does spread evenly mean very
similar? Also, isn't a histogram a graph that has the area of the blocks representing percentages? Aren't the blocks representing percentages in the graph? My next question is that on p. 65 #1c and p. 75 6a, the graph on p. 75 looks exactly like the graph on p. 65. Why is the average 40 on p. 65 and 60 on p. 75? Also, do we only need to bring a calculator, a pen, and ID tomorrow? Do we need to bring a blue book or scantron? Please respond as soon as possible. Thank you. |
Whew...your email drips
anxiety...slow down..take a breath. now...I know the
midterm is tomorrow, but what percentages?--the labels
(of the area) or the height if the interval on the bottom
is the same size interval for all values (look see...is
the bottom of the histogram set up correctly??) check it
out...that's why it's not a histogram hmmm... 1c is 40, 6 iii is 40 ...see...i think you're panicking. so stop. look out the window. pause. then go back to studying no blue book, no scantron, just you, your id, a writing instrument, a hand calculator, and clear mind. Good Luck Dr.C. |
Professor Cochran, thank
you for responding to my message. I was in a panic, but
now, I'm fine. I asked you about the average on p. 65 and
p. 75 because the book does not have the same answer for
both problems. You have probably missed seeing my first
question because I have clumped so many questions
together. On problem 8 of p. 53 of the book, why is it
true for part a and false for part b? I thought that part
a should be false and part b should be true because the
percentages should be similar if the income is spread fairly evenly, right? Doesn't spread evenly mean very similar? |
Yes, and the key to getting the problem is in the details...look at the ranges in the horizontal axis...are they the same? |
Dr. Cochran, I'm sorry for asking you another question. And, I'm sorry this is so late. But, since we don't have the answers for the review problems in the overhead book, can I ask how I go about doing #5? The find the 75th percentile one? Thanks again, |
Well, Ms. Student, it's a
rare prof that's up at 10pm the night before students
take an exam who is on the computer. But here we are at
7:30 am. I assume you figured it and found the answers
were posted on the web. If not, find how that histogram
is scaled. Is it a density or is it a count (you can't tell
from the left axis). To do so find the total area. If
it's not 100 then those are counts. Take 75% of the
count, and then go out to the point on the axis where 75%
of the area is to the left. Good luck. Dr. C. |
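The procedure in that reply can be sketched as a small function. It assumes each block has uniform density, so the cut point inside a block is found by simple proportion. The blocks below are made-up counts, not the histogram from the overheads book, and the function name is illustrative.

```python
def percentile_from_histogram(blocks, pct):
    """Find the axis value with pct percent of the total area to its left.

    blocks is a list of (left_edge, right_edge, area) in left-to-right order;
    area may be a count or a percent, since only its share of the total matters.
    """
    total = sum(area for _, _, area in blocks)
    target = total * pct / 100
    running = 0
    for left, right, area in blocks:
        if area > 0 and running + area >= target:
            # Uniform density inside the block: go the same fraction across it.
            fraction = (target - running) / area
            return left + fraction * (right - left)
        running += area
    return blocks[-1][1]  # pct of 100: the right edge of the last block

# Hypothetical histogram of counts (total 200, so these are not percents).
blocks = [(0, 10, 40), (10, 20, 60), (20, 40, 100)]
p75 = percentile_from_histogram(blocks, 75)
print(p75)
```

Here 75% of 200 is 150; the first two blocks hold 100, so the cut falls halfway through the last block.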