Student Question:

> Prof. Cochran,

> I am trying to avoid having to buy the book and wanted to use the old

> 1991 edition. What are your thoughts on that? Are the problems in the

> 1991 version about the same as the 1998 version?

>

> A Student

>

Professor Response:

Dear ,

My thought is that you won't know what homework problems to do. The books

are very similar. But I haven't checked the two to see where they are the

same or different. The homework sections are different. You are

responsible for the 3rd edition material.

Dr. C.

 

Student Question:

> Professor Cochran,

> Today in lecture you proposed the question, "If you went to Murphy Hall to

> file a petition would it be better to do it at lunch time or at 9:00 am?" I

> was wondering if this was the proper way to phrase the question because it

> is very hard to define "better". For example, a particular student may

> think that getting an hour of extra sleep or taking their time waking up

> has more value than waiting in line an additional hour during the day. For

> this person, wouldn't it be better for them to go during lunch? Isn't this

> the reason why there is a difference in the length of the line during the

> day? People are not willing to get up early and get in line. This makes the

> line longer when they feel that waiting in line has more value than the

> next best thing they could do with their time.

> Thank you for your time,

Professor Response:

Yes, you have the point...we have to operationalize the word "better"...most people automatically think better means shorter lines, a point I intended to make in lecture but didn't quite get to. Here you have defined other ways of operationalizing "better" that are just as valid. This is always an important issue in designing instruments: we have to translate our language into precise concepts.

Dr. C.

 

Student Question:

> Page 39, SET C, # 1:

>

> Book answer of "15%" appears to be wrong, because:

> Areas shown in histogram are: (10 x 1) + (20 x 1) + (5 x 1) = 35

> total area we know is 100%

> area in class interval of $200-$500 = 100-35=65

> height = 65/3 = 21.67

> Am I right? please reply.

>

Professor Response:

No, unfortunately you are not right: 1*10 + 1*20 + 5*5 = 55, so the remaining area is 45, and 45 divided by 3 = 15. The bar of height 5 is 5 blocks wide...take another look at it.
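
The arithmetic in this reply can be checked with a short Python sketch. The bar widths and heights are taken from the reply itself (two bars one block wide with heights 10 and 20, one bar five blocks wide with height 5); treating the $200-$500 interval as three blocks wide is inferred from the division by 3, and the variable names are illustrative only.

```python
# Known bars as (width in blocks, height in percent per block), read off the reply.
known_bars = [(1, 10), (1, 20), (5, 5)]

# Assumption: the $200-$500 class interval spans 3 blocks, as the "/ 3" implies.
missing_width = 3

# The total area of a histogram is 100%, so the missing bar holds what is left.
known_area = sum(width * height for width, height in known_bars)
missing_area = 100 - known_area
missing_height = missing_area / missing_width

print(known_area, missing_area, missing_height)
```

The student's version multiplied the height-5 bar by a width of 1 instead of 5, which is where the 35 came from.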

 

Student Question:

> Professor Cochran,

> Hi! Is the r.m.s. value the same as the standard deviation

> value? If not, what does the r.m.s. value tell and what is it used for?

> Thanks.

Professor Response:

Ms. ,

r.m.s. is a mathematical technique (like calculating a mean, where you sum the elements, whatever they are, and divide by the number of elements). For r.m.s. you square the elements, sum the squares, divide by the number of elements, and take the square root. When the elements are the deviations from the mean, you are calculating a standard deviation. The root-mean-square technique is used to calculate many things; in this class we will use it for the s.d. and also for residuals in regression.

Dr. C.
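
As a minimal sketch of the recipe in this reply (square, sum, divide by the count, take the square root), the following Python applies r.m.s. first to raw values and then to deviations from the mean. The data list is made up for illustration, and dividing by the number of elements (rather than n - 1) follows the description in the reply.

```python
import math
import statistics


def rms(values):
    """Root mean square: square each element, average the squares, take the root."""
    return math.sqrt(sum(v * v for v in values) / len(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, not from the course
mean = sum(data) / len(data)

# r.m.s. of the raw values is NOT the standard deviation.
rms_of_values = rms(data)

# r.m.s. of the deviations from the mean IS the standard deviation.
deviations = [x - mean for x in data]
sd = rms(deviations)

print(rms_of_values, sd, statistics.pstdev(data))
```

Here `statistics.pstdev` is the standard library's divide-by-n standard deviation, used only as a cross-check.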

 

Student Question:

> Dear Dr. Cochran,

> I had a few last minute questions.

>

> 1. Is it imperative that I know how to do S.D. on my calculator? I have

> not been able to figure it out on a TI-82. Would you know how? Can I get

> along for the midterm without knowing how to do this?

> 2. What is the difference between a z-distribution and a normal curve? I

> thought they were the same but why are they listed as two different things

> on the list of things to study?

> 3. What is the difference between spread and distribution?

>

> I would appreciate it if you would answer my questions as soon as possible.

> Thank you.

>

> Sincerely,

Professor Response:

1. You need to be able to calculate a SD...people did this long before

calculators were made that did it automatically...

2. z is distributed as a normal curve, with a mean of 0 and an SD of 1...the normal curve is a probability density function...the distinctions between the two await you in a more advanced course...for now you can think of them as the same. The list of topics is conceptual: know what the normal distribution is, know what z is, and know how we use it.

3. distribution is the list of elements in our set (whether that is a

population or a sample)...spread is the variability of these elements from each other

 

Student Question:

> Prof. Cochran,

> I'm having trouble understanding the SD and calculating it. I don't

> understand how, where, and when it is used. I've been trying to do the

> sample problems in the book but when I check my answers they are wrong.

> What is the difference between a mean and median?

>

> Thank You,

Professor Response:

Ms. Student,

Sounds like you are lost and in need of more than an email. Try to get in

to an office hour on Monday, or go to any discussion group.

Backtrack through the questions you have tried. Take an example in the

book and go backwards from the answer to see the calculation. As for knowing when an SD is used: it's a statement about the average spread away from the mean in a distribution. So, if you have an SD of 2, then you can say that about 68% of the values are within plus or minus 2 of the mean. If the mean is 8, then about 2/3 (68%, more or less) of the values will lie between 6 and 10. How can I say 68%? I got that from the normal distribution (the bell-shaped curve).

A mean is the sum of all the values divided by the number of values that contributed to the sum.

A median is the value below which 50% of the values in the sample lie.

But, come to office hours. Work with either a TA or myself to explain how

you see it so that we can figure out where you are taking the wrong turn.

Ok?

Dr. C.
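
To make the definitions in this reply concrete, here is a small Python sketch. The five values are made up so that the mean and median differ, and the 68% figure is checked against a normal curve with the reply's mean of 8 and SD of 2 using the standard library's `statistics.NormalDist`.

```python
import statistics
from statistics import NormalDist

values = [3, 5, 7, 9, 16]  # hypothetical sample, chosen so mean and median differ

# Mean: sum of the values divided by how many there are.
mean_value = statistics.mean(values)

# Median: the middle value, with half the sample below it.
median_value = statistics.median(values)

# Share of a normal distribution (mean 8, SD 2) lying between 6 and 10,
# that is, within one SD of the mean.
bell = NormalDist(mu=8, sigma=2)
share_within_one_sd = bell.cdf(10) - bell.cdf(6)

print(mean_value, median_value, round(share_within_one_sd, 3))
```

The one large value (16) pulls the mean above the median, which is the usual way the two come apart.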

 

Student Question:

> Dear Professor Cochran,

>

> I'm in your stats 50 class and I'm having trouble getting onto the web page.

> I've gotten on it before, but it doesn't seem to be connecting. Can you

> please check it to make sure it's working? Thanks so much.

>

Professor Response:

It's still there, but the server (the computer the web page is stored on) may have been down (not working). Try again.

Good luck.

Dr. C.

 

Student Question:

Dear Prof Cochran,

When reading through your lecture outlines on the net, I did not

understand the following two points you made in lecture #3:

2. When a variable has an underlying continuous distribution, we can

step down in the hierarchy and treat the values we measure as discrete,

but we can't go the other way

3. The scaling of any variable can always be rescaled to a lower level

in the hierarchy of scaling. That is, ratio scaled variables can be

measured intervally, ordinally, or nominally. But we cannot go the other

direction.

Please explain

Sincerely,

Professor Response:

Mr. ,

Think of this variable: the number of calories you ate for dinner. It is continuous, has a true 0, and so is ratio scaled. We can reasonably

calculate the ratio between two values: 100 calories is twice (100/50) as many calories as 50.

But maybe we wanted to measure it less precisely, using an interval scale where we assign values to a set of ranges:

1          2          3
101-300    301-500    501-700

You ate 420 calories, so we code you as a "2".

The top row (1 2 3) is an interval scale. The intervals are the same size, but we can no longer form a ratio that makes sense.

Or maybe we wanted to be more crude in our measurement:

1          2          3
a little   some       a lot

We can see order here but don't know how wide the intervals are. Now, I went from ratio down the hierarchy. You give it a try going the other way: let's take how many miles you traveled last year and start with ordinal scaling:

a little   some   a lot

Let's say you answer "some". Where would you put that on a ratio scale (number of miles)? See, if I knew your miles I could recode them less precisely, but I just can't go the other way because the information is not there.

See ya in class.

Dr. C.
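
The one-way nature of the recoding can be sketched in a few lines of Python. The ranges are the ones in the reply (101-300, 301-500, 501-700); the function name and the extra calorie values are illustrative assumptions.

```python
# Ranges and codes from the reply: (low, high, code).
CALORIE_BINS = [(101, 300, 1), (301, 500, 2), (501, 700, 3)]


def code_calories(calories):
    """Recode a ratio-scaled calorie count into the coarser 1/2/3 code."""
    for low, high, code in CALORIE_BINS:
        if low <= calories <= high:
            return code
    return None  # outside the ranges listed in the reply


# Going down the hierarchy is easy: every calorie count has exactly one code.
print(code_calories(420))

# Going back up is impossible: many different calorie counts share a code,
# so the code alone cannot tell us which one was eaten.
same_code = [c for c in (305, 420, 499) if code_calories(c) == 2]
print(same_code)
```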

 

Student Question:

> Hello Professor,

>

> This is a student from your stats 50 course. I was the one who said that

> you couldn't find your intelligence from the SAT because it was impossible to find

> the SAT Standard Deviation. I still don't understand-how can you find the

> SD? In order to do so, you would need to know everyone's scores. Will

> the College Board release anyone's score upon request? I don't believe

> so...So wouldn't it be impossible to find the SD unless you knew others'

> scores?

>

Professor Response:

Mr. ,

Well...generally you don't 'find' an SD in a known distribution...it's given. For example, we know that IQ has a mean of 100 and an SD of 16. SATs originally (it's not true now) had a mean of 500 and an SD of 100. Scores can range from 200 to 800 (that's 3 SDs on either side of the mean, which as you can see in your table includes 99.7% of all possible scores, theoretically). Let's take those numbers as real and say you got a 600 on your SAT. That's roughly one SD up. So if we translate to IQ, your IQ would be 100+16, or 116. Except that, for reasons we discussed in class, this is an underestimate (it's biased down) because smarter people tend to take the SAT.

You're right. If it's not given then you have to calculate it, and to calculate it you need to know all the scores. But you _can_ often figure it out. The blurb they sent you probably told you what the mean score was (or you can find it in the newspaper sometimes). You have your percentile. z's involve 4 quantities (your score, your percentile, which you can translate into a z-value, the mean, and the SD). Since you have 3, that leaves one unknown (the SD), and you can solve for it.

Dr. C.
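
The "three knowns, one unknown" idea can be sketched in Python. The score of 600 and the mean of 500 come from the reply; the percentile of 84.13 is a hypothetical value chosen because it sits about one SD above the mean, and the calculation assumes scores are normally distributed. `NormalDist().inv_cdf` plays the role of the z table.

```python
from statistics import NormalDist

score = 600          # your score (from the reply's example)
mean_score = 500     # reported mean (from the reply's example)
percentile = 0.8413  # hypothetical percentile, expressed as a proportion

# Step 1: translate the percentile into a z-value (what the table does).
z = NormalDist().inv_cdf(percentile)

# Step 2: z = (score - mean) / SD, so SD = (score - mean) / z.
sd = (score - mean_score) / z

# The same z can be carried over to the IQ scale (mean 100, SD 16).
iq_equivalent = 100 + z * 16

print(round(z, 2), round(sd, 1), round(iq_equivalent, 1))
```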

 

Student Question:

> dear dr. cochran,

>

> i was wondering if there are answer keys to the unassigned review problems

> because i wanted to do extra problems as practice for the midterm.

>

> thank you,

>

Professor Response:

Ms. ,

Unfortunately there are not. But if you have trouble doing one and think your answer might be wrong, just ask your TA or me during office hours to take a look at your work. Another strategy is to ask your TA to go through the problem with the class in discussion. You'll see also that my sample midterm questions have no correct answers given either...there's actually a method to the madness. Sometimes knowing the answer prevents the struggle to figure out the answer. But the TAs and I are here to help.

Dr. C.

Student Question:

> Dr. Cochran,

>

> I was wondering if we had to "memorize" in depth, the long definitions

> of qualitative variables (nominal, categorical, etc.) and quantitative

> variables or is basic knowledge of the difference between the two

> enough?

 

Professor Response:

Mr. ,

You are responsible for everything that was presented in class and in the book. You don't need to memorize the formulas because those will be given to you on a cheat sheet similar to what is in the overheads book.

Dr. C.

 

Student Question:

> Dear Professor Cochran:

> I am having a great deal of difficulty with question #11 in the SR

> exercises in Chapter 6. I can not figure out how you determined that

> the 75th percentile is about .67 SDs above average and that the 90th

> percentile is 1.28 SDs above average.

> Your help would be greatly appreciated.

> Thank you.

>

>

>

Professor Response:

Ms. ,

Here's the trick. Think about what you know. 25th percentile is 62.2 and

75th percentile is 65.8. Think about what you want to know--what are the

inches associated with the 90th percentile. Think about what you need to

say what you want to know--> well, to match inches to a percentile it

would help to know the mean and sd of the distribution because then you

can use the normal table to find the z value of the percentile you want.

Now you're ready.

What is the mean? well, if you assume that the distribution is normal

then the mean=median and the median is the 50th percentile which is

halfway between the 25th and 75th. So the mean would be..

(65.8 - 62.2)/2 = 1.8 plus the value for the 25th percentile or

62.2 + 1.8 = 64 inches.

What is the SD? this is a little harder to figure, but you now know the

mean and two other values where you know the percentiles. Pick one and

start:

z = (X - mean)/SD. The 25th and 75th percentiles cut off the central 50% of the distribution, and the table gives a z of about 0.67 for that area. So for the 25th percentile:

-.67 = (62.2 - 64.0)/SD

(REMEMBER the z for percentiles under 50 is negative!) Solve for the SD (1.8/.67, or about 2.7) and you now have both the mean and SD for the distribution.

Now, what is the 90th percentile? Take the bell curve. Draw a picture of

it. To the left of the mean or median or midpoint is 50%, right? to get

to the 90th percentile, you have to go an additional 40% of the

distribution to the right. The table in the book can give this to you,

but you have to remember that area described in the table is the center of

the distribution going both left and right of the mean. So to go 40% to

the right you have to also take the 40% to the left and look at the z

value for 80%, or 1.28, there about.

Now you have all the pieces to assemble the pie:

1.28 = (X - 64)/2.7, so X = 64 + 1.28*2.7, or roughly 67.4 to 67.5 inches depending on how you round the SD.

 

Now, to really learn this change the question a bit and do it again. Pick

a different percentile to look for. Start with knowing the value of the

20th and 70th percentiles....That'll cement it in yer brain.

Good luck studying,

Dr. C.
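
The whole chain of reasoning in this reply can be traced in a short Python sketch. The 25th and 75th percentile heights (62.2 and 65.8 inches) are from the reply; the code assumes, as the reply does, that the distribution is normal, and it uses `NormalDist().inv_cdf` in place of the printed table, so the z-values are slightly more precise than 0.67 and 1.28.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1: the z distribution

p25, p75 = 62.2, 65.8  # inches, from the problem as quoted in the reply

# Mean = median = halfway between the 25th and 75th percentiles.
mean = (p25 + p75) / 2

# z for the 75th percentile is about +0.67 (and about -0.67 for the 25th).
z75 = std_normal.inv_cdf(0.75)

# z = (X - mean) / SD, so SD = (X - mean) / z.
sd = (p75 - mean) / z75

# z for the 90th percentile is about 1.28; convert back to inches.
z90 = std_normal.inv_cdf(0.90)
p90 = mean + z90 * sd

print(round(mean, 1), round(sd, 2), round(p90, 1))
```

To try the variation suggested at the end of the reply, the same steps work with any two known percentiles, although the mean is then no longer the simple midpoint and has to be solved for together with the SD.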

 

Student Question:

> Prof. Cochran

> I am sorry to be e-mailing you so late but I discovered another question

> during my studies for the midterm. It concerns the last problem on the

> practice problem page in the overhead book ( the one w/ the histogram

> drawn). To find an ave in that problem all you have to do is find the

> percent for each class interval and then take the ave from the percent, is

> that correct?

>

> Thanks for your time

>

Professor Response:

Ms. ,

I don't know if I quite understand what you are trying to do. But, giving it a guess...the histogram shows a gross, inexact view of the distribution. I guess you could use the midpoint of each rectangle, weighted by the percentage in that rectangle, as a stand-in for the sum in the numerator. You know the denominator, so you could make a crude guess at roughly where the mean would be...but I hope you are not really talking about means here. With histograms you assume that each rectangle has a uniform density (though obviously it really doesn't), so if a rectangle is ten units across the bottom and 3 high, representing 30%, and you want only half of that, then you would draw the line at 5 (halfway out).

Good luck tomorrow.

Dr. C.
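
Both ideas in this reply can be sketched in Python. The histogram below is entirely made up (the practice problem's actual figure is not reproduced here); only the last calculation, a block 10 units wide and 3 high, uses numbers from the reply.

```python
# Hypothetical histogram: (left edge, right edge, percent of cases in the interval).
histogram = [(0, 10, 30), (10, 20, 50), (20, 40, 20)]

# Crude mean: treat every case in an interval as sitting at its midpoint,
# then weight each midpoint by the interval's percentage.
approx_mean = sum((left + right) / 2 * pct for left, right, pct in histogram) / 100


def width_for_area(height, wanted_area):
    """Under the uniform-density assumption, area = width * height."""
    return wanted_area / height


# The reply's example: a block 10 units wide and 3 high holds 30%.
# Half of that (15%) is reached 5 units out from the left edge.
cut_point = width_for_area(height=3, wanted_area=15)

print(approx_mean, cut_point)
```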

Student Question:

Dear Professor Cochran,

You talked about finding jobs as data analysts after we

graduate. What should I do to prepare for work in that area? Also, how

can I find out where job openings are? The work is pretty easy right now. The only question that I really had was why the root-mean square was important and what it meant in relation to the standard deviation, but you answered that in lecture. Thanks for giving interesting lectures; it makes understanding the material a lot easier.

Sincerely,

Professor Response:

Mr. ,

You asked how do you get experience at data analysis. UCLA students

have several options. You can do 199, SRP, or 194. In each of these you work one-on-one with faculty or graduate students learning research

skills. Here, you would want to do it with someone who does a lot of data processing, so it might be someone in poli sci, econ, psychology,

sociology, public health, public policy...areas like that. There is an

SRP office in Murphy that lists faculty and their interests and the office staff can explain to you what is involved in doing SRP. You can also acquire some of the skills through coursework...like an upper division stat class or some classes in the social and life sciences. Another path is to work as a 'clerk' for faculty doing research (take a job)...those opportunities you would find through the placement center.

Anyway, if you want to do it, the path is there.

Dr. C.

 

Student Question:

> I'm not sure if you're the person to ask, but here goes---I'm getting

> tripped up on problem 9 of chapter 13. Please help.

Professor Response:

Here is the strategy:
1> First ask: am I doing something once or more than once?
2> If more than once, is it with or without replacement?
3> If more than once, are the events independent?
4> The prior two answers have implications for how you multiply the probabilities.
5> What are the total possible outcomes the first time?
6> Of these total possible outcomes, which meet my criterion? That is, which satisfy the outcome I want to predict?

So, answer me this...

What are your answers to the 6 steps above?

Remember that P(1st time and 2nd time and 3rd time) = P(1st time) * P(2nd time given 1st time) * P(3rd time given 1st and 2nd time).

Dr. C.
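
The multiplication rule in this reply can be illustrated with a short Python sketch. The scenario (drawing three aces in a row from a standard 52-card deck) is a hypothetical stand-in, not the textbook's problem 9; it is chosen only because it shows how the "with or without replacement" question changes the factors.

```python
from fractions import Fraction

ACES, DECK = 4, 52  # hypothetical example: aces in a standard deck

# Without replacement: each draw changes the deck, so the draws are dependent
# and each factor is conditional on the draws before it.
without_replacement = (
    Fraction(ACES, DECK)             # P(1st is an ace)
    * Fraction(ACES - 1, DECK - 1)   # P(2nd is an ace given the 1st was)
    * Fraction(ACES - 2, DECK - 2)   # P(3rd is an ace given the first two were)
)

# With replacement: the deck is restored each time, so the draws are independent
# and every factor is the same.
with_replacement = Fraction(ACES, DECK) ** 3

print(without_replacement, with_replacement)
```

Using `Fraction` keeps the answers exact, which makes it easier to compare against a hand calculation.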