LECT7

CHANCE/PROBABILITY

Chance or the occurrence of random phenomena

We are now going to deal more formally with probability theory. It is not as unfamiliar to you as you might think. Every day you use probability theory to make decisions. For example, let us suppose that for various reasons you are very sleepy in class today and want desperately to catch some shuteye. You seriously consider sleeping. One factor in your decision making is whether or not I might notice and embarrass you in front of the class by asking the person next to you to give you a soft poke. So, in terms of possible outcomes from sleeping in class, there are two:

event A: I notice you sleeping

event B: I don't notice you sleeping

There is uncertainty, because either of the possible outcomes could occur, though only one of them will. If you think A unlikely, and therefore B very likely, you might go to sleep. Another way of putting it is that you attach some probability of occurrence to a possible event and use that to make decisions. Statisticians do the same thing.

The probability of any event occurring can only range from 0 (certain not to occur) to 1 (certain to occur), or more formally,

0 ≤ P(A) ≤ 1

In order to assign a probability of occurrence to a particular event we first have to define all the possible outcomes.

For most events it is not possible to list every outcome, but for some, like tossing a coin, rolling a die, or pulling a card out of a deck, it is very obvious what outcomes exist. When it isn't, such as going out on Saturday night, we can divide the possible outcomes into mutually exclusive and exhaustive categories, generally by defining the event of interest (going out) and then defining all other possibilities as the negative of that event (not going out).

For a coin toss it is either heads or tails. For a die it is 1 of 6 sides coming up. For going out on Saturday night we would arbitrarily divide the possible outcomes into going out or not going out.

Next we have to assign a probability to each of the possible outcomes. To figure a probability of a single event occurring you need to

Identify all of the outcomes that match your criterion (Example: roll a number less than 4 on a die = 3)

Identify all possible outcomes (Example: roll of a die = 6 possible outcomes)

Calculate the ratio: outcomes of interest/all possible outcomes (For the die example: 3/6 = 1/2 = .5)
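The three steps above can be sketched in a few lines of Python. This is only an illustration, assuming every outcome is equally likely; the helper name probability is made up for this example and is not from the lecture.

```python
from fractions import Fraction

def probability(outcomes, is_event):
    """Ratio of matching outcomes to all (equally likely) outcomes."""
    outcomes = list(outcomes)                        # step 2: all possible outcomes
    matching = [o for o in outcomes if is_event(o)]  # step 1: outcomes of interest
    return Fraction(len(matching), len(outcomes))    # step 3: the ratio

die = range(1, 7)  # the six faces of a fair die
p_less_than_4 = probability(die, lambda face: face < 4)
print(p_less_than_4, float(p_less_than_4))  # 1/2 0.5
```

Using Fraction keeps the answer as an exact ratio (1/2) instead of a rounded decimal.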

Or more formally:

P(A) = (number of outcomes that are event A) / (total number of outcomes)

As an aside for those of you who have wondered what odds are: you've probably heard the term in relation to betting.

The odds that you are given in gambling are really the odds against an event occurring, and the formula is:

Odds against event A = (total number of outcomes - number of outcomes that are event A) to (number of outcomes that are event A)

So when someone tells you that they will give you 7 to 3 odds on the San Francisco 49ers, what they are offering you is a bet in which the probability that the 49ers will win is .3, that is, 30% of the possible outcomes are the 49ers winning. If you think a win is more likely than that, that's a good bet; if you think that probability is too high, ask for better odds. By the same formula, 3 to 2 odds mean that the probability of your bet winning is estimated at 2/5 = 40%, and the probability of it losing at 1 - 2/5 = 60%.
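As a rough sketch, the odds-against formula can be turned into a pair of small conversion functions. The function names here are hypothetical, chosen for this illustration only.

```python
from fractions import Fraction

def probability_from_odds_against(against, in_favor):
    """'against to in_favor' odds -> probability that the event occurs."""
    return Fraction(in_favor, against + in_favor)

def odds_against(p):
    """Probability -> (against, in_favor) odds in lowest terms."""
    p = Fraction(p)
    return (p.denominator - p.numerator, p.numerator)

p_49ers = probability_from_odds_against(7, 3)
print(p_49ers)                        # 3/10
print(odds_against(Fraction(3, 10)))  # (7, 3)
```

The second function simply runs the formula backwards: out of 10 outcomes, 3 are wins, so 10 - 3 = 7 are not.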

Obviously, if we list all possible mutually exclusive and exhaustive outcomes, only one of these events above can be true, and at least one is true. So we can say:

P(A) + P(B) = 1

So the P(your sleeping being noticed) + P(your sleeping not being noticed) = 1

Adding probabilities

Sometimes we want to add together the probabilities of different events, such as what is the probability that I will draw a king or a black card from a deck of cards in one draw?

There are 4 kings and 26 black cards. You might think we could just add these together, but if we did so we would be counting black kings twice. So what we need to do is to subtract the duplicates.

Symbolically,

P(A or B) = P(A) + P(B) - P(A and B)

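or for this example, with A = king and B = black card (there are 2 black kings),

P(king or black) = 4/52 + 26/52 - 2/52 = 28/52 = 7/13, or about .54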
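One way to check the addition rule for the king-or-black question is to build the deck and count. This is a sketch under the usual assumption of a standard 52-card deck with every card equally likely to be drawn; the variable and function names are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "spades", "hearts", "diamonds"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def is_king(card):
    return card[0] == "K"

def is_black(card):
    return card[1] in ("clubs", "spades")

n = len(deck)
p_king = Fraction(sum(is_king(c) for c in deck), n)                   # 4/52
p_black = Fraction(sum(is_black(c) for c in deck), n)                 # 26/52
p_both = Fraction(sum(is_king(c) and is_black(c) for c in deck), n)   # 2/52

# Addition rule: subtract the black kings so they are not counted twice.
p_either = p_king + p_black - p_both

# Direct count of cards that are a king, black, or both.
p_direct = Fraction(sum(is_king(c) or is_black(c) for c in deck), n)

print(p_either, p_direct)  # 7/13 7/13
```

Both routes agree, which is the point of subtracting P(A and B).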

Multiplying probabilities

Sometimes we want to ascertain the probability of the joint occurrence of two or more events. For example, we may want to know, what is the probability of selecting a class to enroll in next quarter that is both interesting and easy. Instead of needing to know only whether a class is either interesting or easy, we desire to know both of these outcomes simultaneously.

We do this by multiplying probabilities. But the way we do this varies with the relationship between the two events. Two events can be either independent or dependent. For example, if you think interesting classes are no more likely to be easy than boring classes, then you are really saying that for each class the probability of being interesting is independent of the probability of being easy. If instead you believe that interesting classes tend also to be easier classes, then what you are saying is that, for each class, knowing that it is interesting changes our expectations about the probability of it being easy. The two qualities are dependent.

When multiplying probabilities, we say that the P(B) is conditional on what happened first or P(B|A) (read: probability of B given A). So the P(A and B) = P(A)P(B|A).

One of the ways in which the issue of independence and dependence arises is in the area of sampling.

Repeated sampling with replacement--Independence

When we sample with replacement, each time we sample the probabilities for an event occurring remain the same, because we are sampling from an identical pool of possible outcomes each time.

This is another way of saying that each sampling is independent of the other.

So, the probability of A occurring in trial 2 = P(A in trial 2|outcome in trial 1) = P(A)

To combine probabilities where events are independent, multiply the individual probabilities.

Example: What is the probability of tossing 5 heads in a row?

P(5 heads) = 1/2 * 1/2 * 1/2 * 1/2 * 1/2 = 1/32, or about .03
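The five-heads example can be checked two ways: by multiplying the independent probabilities, and by listing every possible sequence of five tosses. A minimal sketch, assuming a fair coin:

```python
from fractions import Fraction
from itertools import product

# Multiplication rule for independent events: P(heads) five times over.
p_heads = Fraction(1, 2)
p_five_heads = p_heads ** 5

# Cross-check by listing all 2**5 = 32 equally likely sequences.
sequences = list(product("HT", repeat=5))
all_heads = [s for s in sequences if all(toss == "H" for toss in s)]
p_by_counting = Fraction(len(all_heads), len(sequences))

print(p_five_heads, p_by_counting)  # 1/32 1/32
```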

Repeated sampling without replacement

When we sample without replacement, we change the domain from which we are sampling. Each sampling is dependent upon what occurred before.

Here, the probability of an event is conditional on prior events.

To combine probabilities where events are dependent, we have to multiply the changing probabilities.

Example: What is the probability of pulling 4 aces in a row from a deck of cards? On the first draw there are four aces among 52 cards. For the second draw we assume that the first card was an ace (otherwise we have zero chance of drawing 4 aces, right?), so three aces remain among 51 cards, and so on.

P(4 aces) = 4/52 * 3/51 * 2/50 * 1/49 = 1/270,725
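The four-aces calculation can be sketched as a short loop in which both the number of aces and the number of cards shrink by one after each draw. This assumes a standard 52-card deck with 4 aces.

```python
from fractions import Fraction

aces, cards = 4, 52
p_four_aces = Fraction(1)

for draw in range(4):
    # Each earlier draw removed one ace and one card from the deck.
    p_four_aces *= Fraction(aces - draw, cards - draw)

print(p_four_aces)  # 1/270725
```

Compare this with sampling with replacement, where the same fraction 4/52 would be multiplied four times.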

We can use these addition and multiplication rules of probability to figure out all sorts of complicated events.

With replacement: So, if I have two black marbles and one red marble in a bag and draw twice, each time replacing the marble I drew, the probability that I would draw the red marble only once in two draws is:

There are two outcomes that match this:

First outcome: I draw a red marble and then I draw a black marble

The P(A and B) = 1/3*2/3 = 2/9

Second outcome: I draw a black marble first and then I draw a red marble

The P(A and B) = 2/3*1/3 = 2/9

The probability that either one of these outcomes occurs is

P(A or B) = 2/9 + 2/9 = 4/9

But, the probability I would draw it twice, that is both times, is only: 1/3*1/3 =1/9
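The with-replacement marble results can be checked by listing every ordered pair of draws. This is a sketch that treats the three marbles as equally likely on each draw; the helper prob is a name made up for this example.

```python
from fractions import Fraction
from itertools import product

bag = ["black", "black", "red"]

# With replacement: the second draw sees the same three marbles,
# so there are 3 * 3 = 9 equally likely ordered pairs.
draws = list(product(bag, repeat=2))

def prob(condition):
    """Share of the equally likely draw pairs that satisfy the condition."""
    return Fraction(sum(condition(d) for d in draws), len(draws))

p_red_once = prob(lambda d: d.count("red") == 1)
p_red_twice = prob(lambda d: d.count("red") == 2)

print(p_red_once, p_red_twice)  # 4/9 1/9
```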

Without replacement: If I have two black marbles and one red marble in a bag and draw twice without looking and without replacing the marble, the probability that I will draw the one red marble in those two draws is:

First possibility that meets criteria:

Event A: I draw a red marble first

Event B: I draw a black marble second

P(A and B)= P(A)P(B|A) = 1/3 * 2/2 = 2/6

Second possibility that meets criteria:

Event C: I draw a black marble first

Event D: I draw a red marble second

P(C and D) = 2/3*1/2 = 2/6

Since either possibility will work, I now have to figure out the probability that either will occur:

P(first possibility or second possibility) = P(1st) + P(2nd) = 2/6 + 2/6 = 4/6 = 2/3
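The without-replacement case can be checked the same way by listing ordered pairs of distinct marbles. A minimal sketch; itertools.permutations treats the two black marbles as separate objects, which is what makes the pairs equally likely.

```python
from fractions import Fraction
from itertools import permutations

bag = ["black", "black", "red"]

# Without replacement: the second marble must differ from the first,
# leaving 3 * 2 = 6 equally likely ordered pairs.
draws = list(permutations(bag, 2))

red_first = [d for d in draws if d[0] == "red"]
red_second = [d for d in draws if d[1] == "red"]

p_red_first = Fraction(len(red_first), len(draws))    # 2/6
p_red_second = Fraction(len(red_second), len(draws))  # 2/6

# The two possibilities are mutually exclusive, so their probabilities add.
p_red_drawn = p_red_first + p_red_second

print(p_red_drawn)  # 2/3
```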

REVIEW FOR MIDTERM 1 --Midterm 1 covers lectures 1-7 only

Topics we have covered in this section

The difference between populations and samples

Experiments vs. observational studies

Variables (and values of variables) vs. cases or subjects

Types of variables

Making histograms (you should be able to make and interpret histograms)

Distributions

Mean, median, mode

Spread, standard deviations

The normal distribution

z scores and using the Table in the book

Chance error vs. bias

Probabilities of events and combining probabilities

Independence vs. dependence

Sampling with or without replacement

Calculation skills needed

You should be able to calculate a mean

To calculate a standard deviation

To calculate a z-score

To use means and standard deviations, the normal table, and z-scores to answer questions

To calculate probabilities

I will put the formula sheet and the normal table from the book on the exam