1. Recall: Random Variables
Definition -- a random variable, usually denoted X or Y, is
the numerical outcome of a random process or experiment.
Examples: tosses of a coin, rolls of a die, the
market close, a lottery, the amount of fluid in a bottle of beer, the number of
students present in a class.
2. Finishing Lecture 9:
Continuous Random Variables
A continuous random variable can assume an
infinite number of values in an interval. So for example, our beer
bottles can contain any amount of beer in an interval between 0 and 300
ml.
The most commonly used model for a continuous random
variable is the NORMAL distribution (Chapter 1.3). The probability
distribution is described by a curve and the probability of any event is
described by the area under the curve. We are always interested in the
probability for an interval rather than the probability of an exact
value. This is simply because the area under the curve at some exact
point will be zero.
Notation: the Greek letter mu, μ, is the
symbol for the mean of the normal distribution; the Greek letter sigma, σ, is the standard
deviation. In the Standard Normal that we use in Table A, the mean is μ = 0 and
the standard deviation is σ = 1.
Example:
Suppose an automobile manufacturer claims their new SUV has mean in-city
mileage of 16 miles per gallon. Suppose you write to the manufacturer and
you find out that the standard deviation around that mean is 2 miles per
gallon. This information allows you to formulate a probability
model. So you think that the random variable "in city gas mileage"
can be approximated by a normal distribution with a mean of 16 and a
standard deviation of 2.
ASK:
What does the distribution of the in-city gas mileage look like for the
population of these vehicles? What
percentage do we expect to be between 14 and 18 miles per gallon? What
percentage do we expect to be between 12 and 20 miles per gallon?
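The two interval questions can be checked numerically. What follows is a minimal sketch, assuming the normal model with mean 16 and standard deviation 2 from the example. It uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Probability model from the example: in-city mileage ~ Normal(mean 16, sd 2)
mileage = NormalDist(mu=16, sigma=2)

# P(a < X < b) is the area under the curve between a and b: cdf(b) - cdf(a)
p_14_to_18 = mileage.cdf(18) - mileage.cdf(14)  # within 1 standard deviation
p_12_to_20 = mileage.cdf(20) - mileage.cdf(12)  # within 2 standard deviations

print(f"P(14 < X < 18) = {p_14_to_18:.4f}")
print(f"P(12 < X < 20) = {p_12_to_20:.4f}")
```

The two areas come out to roughly 68% and 95%, which is the familiar 68-95-99.7 rule for normal distributions.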
Example:
Suppose you work for a magazine that tests new autos and trucks. If you
were to test this SUV, what is the probability that the one you purchased averages
less than 13 miles to the gallon? What is the probability that you would
purchase one that gets more than 20 miles per gallon? Suppose you were to purchase
one that gets better than 20 miles per gallon, is your probability model
necessarily wrong?
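The tail probabilities in this example can be sketched the same way. This is a hedged illustration, again assuming the Normal(16, 2) model and using the standard-library `statistics.NormalDist`; the variable names are made up for the sketch.

```python
from statistics import NormalDist

# Same assumed model: in-city mileage ~ Normal(mean 16, sd 2)
mileage = NormalDist(mu=16, sigma=2)

# Left tail: P(X < 13), which corresponds to z = (13 - 16) / 2 = -1.5
p_below_13 = mileage.cdf(13)

# Right tail: P(X > 20), which corresponds to z = (20 - 16) / 2 = 2.0
p_above_20 = 1 - mileage.cdf(20)

print(f"P(X < 13) = {p_below_13:.4f}")
print(f"P(X > 20) = {p_above_20:.4f}")
```

Both tails are small (roughly 7% and 2%). An outcome with a probability near 2% is uncommon but still possible under the model, so one such vehicle does not by itself show that the model is wrong.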
Example: Suppose you are thinking about investing
some money in a mutual fund. Past data
show that the fund returned a mean of 19.8% with a standard deviation of
13.40%. Suppose we know the return is normally
distributed. Based on this information, what is the probability that you
will experience a loss (get a return of less than zero)? Year-to-date, the fund has returned
-9.07% (a loss). What is the probability
of getting a return that low or lower?
Is the model necessarily wrong?
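A rough numerical check of the mutual fund questions, assuming returns follow a normal distribution with mean 19.8 and standard deviation 13.4 (both in percent) as stated in the example. The names below are illustrative, not part of the lecture.

```python
from statistics import NormalDist

# Assumed model from the example: return (in %) ~ Normal(mean 19.8, sd 13.4)
fund = NormalDist(mu=19.8, sigma=13.4)

# Probability of a loss: P(return < 0)
p_loss = fund.cdf(0)

# Probability of a return as low as the year-to-date figure, or lower
p_at_most_ytd = fund.cdf(-9.07)

print(f"P(return < 0)      = {p_loss:.3f}")
print(f"P(return <= -9.07) = {p_at_most_ytd:.3f}")
```

Under these assumptions a loss has a probability of roughly 7%, and a return of -9.07% or worse has a probability of roughly 1.6%: rare, but not impossible.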
3. Random Variables have means too (4.4)
The mean of a list (from Chapter 1.2) is x-bar. It's an ordinary
average that gives every value in the list equal weight. The mean of a random variable is also an
average, but a slightly different one: it weights each outcome by its probability, and
the probabilities do not need to be equal.
The examples above give a mean (and standard deviation) for a
random variable.
Symbols: the mean of a probability distribution is the Greek letter mu,
μ, and its standard deviation is denoted by sigma, σ. Where have you seen this before? (Chapter 1.3)
Generally, the mean of a random variable X is written μ_X,
pronounced "mu sub x"; the notation applies to any random variable X,
not just normal ones. What would the
symbol μ_Y mean to you?
4. The mean of a discrete random
variable: The Expected Value
A discrete
random variable has a countable set of possible values (in this class, usually a finite one).
Recall that some outcomes are naturally discrete: you can't roll a 3.1,
you can't have 351.7 employees at a firm, and you can't have 3.91 customer
complaints today. Discrete variables
jump from one value to the next.
So
one can "list" the outcomes, known as the possible values
that a random variable X can take, and
one can calculate the probability of each outcome.
Discrete
Random variables have probability distributions -- they are just a way of
organizing outcomes and representing them graphically. A table or a graph might suffice. There are only 2 requirements:
1) probabilities must be greater than or equal to zero
2) the sum of the probabilities must be 1.
Example: What are the possible outcomes for the
market close over a consecutive 3 day period?
Let's let random variable X represent the number of days observed when
the market closed above its previous day's high. Let's suppose the probability of
an "up" close is .6 and "not up" is .4 on each day, independently of the other days.
If we table the
outcomes and calculate their probabilities:
X    | 0    | 1    | 2    | 3
p(X) | .064 | .288 | .432 | .216
Note that .064 + .288 + .432 + .216 = 1.0 and remember (.4^3 = .064, .6^3 = .216, 3*(.4^2*.6) = .288, and 3*(.6^2*.4) = .432)
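The probabilities in the table can be reproduced with a short sketch. It assumes, as the arithmetic above does, that each day is "up" with probability .6 independently of the other days, so that the count of up days follows the usual counting formula C(3, k) * .6^k * .4^(3-k). The variable names are illustrative only.

```python
from math import comb

p_up = 0.6   # probability of an "up" close on any one day (from the example)
n_days = 3   # length of the trading period

# P(X = k) = C(n, k) * p^k * (1 - p)^(n - k), assuming independent days
dist = {
    k: comb(n_days, k) * p_up**k * (1 - p_up) ** (n_days - k)
    for k in range(n_days + 1)
}

for k, p in dist.items():
    print(k, round(p, 3))

# The two requirements for a discrete probability distribution
all_nonnegative = all(p >= 0 for p in dist.values())
sums_to_one = abs(sum(dist.values()) - 1.0) < 1e-12
print(all_nonnegative, sums_to_one)
```

The factor `comb(3, k)` counts the orderings: for example, there are 3 ways to place exactly one up day among three days, which is where the 3 in 3*(.4^2*.6) comes from.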
What would the mean of this random variable
be? We know that the market can behave
in this manner in any 3 consecutive trading days, but what is "most likely
to happen?"
To find the mean of this probability
distribution, or the mean of this random variable X, multiply each possible
outcome by its probability and add up the products:
The formula (see page 327): μ_X = x_1 p_1 + x_2 p_2 + x_3 p_3 + … + x_n p_n
So for the example above, μ_X is (0*.064) + (1*.288) + (2*.432) + (3*.216) = 1.8
So in any 3 day trading period, you expect to
see a little less than 2 of them closing "up". Since this is a discrete random variable, no single
3 day period can have exactly 1.8 up days; 1.8 is the long-run average over many such periods.
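The expected value calculation above is a probability-weighted sum, which can be sketched in a few lines. The values and probabilities are taken from the market close table; the variable names are illustrative.

```python
# Possible values of X (number of "up" closes in 3 days) and their probabilities
values = [0, 1, 2, 3]
probs = [0.064, 0.288, 0.432, 0.216]

# Mean of a discrete random variable: multiply each value by its probability, then add
mean_x = sum(x * p for x, p in zip(values, probs))

print(round(mean_x, 2))
```

The result matches the 1.8 computed by hand. Note that 1.8 is not itself a possible value of X; it is the balance point of the distribution.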
5. The mean of a continuous random variable
You usually need some calculus to calculate the
mean for a continuous random variable unless it comes from a very simple
symmetric distribution, such as a uniform distribution (it looks like a
brick). In this class, it is generally
given to you as the mean of a normal distribution. Remember, the normal distribution is a continuous probability
distribution and a normal random variable is a continuous random variable. So in the examples on the SUVs and the
mutual funds above, you would be expected to make statements about the
distribution based on information given about the mean and standard deviation
of the variable.
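To give a feel for the calculus involved, the mean of a continuous random variable is the integral of x times the density. The sketch below approximates that integral numerically for a uniform ("brick") distribution. The choice of a uniform density on 0 to 300 ml is a hypothetical assumption made only for illustration (the lecture does not say fill amounts are uniform), and `continuous_mean` is a helper invented for this sketch.

```python
def continuous_mean(density, lo, hi, steps=100_000):
    """Approximate the integral of x * density(x) over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width       # midpoint of the i-th slice
        total += x * density(x) * width  # value times (approximate) probability of the slice
    return total

# Hypothetical uniform density on [0, 300]: constant height 1 / (hi - lo)
lo, hi = 0.0, 300.0

def uniform_density(x):
    return 1.0 / (hi - lo)

mean_fill = continuous_mean(uniform_density, lo, hi)
print(round(mean_fill, 3))
```

For a symmetric distribution like this one, the answer is simply the midpoint of the interval, (0 + 300) / 2 = 150, which is why no calculus is needed in that special case.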
6. The Law of Large Numbers
The law was developed by Jacob
Bernoulli, a Swiss mathematician. He
wrote "In any chance event, when the event happens repeatedly, the
statistics will tend to prove the probabilities."
Most people, when they
hear the terms probabilities and statistics, want to run away. It's not that
bad. Probabilities, in his mind, are simply theoretical results, or in this
class, PARAMETERS. Statistics are nothing more than actual results coming from
our samples. Inserting the definitions, we have:
"In
any chance event, when the event happens repeatedly, the majority of sample
outcomes will tend to be near the theoretical parameter."
This is basically
common sense converted to mathematics. So if an experiment (like a sample, or
like a coin toss, or a roll of a die) is performed repeatedly under identical
conditions, the relative frequency of an event that occurs approaches its
probability of occurrence with increasing accuracy as the number of trials (or
the sample size) becomes “large.” For example, the experiment could be a coin
toss, and the event is the occurrence of a head. The experiment could be observing the close of the Dow Jones
Industrial Average and noting the frequency with which it closes "up".
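The coin toss version of the law can be illustrated with a small simulation. This is only a sketch: it assumes a fair coin (probability of a head = 0.5) and uses Python's pseudo-random number generator as a stand-in for real tosses. The seed value and the checkpoints are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable; the value itself is arbitrary

heads = 0
tosses = 0
running = {}  # number of tosses -> relative frequency of heads so far

for n in (10, 100, 1_000, 10_000, 100_000):
    while tosses < n:
        heads += random.random() < 0.5  # True counts as 1 head, False as 0
        tosses += 1
    running[n] = heads / n
    print(n, running[n])
```

With few tosses the relative frequency can sit noticeably away from 0.5; as the number of tosses grows, it tends to settle closer and closer to the theoretical probability.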
7. Rules and properties of the mean of a random variable
Rule 1. μ_{a+bX} = a + b(μ_X)
If a and b are constants, then the mean of a constant plus a random variable
is the constant plus the mean of the random variable, and the mean of a constant times a
random variable is the constant times the mean of the random variable.
Example: You sell real estate and this is your chance of selling a
certain number of homes in a given week:
Homes Sold  | 0  | 1  | 2  | 3  | 4
Probability | .1 | .5 | .3 | .1 | .0
The expected number sold (mean) is 1.4 homes. We get that from (0*.1) + (1*.5) + (2*.3) + (3*.1) + (4*.0). If you
must pay $2500/week to the firm regardless of what you sell and you get $10,000
for each home sold, your expected net earnings are (-2500) + 10000(1.4), or
$11,500.
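Rule 1 can be checked against this example in two ways: apply the rule to the mean number of homes, or compute the mean of the net earnings directly from the table. The sketch below does both; the numbers come from the example and the variable names are illustrative.

```python
# Weekly sales distribution from the table
homes = [0, 1, 2, 3, 4]
probs = [0.1, 0.5, 0.3, 0.1, 0.0]

a = -2500   # constant: weekly payment to the firm
b = 10_000  # constant: earnings per home sold

# Mean number of homes sold
mean_homes = sum(x * p for x, p in zip(homes, probs))

# Rule 1: the mean of a + bX equals a + b * (mean of X)
earnings_by_rule = a + b * mean_homes

# Direct check: turn each outcome into net earnings first, then take the mean
earnings_direct = sum((a + b * x) * p for x, p in zip(homes, probs))

print(round(mean_homes, 2), round(earnings_by_rule, 2), round(earnings_direct, 2))
```

Both routes give the same expected net earnings, which is the point of the rule: there is no need to rebuild the whole distribution of earnings.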
Rule 2. Given random variables X and Y, μ_{X+Y} = μ_X + μ_Y.
You can add the means of two different random variables together.
You fall in love and your partner also learns to sell real estate. Your
partner's chance of selling a certain number of homes in a given week is:
Homes Sold  | 0  | 1  | 2  | 3  | 4
Probability | .1 | .1 | .5 | .2 | .1
Your partner's expected number sold is 2.1 homes per week. Your partner does not pay anything to
the firm but only gets $4000 for each home sold. What are your combined expected weekly earnings?
11,500 + (2.1*4,000) or $19,900
before taxes of course.
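The combined figure can be reproduced by applying Rule 1 to each person and then Rule 2 to the sum. This is a minimal sketch using the two tables from the example; the helper `mean` and the variable names are invented for illustration.

```python
def mean(values, probs):
    """Mean of a discrete random variable: sum of value * probability."""
    return sum(x * p for x, p in zip(values, probs))

homes = [0, 1, 2, 3, 4]
your_probs = [0.1, 0.5, 0.3, 0.1, 0.0]     # your weekly sales distribution
partner_probs = [0.1, 0.1, 0.5, 0.2, 0.1]  # your partner's weekly sales distribution

# Rule 1 applied to each person's earnings
your_earnings = -2500 + 10_000 * mean(homes, your_probs)
partner_earnings = 4_000 * mean(homes, partner_probs)

# Rule 2: the mean of a sum is the sum of the means
combined = your_earnings + partner_earnings

print(round(your_earnings, 2), round(partner_earnings, 2), round(combined, 2))
```

Rule 2 holds whether or not the two sets of sales are independent, so the sum is valid even if a good week for one of you tends to be a good week for the other.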