This week we're going to explore the differences between
samples and populations. To get the most out of this lab,
you should read 4.3 (on random variables) and 1.3 (density curves and
the Normal distribution) in the book. Do this before beginning the
lab.
We can think of the computer as our population. Excel has a number of features that can be used to simulate different random variables. At the end of the last lab, you were asked to look at the Normal random variable. We're going to focus on this distribution for this lab. We'll think of the computer as having access to our population and imagine we're studying a variable that has a normal distribution. This means that if we could see every value our random variable takes on, we could describe their distribution with a normal curve.
To warm up, take a sample of size 25 from a normal population
with mean 0 and SD 1. To do this, use the Random Number Generation
tool, which can be found under "Tools: Data Analysis" on the menu bar.
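If you'd like to check your work outside Excel, the same warm-up can be sketched in Python. This is only a rough equivalent of the Random Number Generation tool, not part of the lab procedure; the seed value and the name `sample` are assumptions made for illustration.

```python
import random
import statistics

random.seed(1)  # assumed seed, used only so the run can be repeated

# Draw 25 observations from a normal population with mean 0 and SD 1.
sample = [random.gauss(0, 1) for _ in range(25)]

print("n       =", len(sample))
print("average =", round(statistics.mean(sample), 3))
print("SD      =", round(statistics.stdev(sample), 3))  # sample SD
```

Changing the seed gives a different sample of 25, just as running the Excel tool a second time would.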
These 25 observations represent a sample from the population. Through the use of appropriate descriptive statistics, explain how they look and do not look like a normal distribution.
This is one of the interesting things about Statistics: even though we know these data come from a normal distribution, they don't necessarily look that way. We'll leave it to more advanced courses to determine when a sample probably came from a Normal distribution.
Let's compare your sample with the normal distribution.
First, sort your 25 observations. (Select the column of numbers
and choose "sort" from the Tools menu.) You can now easily find the
quartiles without really having to use the computer. (Or, go ahead
and use Excel if you wish.) Remember that the first quartile, Q1,
has approximately 25% of the observations below it. In the population,
what number has 25% of the population below it? (Assume
the population follows a normal distribution with mean 0 and SD 1.) To
find this out, we use the Function Wizard:
Highlight the cell on your spreadsheet next to your Q1.
On the menu bar, click on the function button (a script "f" with an x as a subscript).
Under "function category" select "Statistical". Under "function name" select "Norminv".
A window will open up and invite you to type in a probability. Type in 25% for probability, 0 for mean, and 1 for SD.
The result is the number in the normal population that has 25% below it. This is the 25th percentile. Is it close to your Q1?
Do the same for the other quartiles. How do they compare?
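The quartile comparison can also be sketched in Python. Here `NormalDist.inv_cdf` from the standard library plays the role of Norminv. The seed and variable names are assumptions, and `statistics.quantiles` uses its own interpolation rule, so the sample quartiles may differ slightly from what you get by hand or in Excel.

```python
import random
import statistics
from statistics import NormalDist

random.seed(1)  # assumed seed, used only so the run can be repeated
sample = sorted(random.gauss(0, 1) for _ in range(25))

# Sample quartiles: Q1, median, Q3 (three cut points for n=4).
sample_q = statistics.quantiles(sample, n=4)

# Population quartiles, the analogue of Norminv(probability, 0, 1).
population = NormalDist(mu=0, sigma=1)
population_q = [population.inv_cdf(p) for p in (0.25, 0.50, 0.75)]

for name, s, p in zip(("Q1", "median", "Q3"), sample_q, population_q):
    print(f"{name:>6}: sample {s:7.3f}   population {p:7.3f}")
```

The population Q1 comes out near -0.674, which is the value Norminv should also return for a probability of 25%.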
The Norminv function accepts a probability and tells us what value has that area below it on the normal curve. But we can also go backwards: given a value, what percent of the normal curve is below it? Put differently, if I were to reach into the population at random, what's the probability of getting a number less than or equal to x?
To see how to do this, let's find the probability of selecting
a number from the population that's less than 1.0.
Highlight a cell in which you want the answer to appear.
Select the function button.
Under "function name", select Normdist
For "x" type 1.0
For "mean" type 0
For "SD" type 1
For "Cumulative" type True
Hit Enter.
The result is a probability. You should get about 0.84. This means that about 84% of the time, you will select a number less than 1.0. You've already selected 25 numbers from this population. What percent of them are less than 1.0? How does this compare to 0.84?
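As a cross-check, the same probability can be computed in Python, where `NormalDist.cdf` plays the role of Normdist with "Cumulative" set to True. This is a sketch under assumptions: the seed, the sample, and the names `p_below_1` and `share_below_1` are illustrative and not part of the lab.

```python
import random
from statistics import NormalDist

# Probability that a value drawn from the population is less than 1.0.
population = NormalDist(mu=0, sigma=1)
p_below_1 = population.cdf(1.0)
print(round(p_below_1, 4))  # 0.8413

# Share of one simulated sample of 25 that falls below 1.0.
random.seed(1)  # assumed seed, used only so the run can be repeated
sample = [random.gauss(0, 1) for _ in range(25)]
share_below_1 = sum(x < 1.0 for x in sample) / len(sample)
print(share_below_1)
```

The sample share will usually land near 0.84 but rarely match it exactly, since 25 observations can only produce multiples of 0.04.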
Now, generate five different sets of 25 observations
from a normal distribution with mean 0 and SD 1. Sort each set.
For each set, calculate
a) the average, b) the SD, c) the percentage of the
observations less than 1.0. Put these in a table so that you can
compare them.
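If you want to see what such a table might look like before building it in Excel, here is a minimal Python sketch. The seed, the column layout, and the name `rows` are assumptions; your own numbers will differ because each run draws new samples.

```python
import random
import statistics

random.seed(2)  # assumed seed, used only so the run can be repeated

rows = []
for set_number in range(1, 6):
    # One sorted set of 25 observations from a normal population (mean 0, SD 1).
    obs = sorted(random.gauss(0, 1) for _ in range(25))
    average = statistics.mean(obs)
    sd = statistics.stdev(obs)  # sample SD
    pct_below_1 = 100 * sum(x < 1.0 for x in obs) / len(obs)
    rows.append((set_number, average, sd, pct_below_1))

print("set  average     SD  % < 1.0")
for set_number, average, sd, pct_below_1 in rows:
    print(f"{set_number:>3}  {average:7.3f}  {sd:5.3f}  {pct_below_1:7.1f}")
```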
1) What's the biggest average you got? What's the smallest?
2) Of all five sets, what's the biggest observation you got?
What's the smallest? What's the probability of getting a number
from the population that's smaller than the biggest one you got?
3) Which had more variability: the observations
on the lists, or the averages of the lists?