Statistics 10
Lecture 12


CHANCE ERRORS IN SAMPLING (Chapter 20)

A. Overview

Sample Surveys always involve chance error.

Think back to the concepts of POPULATION and of SAMPLES. We're interested in the PARAMETER, but given resource constraints, we must settle for a STATISTIC.

The difference between the PARAMETER and the STATISTIC is chance error.

Some references are made here to Chapter 16.1. You can read that if you wish, but you can manage this section without it.

B. Sampling Again (Chapter 20.1)

C. Sample Size and Standard Error

a. Idea

If we increase the size of the sample (assuming it is representative) we get a better "fix" on the PARAMETER. In Figure 2 (p. 358), 250 samples of size 400 are drawn, and the percentage of men in a sample now ranges from a low of 39% to a high of 54%.
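The effect of sample size can be tried out with a small simulation. This is only a sketch, not the procedure behind Figure 2: it assumes a population that is 46% men (the figure used in the examples of this lecture), treats each draw as independent because the population is large, and uses a made-up function name and an arbitrary random seed.

```python
import random

def sample_percentages(num_samples=250, sample_size=400, share_men=0.46, seed=0):
    """Return the percentage of men in each of num_samples random samples.

    Assumption: each draw is independent (sampling with replacement),
    which is a reasonable stand-in when the population is large.
    """
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    percentages = []
    for _ in range(num_samples):
        men = sum(1 for _ in range(sample_size) if rng.random() < share_men)
        percentages.append(100 * men / sample_size)
    return percentages

percents_100 = sample_percentages(sample_size=100)
percents_400 = sample_percentages(sample_size=400)
print("samples of 100:", min(percents_100), "to", max(percents_100))
print("samples of 400:", min(percents_400), "to", max(percents_400))
```

The exact low and high values depend on the seed, but the samples of size 400 should cluster more tightly around 46% than the samples of size 100.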

b. Equation

percentage in a sample = percentage in the population + chance error

The expected value of the sample percentage = percentage in the population. The sample percentage will be off from it by a chance error.

As long as you have a sample and not the population, you are almost certain to run into chance error.

c. Chance Error and the Standard Error

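How big is the chance error likely to be? The STANDARD ERROR tells you this.

Standard Error for a number = Square Root of Sample Size x Standard Deviation of "the box".

For a box of 0's and 1's, Standard Deviation of "the box" = Square Root (fraction of 1's x fraction of 0's).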

Example 1: for a sample of 100 from a population that is 46% men, the standard error for the number of men is
Square Root 100 x Square Root ( .46 x .54) =
10 x .5 (approximately) = 5
SE for a percentage = (SE for a number / size of sample) x 100 = (5 / 100) x 100 = 5%

Example 2: for a sample of 400 from the same population, the standard error for the number of men is
Square Root 400 x Square Root ( .46 x .54) =
20 x .5 (approximately) = 10
SE for a percentage = (SE for a number / size of sample) x 100 = (10 / 400) x 100 = 2.5%

Note the relationship between the SE for a number and the SE for a percentage. As the sample size increases, the SE for a number increases (look at the formula) but the SE for a percentage decreases.
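The two standard errors can be computed side by side to see them move in opposite directions. The sketch below assumes a 0-1 box in which 46% of the tickets are 1's, as in Examples 1 and 2; the function names are made up for illustration.

```python
import math

def se_for_number(sample_size, fraction_ones):
    """SE for a number = square root of sample size x SD of the 0-1 box."""
    sd_box = math.sqrt(fraction_ones * (1 - fraction_ones))
    return math.sqrt(sample_size) * sd_box

def se_for_percentage(sample_size, fraction_ones):
    """SE for a percentage = (SE for a number / sample size) x 100."""
    return se_for_number(sample_size, fraction_ones) / sample_size * 100

for n in (100, 400):
    print(n, round(se_for_number(n, 0.46), 2), round(se_for_percentage(n, 0.46), 2))
```

The printed values come out slightly below 5 and 10 (and 5% and 2.5%) because the code does not round the SD of the box up to .5.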

Example 3: Problem 2, Exercise Set A

25,000 students, 10,000 are older than 25. A sample of 400 students is drawn.
It's like drawing 400 times from a box with 10,000 1's and 15,000 0's.
Find the expected value = number of "draws" x average of the box
400 x .4 = 160
Find the standard error of the number of students in the sample
SD of box = Square Root ( .4 x .6), which is about .5
standard error of the number = square root of sample size x SD of box = 20 x .5 = 10
Find the standard error of the percentage of students
standard error of the percentage = (10 / 400) x 100 = 2.5%

The percentage of students in the sample who are older than 25 will be around 40%, give or take 2.5%.
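The steps of Example 3 can be retraced starting from the counts of tickets in the box. This is a minimal sketch; the function name and its return order are assumptions made for illustration.

```python
import math

def box_summary(num_ones, num_zeros, draws):
    """Expected number of 1's, SE for the number, and SE for the percentage."""
    average = num_ones / (num_ones + num_zeros)   # average of the box
    sd_box = math.sqrt(average * (1 - average))   # SD of a 0-1 box
    expected_number = draws * average
    se_number = math.sqrt(draws) * sd_box
    se_percent = se_number / draws * 100
    return expected_number, se_number, se_percent

# Box from Example 3: 10,000 1's (older than 25) and 15,000 0's, 400 draws.
expected, se_number, se_percent = box_summary(10_000, 15_000, 400)
print(round(expected), round(se_number, 1), round(se_percent, 2))
```

Without rounding the SD of the box to .5, the standard errors come out a little under 10 students and 2.5%.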

D. Interpretation and the Normal Curve Again (20.3)

How do we work with:

"The percentage of students in the sample who are older than 25 will be around 40% give or take 2.5%"

We can convert these to standard units (Z scores) as in Chapter 5.

One standard error in this example is 2.5%. So +1 standard error is 40% + 2.5% = 42.5%, and -1 standard error is 40% - 2.5% = 37.5%.

The chance that between 37.5% and 42.5% of any given sample of 400 students will be older than 25 is about 68%.

We can move from using the normal curve to figure percentages to using the normal curve to make statements about chances.

The chance that between 35% and 45% (plus or minus 2 standard errors) of any given sample of 400 students will be older than 25 is about 95%.

And 99%?
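The chances quoted in this section can be checked against the normal curve directly. The sketch below uses `NormalDist` from Python's standard `statistics` module, with the expected value of 40% and standard error of 2.5% from the student example; the function name is made up for illustration.

```python
from statistics import NormalDist

def chance_between(low, high, expected=40.0, se=2.5):
    """Chance (as a fraction) that the sample percentage falls between low and high,
    using the normal approximation with the given expected value and SE."""
    curve = NormalDist(mu=expected, sigma=se)
    return curve.cdf(high) - curve.cdf(low)

print(round(100 * chance_between(37.5, 42.5)))  # within 1 SE: 68
print(round(100 * chance_between(35.0, 45.0)))  # within 2 SEs: 95
```

The same function can be used to try wider or narrower intervals around 40%.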

E. Correcting for Sampling without Replacement

Recall that a Simple Random Sample (SRS) is sampling without replacement.

To make a long story short, when populations are large, it doesn't really matter that one is sampling without replacement. But sample size does matter and it does affect the accuracy of the estimate.

Still, there is a correction factor. Multiply the standard error computed as if sampling with replacement by

Square root ( (population size - sample size) / (population size - one) )

It really matters only when the sample is a substantial fraction of the population. Once a sample is only 1% of the population, the correction is negligible.
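To see how little the correction matters for small samples, the factor can be computed for a few sample sizes. In this sketch the population of 25,000 and the sample of 400 come from Example 3; the other two sample sizes (250, which is 1% of the population, and 12,500, which is half of it) are hypothetical values chosen for comparison, and the function name is made up.

```python
import math

def correction_factor(population_size, sample_size):
    """Square root of (population size - sample size) / (population size - 1)."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

for n in (250, 400, 12_500):
    print(n, round(correction_factor(25_000, n), 3))
```

A factor close to 1 leaves the standard error almost unchanged, while a factor well below 1 shrinks it noticeably.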

F. Homework



Last Update: 1 November 1998 by VXL