Suppose we draw a simple random sample (SRS) of size n from a large population. Call the observed values X1, X2, ..., Xn.
For example, draw a simple random sample of 100 American women from the population of American women and measure their heights.
Suppose the population has a mean of _mu_ and a standard deviation of _sigma_. Then each Xi has mean _mu_ and standard deviation _sigma_; in other words, each Xi has the same distribution as the population.
In the example, a single Xi (a height) is a measurement on one unit (a woman) selected at random from the population, and therefore Xi has the probability distribution of the population.
a. Define X-bar = (X1 + X2 + ... + Xn)/n. X-bar can be thought of as the mean of one sample selected at random from all possible samples of size n from the population.
b. It is easily shown that the expected value of X-bar is _mu_, the mean of the population. In other words, the sample means will, on average, equal the population mean: X-bar is an unbiased estimator of _mu_.
c. It is also easily shown that the standard deviation of x-bar is sigma/sqrt(n), where sigma is the standard deviation of the population. Thus, the standard deviation of the SAMPLE MEAN will be smaller than the standard deviation for individual measurements: it's easier to predict the average than it is to predict a single measurement.
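The two "easily shown" results in (b) and (c) can be sketched as follows. The sketch assumes the Xi are independent, which is a reasonable approximation when the population is much larger than the sample; that assumption is added here for the derivation and is not spelled out above.

```latex
\begin{align*}
E[\bar{X}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}, \\
\operatorname{SD}(\bar{X}) &= \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```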
Consider a population consisting of the elements 1, 2, 3, ..., 997, 998, 999, 1000. Then _mu_ = 500.5 and _sigma_ = 288.67.
Draw a simple random sample of size 5 from the population. One such sample is 164, 582, 850, 892, 433. Then X-bar = 584.2. Note that X-bar is not exactly equal to _mu_, but it is reasonably close.
Draw a new sample: 286, 224, 344, 995, 491. Now X-bar = 468.0. Note that this X-bar is different from the previous one. X-bar is a random variable: its value depends on which sample happens to be drawn.
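The numbers in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
import statistics

# Population from the example: the integers 1 through 1000.
population = list(range(1, 1001))

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population SD (divides by N, not N - 1)

# The two samples of size 5 quoted in the text.
sample_1 = [164, 582, 850, 892, 433]
sample_2 = [286, 224, 344, 995, 491]

xbar_1 = statistics.fmean(sample_1)
xbar_2 = statistics.fmean(sample_2)

print(f"mu = {mu}, sigma = {sigma:.2f}")
print(f"x-bar (sample 1) = {xbar_1:.1f}, x-bar (sample 2) = {xbar_2:.1f}")
```

Note that `pstdev` divides by N and gives about 288.67 for this population, while `statistics.stdev`, which divides by N - 1, gives about 288.82.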
A natural question to ask is how close X-bar will be to mu ... how accurate will our guesses be?
Given a simple random sample of size n from a population having mean _mu_ and standard deviation _sigma_, the sample mean X-bar will come from a distribution with mean _mu_ and standard deviation _sigma_/sqrt(n).
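One way to see this result in action is to draw many samples and look at the spread of the resulting sample means. The sketch below does so for the population 1, 2, ..., 1000 with n = 5. The seed and the number of repetitions are arbitrary choices made for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

population = list(range(1, 1001))
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 5                 # sample size
num_samples = 20_000  # number of repeated samples (arbitrary choice)

# random.sample draws without replacement, i.e. a simple random sample.
xbars = [statistics.fmean(random.sample(population, n))
         for _ in range(num_samples)]

mean_of_xbars = statistics.fmean(xbars)
sd_of_xbars = statistics.pstdev(xbars)
predicted_sd = sigma / math.sqrt(n)

print(f"mean of x-bars = {mean_of_xbars:.1f}  (mu = {mu})")
print(f"SD of x-bars   = {sd_of_xbars:.1f}  (sigma/sqrt(n) = {predicted_sd:.1f})")
```

Because the population (1000 elements) is much larger than the sample (5), drawing without replacement makes almost no difference to the sigma/sqrt(n) formula.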
If the original population has a normal distribution, then the sample mean will also be normally distributed.
Example. IQ scores are normally distributed with a mean of 100 and a standard deviation of 16. A sample of 25 persons is drawn. How likely is it to get a sample average of 108 or more? How likely is it for the first score to be 108 or more? (The sample average has standard deviation 16/sqrt(25) = 3.2, so z = (108-100)/3.2 = 2.5 and the chance is about 0.6 of 1%. For a single score, z = (108-100)/16 = 0.5 and the chance is about 31%.)
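The two probabilities in the IQ example can be computed directly. The following is a small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 16, 25

# Distribution of the sample mean: same mean, SD shrunk by sqrt(n).
sample_mean_dist = NormalDist(mu, sigma / math.sqrt(n))
# Distribution of a single score: the population distribution itself.
single_score_dist = NormalDist(mu, sigma)

p_mean = 1 - sample_mean_dist.cdf(108)     # P(sample average >= 108)
p_single = 1 - single_score_dist.cdf(108)  # P(first score >= 108)

print(f"P(sample average >= 108) = {p_mean:.4f}")
print(f"P(single score >= 108)   = {p_single:.4f}")
```

The only difference between the two calculations is the standard deviation: 3.2 for the average of 25 scores versus 16 for one score.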
No matter what the distribution of the original population, if the sample size is "large", the distribution of the possible sample means will be close to the normal distribution.
Take a simple random sample from a population with mean _mu_ and standard deviation _sigma_. Let x-bar be the average of the sample values. If either
(a) the original population is normally distributed, or
(b) the sample size _n_ is sufficiently large,
then x-bar will be (at least approximately) normally distributed with expected value _mu_ and standard deviation _sigma_/sqrt(n).
Intuitively, if the histogram for the population follows a normal curve, or if the sample size is large enough each time, then the histogram for the possible values of x-bar will follow a normal curve that has a mean of _mu_ and a standard deviation of _sigma_/sqrt(n). Thus, about 68% of the x-bars will be within one standard deviation (_sigma_/sqrt(n)) of _mu_, about 95% of the x-bars will be within two standard deviations, etc.
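Case (b) can be illustrated with a population that is clearly not normal. The sketch below assumes, purely for illustration, an exponential population with mean 1 and standard deviation 1 (strongly skewed to the right) and a sample size of n = 50. It then checks how many sample means land within one and two standard deviations of _mu_. The population, n, seed, and number of repetitions are all choices made here, not part of the notes.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma = 1.0, 1.0  # mean and SD of an exponential distribution with rate 1
n = 50                # sample size (assumed "large enough" for this sketch)
num_samples = 10_000  # number of repeated samples (arbitrary choice)
sd_xbar = sigma / math.sqrt(n)


def sample_mean():
    """Average of n independent draws from the skewed population."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n


xbars = [sample_mean() for _ in range(num_samples)]

# Fraction of sample means within one and two SDs of mu.
within_1 = sum(abs(x - mu) <= sd_xbar for x in xbars) / num_samples
within_2 = sum(abs(x - mu) <= 2 * sd_xbar for x in xbars) / num_samples

print(f"within 1 SD: {within_1:.3f}  (normal curve: about 0.68)")
print(f"within 2 SD: {within_2:.3f}  (normal curve: about 0.95)")
```

Rerunning with a much smaller n, such as 2, should show the fractions drifting away from 68% and 95%, since the sample means then still carry much of the population's skew.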
The Central Limit Theorem only applies to the distribution of possible sample averages; it says nothing about the distribution of individual scores in either the sample or the population.
A manufacturer claims his light bulbs last an average of 1200 hours with a standard deviation of 1000 hours. A random sample of 200 light bulbs is drawn and tested. If the manufacturer is correct, how likely is it to get a sample average of 800 hours or less? The standard deviation of the sample average is 1000/sqrt(200) = 70.7 hours, so an average of 800 corresponds to z = (800-1200)/70.7 = -5.66. The chance of getting an average of 800 or less is therefore about 0%.
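The light bulb calculation can be reproduced as follows. This sketch relies on the normal approximation for the sample average, which the sample size of 200 is taken to justify. The variable names are illustrative.

```python
import math
from statistics import NormalDist

claimed_mean, claimed_sd, n = 1200, 1000, 200
observed_mean = 800

# Standard deviation of the sample average under the manufacturer's claim.
sd_of_average = claimed_sd / math.sqrt(n)

# Standardize the observed average and look up the lower-tail probability.
z = (observed_mean - claimed_mean) / sd_of_average
p = NormalDist().cdf(z)  # P(sample average <= 800)

print(f"SD of sample average = {sd_of_average:.1f} hours")
print(f"z = {z:.2f}")
print(f"P(average <= 800) = {p:.2e}")
```

The resulting probability is far below one in a million, which is what "about 0%" means here.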
Last Update: 4 November 1996 by VXL