Statistics 10, Lecture 4: Numerical Summaries
1. Measuring the Center or Central Tendency of a variable
Pictures are nice, but people also find numerical summaries useful. Chapter 4 deals
with simple but important measurements. Quantitative variables have two important
properties: a center, or typical value, and the spread around that value.
2. THE MEAN (or AVERAGE)
Much of the time we use what is known as the "arithmetic average."
Calculate it by summing all of the values in a list of data and dividing by the
number of values in that list. The statistical term for this commonly used and
widely understood statistic is the mean.
3. Calculating a mean
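The mean is written $\bar{x}$ (read as "x bar") and it is computed as follows:
given a list of n numbers, x1, x2, ... , xn, the mean is calculated as

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$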
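As a minimal sketch of this calculation, the Python snippet below sums a list and divides by its length. The function name `mean` and the sample list `scores` are illustrative assumptions, not part of the lecture.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


# Hypothetical data, chosen only to illustrate the formula.
scores = [2, 4, 4, 5, 10]
x_bar = mean(scores)   # (2 + 4 + 4 + 5 + 10) / 5
print(x_bar)           # 5.0
```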
4. THE MEDIAN (another measure)
This is an alternative measure of "center." It is the midpoint of the data. Why
an alternative? Each measure has its advantages and disadvantages when used to
summarize data (see below).
5. Calculating a median
Given a list of n numbers x1, x2, ... , xn, sort all the numbers from smallest
to largest and then pick the middle number from the list. If the list has an
even number of elements, take the average of the two middle numbers.
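The procedure just described could be sketched in Python as follows. The function name `median` and the two sample lists are assumptions made for illustration; one list has an odd length and the other an even length.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    if not values:
        raise ValueError("the median of an empty list is undefined")
    ordered = sorted(values)      # smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd n: a single middle number
        return ordered[mid]
    # even n: average the two middle numbers
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data for illustration.
print(median([7, 1, 3]))       # sorted: 1, 3, 7      -> 3
print(median([7, 1, 3, 10]))   # sorted: 1, 3, 7, 10  -> 5.0
```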
6. Properties of the mean and of the median
If you could place a histogram on a weightless bar and the bar on a fulcrum,
the histogram would balance perfectly when the fulcrum is directly under the
mean. By contrast, the median is the value where 50% of the area of the
histogram lies above and 50% lies below this value.
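One way to check both properties numerically, as a rough sketch: the balance point corresponds to the deviations from the mean summing to zero, and the median splits the observations into equal halves. The list `data` is made up for illustration, and `statistics.median` comes from the Python standard library.

```python
import statistics

# Hypothetical, right-skewed data set.
data = [1, 2, 3, 4, 10]

x_bar = sum(data) / len(data)                 # 20 / 5 = 4.0
deviations = [x - x_bar for x in data]        # -3, -2, -1, 0, 6
total_deviation = sum(deviations)             # left and right sides cancel

med = statistics.median(data)                 # middle of 1, 2, 3, 4, 10
below = sum(1 for x in data if x < med)       # observations below the median
above = sum(1 for x in data if x > med)       # observations above the median

print(x_bar, total_deviation)   # 4.0 0.0
print(med, below, above)        # 3 2 2
```

The mean (4.0) sits to the right of the median (3) here because the single large value pulls the balance point toward it.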
If the histogram is symmetric, the mean and the median are
the same. If the histogram is not symmetric, the mean and median can be quite
different. Take a data set whose histogram is symmetric. Balance it on the
fulcrum. Now take the largest observation and start moving it to the right. The
fulcrum, and with it the mean, must also move to the right if the histogram is
to stay balanced.
You can distort the mean with an outlier, but all this time
the median stays the same! (See p. 63.)
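The thought experiment of sliding the largest observation to the right can be sketched in a few lines of Python. The starting values and the replacement values 50 and 500 are assumptions chosen for illustration; `statistics.median` is from the standard library.

```python
import statistics

# Start from a symmetric, hypothetical data set and move only the largest value.
results = []
for largest in (5, 50, 500):
    data = [1, 2, 3, 4, largest]
    mean_value = sum(data) / len(data)
    median_value = statistics.median(data)
    results.append((largest, mean_value, median_value))
    print(largest, mean_value, median_value)

# Printed rows (largest, mean, median):
# 5 3.0 3
# 50 12.0 3
# 500 102.0 3
```

The mean follows the moving observation while the median stays at 3, because the middle of the sorted list does not change.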
For an interactive example see: http://www.ruf.rice.edu/~lane/stat_sim/descriptive/index.html