Statistics 10              Lecture 4          Numerical Summaries

1.      Measuring the Center or Central Tendency of a variable
Pictures are nice, but people also find numerical summaries useful.  Chapter 4 deals with simple but important measurements. Quantitative variables have two important properties: a center or typical value and the spread around that value. 

2.      THE MEAN (or AVERAGE)

Much of the time we use what is known as the "arithmetic average." Calculate it by summing all of the values in a list of data and dividing by the number of values in that list. The statistical term for this commonly used and widely understood statistic is the mean.

3.      Calculating a mean

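The mean is denoted $\bar{x}$ (read as "x bar") and it is computed as follows: given a list of n numbers, x1, x2, ... , xn, the mean is calculated as

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$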
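The calculation can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the list `scores` are made up for this example and are not from the textbook.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)

scores = [2, 4, 6, 8]   # made-up data for illustration
print(mean(scores))     # (2 + 4 + 6 + 8) / 4 = 5.0
```

Python's standard library also offers `statistics.mean`, which does the same job.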

4.      THE MEDIAN (another measure of center)

This is an alternative measure of "center."  It is the midpoint of the sorted data: half of the values lie at or below it and half lie at or above it.  Why an alternative?  Each measure has its advantages and disadvantages when used to summarize data (see below).

5.      Calculating a median
Given a list of n numbers x1, x2, ... , xn, sort all the numbers from smallest to largest and then pick the middle number from the list.  If the list has an even number of elements, take the average of the two middle numbers.
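The sort-then-pick-the-middle recipe might look like this in Python. It is a minimal sketch: the function name `median` and the sample lists are made up for illustration.

```python
def median(values):
    """Middle value of the sorted list; average of the two middle values if n is even."""
    if not values:
        raise ValueError("the median of an empty list is undefined")
    ordered = sorted(values)      # smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd count: a single middle number
        return ordered[mid]
    # even count: average the two middle numbers
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 3]))      # sorted: 1, 3, 7    -> 3
print(median([7, 1, 3, 5]))   # sorted: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```

Python's standard library provides `statistics.median`, which follows the same rule.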

6.      Properties of the mean and of the median
If you could place a histogram on a weightless bar and the bar on a fulcrum, the histogram would balance perfectly when the fulcrum is directly under the mean. By contrast, the median is the value that splits the area of the histogram in half: 50% lies above it and 50% lies below it.
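One way to see the balancing property numerically: the distances of the values below the mean exactly cancel the distances of the values above it, so the deviations from the mean add up to zero. The sketch below checks this on a small made-up data set; the helper name `sum_of_deviations` is hypothetical.

```python
def sum_of_deviations(values):
    """Add up (value - mean) over the list; the result is 0 at the balance point."""
    center = sum(values) / len(values)
    return sum(x - center for x in values)

data = [1, 2, 3, 4, 10]          # made-up data; mean = 20 / 5 = 4
print(sum_of_deviations(data))   # (-3) + (-2) + (-1) + 0 + 6 = 0.0
```

With other data, floating-point rounding can leave a tiny nonzero remainder instead of an exact zero. Note that the median of this list is 3, not 4: two values lie below it and two above, regardless of how far away they are.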

If the histogram is symmetric, the mean and the median are the same. If the histogram is not symmetric, the mean and median can be quite different. Take a data set whose histogram is symmetric. Balance it on the fulcrum. Now take the largest observation and start moving it to the right. The fulcrum, and with it the mean, must also move to the right if the histogram is to stay balanced.

You can distort the mean with an outlier, but all this time the median stays the same!  (see p.63)
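The thought experiment of sliding the largest observation to the right can be tried out in Python. This is a sketch on made-up numbers, using `mean` and `median` from Python's standard `statistics` module; the helper name `summarize` is hypothetical.

```python
from statistics import mean, median

def summarize(values):
    """Return the (mean, median) pair for a list of numbers."""
    return mean(values), median(values)

base = [1, 2, 3, 4, 5]                  # made-up symmetric data set
for largest in (5, 50, 500):            # slide the largest value to the right
    shifted = base[:-1] + [largest]     # replace only the largest observation
    m, med = summarize(shifted)
    print(f"largest = {largest}: mean = {m}, median = {med}")
```

As the largest value goes from 5 to 50 to 500, the mean climbs from 3 to 12 to 102, while the median remains 3 throughout.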

 

For an interactive example see: http://www.ruf.rice.edu/~lane/stat_sim/descriptive/index.html