The two numbers usually used to summarize a distribution are the "center" [the "typical" value] and the "spread" [how close to or far from each other the data are].
a. The most commonly used measure of "center" is the MEAN, often called the AVERAGE.
b. The mean is denoted as x-bar (an x with a bar over it).
c. The mean is computed as follows: given a list of n numbers x1, x2, ... , xn, the mean is
x-bar = (x1 + x2 + ... + xn)/n = (sum xi)/n
d. Example: A first-year law student enrolls in 5 courses. These are her grades at the end of the year:
93, 90, 81, 80, 77
The sum of these is 421 (sum xi from above)
n = 5 (she took 5 courses)
the mean (x-bar) is 84.2
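The computation above can be checked with a few lines of Python. This is only a minimal sketch: the function name `mean` and the variable names are chosen here for illustration and are not part of the original notes.

```python
def mean(values):
    """Return the arithmetic mean: (x1 + x2 + ... + xn) / n."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / n


grades = [93, 90, 81, 80, 77]   # the five law school grades
total = sum(grades)             # "sum xi" in the formula
n = len(grades)                 # number of courses
x_bar = mean(grades)

print(total, n, x_bar)          # prints: 421 5 84.2
```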
a. The median is the "middle point" of a list: half of the data are larger than (or equal to) the median, and half of the data are smaller than (or equal to) the median.
b. The median is computed as follows:
given a list of n numbers x1, x2, ... , xn,
sort all the numbers
and pick the middle number from the list.
If the list has an even number of elements, take the average of the two middle numbers.
c. Example:
The sorted law school grades: 77, 80, 81, 90, 93
The median (M) of this list is 81
If she had taken SIX classes instead of FIVE:
77, 80, 81, 90, 93, 97
Take the average of the middle two numbers (81 and 90): (81 + 90)/2 = 85.5
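The sort-and-pick-the-middle recipe can be sketched in Python as follows. The function name `median` and the list names are illustrative choices, not something the notes prescribe.

```python
def median(values):
    """Sort the list and return the middle value.

    For an even number of elements, return the average of the
    two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average of the two


five_classes = [93, 90, 81, 80, 77]
six_classes = five_classes + [97]

print(median(five_classes))   # prints: 81
print(median(six_classes))    # prints: 85.5
```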
a. The mean is the "balancing point" of a histogram; the median simply divides the data in half.
b. For a symmetric distribution, the mean equals the median.
c. The mean is sensitive to outliers and long tails! The median is not:
e.g., the list "77, 80, 81, 90, 93" has mean 84.2 and median 81;
if the list were changed to "17, 80, 81, 90, 93", the mean would be 72.2, but the median would still be 81.
d. It is not necessary to know HOW MANY numbers are in a list, only the RELATIVE FREQUENCY of the values; e.g., if she had 10 classes with scores "77, 77, 80, 80, 81, 81, 90, 90, 93, 93", the mean is still 84.2. As long as the scores in the list maintain the same relative frequencies (in this example: 20% x1's, 20% x2's, 20% x3's, and so forth), the mean will be unchanged.
1. Given the list 1,2,3,4,5,6,7,8,9,10,11,12:
a. n = 12, sum = 78
b. median = 6.5
c. mean = 78/12 = 6.5
2. Given the list 1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12:
a. n = 24, sum = 156
b. median = 6.5
c. mean = 156/24 = 6.5
NOTE: the mean and median stayed the same. What matters here is the relative frequency of a value.
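The two lists above can be compared in Python using the standard `statistics` module. The helper `mean_from_relative_frequencies` is a hypothetical name introduced here only to illustrate that the mean can be computed from each value's share of the list, without using the raw counts directly.

```python
from collections import Counter
from statistics import mean, median


def mean_from_relative_frequencies(values):
    """Compute the mean as the sum of value * (share of the list)."""
    n = len(values)
    counts = Counter(values)
    return sum(value * (count / n) for value, count in counts.items())


once = list(range(1, 13))    # 1, 2, ..., 12
twice = sorted(once * 2)     # every value appears twice

print(mean(once), median(once))     # prints: 6.5 6.5
print(mean(twice), median(twice))   # prints: 6.5 6.5

grades = [77, 80, 81, 90, 93]
doubled = sorted(grades * 2)        # the 10-class example
# Both results agree with 84.2 up to floating-point rounding.
print(round(mean_from_relative_frequencies(grades), 1))
print(round(mean_from_relative_frequencies(doubled), 1))
```

Doubling every count leaves each value's share of the list unchanged, which is why the mean and the median do not move.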
1. Suppose we are given a list of numbers x1, x2, ... , xn.
2. Suppose we construct a new list by adding a constant "a" to each number in the old list.
a. Picture: shifted histogram.
b. The median and the mean both go up by a.
3. Suppose we construct a new list by multiplying each number xi by some constant "b".
a. Picture: stretched histogram.
b. The median and the mean are both multiplied by b.
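A quick numerical check of the shift and stretch rules, reusing the law school grades. The constants a = 5 and b = 2 are arbitrary values picked for this sketch; they do not come from the notes.

```python
from statistics import mean, median

grades = [77, 80, 81, 90, 93]
a = 5   # hypothetical constant added to every grade
b = 2   # hypothetical constant multiplying every grade

shifted = [x + a for x in grades]   # histogram slides right by a
scaled = [x * b for x in grades]    # histogram stretches by a factor of b

print(mean(grades), median(grades))     # prints: 84.2 81
print(mean(shifted), median(shifted))   # prints: 89.2 86
print(mean(scaled), median(scaled))     # prints: 168.4 162
```

In this example both summaries move exactly as stated: adding a raises each by 5, and multiplying by b doubles each.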
1. Median: the median is the "middle number" of a list
e.g. 77, 80, 81, 90, 93 the median is 81
2. Mean: the mean is x-bar = (sigma xi)/n <--- recall what this means
e.g. 77, 80, 81, 90, 93 the mean is 84.2
3. Outliers: suppose these are the grades instead
e.g. 17, 80, 81, 90, 93
the median is still 81 but the mean has changed to 72.2
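The outlier comparison can be reproduced with Python's standard `statistics` module. The list names below are illustrative only.

```python
from statistics import mean, median

original = [77, 80, 81, 90, 93]
with_outlier = [17, 80, 81, 90, 93]   # the 77 replaced by an outlying 17

print(mean(original), median(original))           # prints: 84.2 81
print(mean(with_outlier), median(with_outlier))   # prints: 72.2 81
```

A single extreme grade pulls the mean down by 12 points, while the middle value of the sorted list is untouched.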
Last Update: 7 October 1998 by VXL