The two numbers usually used to summarize a distribution are the "center" [the "typical" value] and the "spread" [how close the data are to each other].
Suppose you have some money to invest and your financial planner recommends two funds that he thinks will suit your needs. Call one FUND A, with a 5-year average annual return of 10.36%, and the other FUND B, with a 5-year average annual return of 11.86%. Which one do you invest in?
The Standard Deviation may be thought of as the average size of the deviations of the individual members of a list from the average of the list. In general, most numbers in a list are within one standard deviation of the average. A few numbers will be as much as two standard deviations away, and very few will deviate beyond that.
a. STANDARD DEVIATION is abbreviated as SD or as a lowercase "s".
b. The SD is defined as follows: given a list of n numbers $x_1, x_2, \ldots, x_n$,

$$ s = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}} $$

where $\bar{x}$ is the average of the n numbers. An equivalent, easier formula for the SD is

$$ s = \sqrt{\frac{\sum x_i^2 - \left(\sum x_i\right)^2 / n}{n}} $$

Example: Remember the First-year Law Student's grades?
93, 90, 81, 80, 77
The sum of these is 421
n = 5 (she took 5 courses)
the mean (x-bar) is 84.2 (421 divided by 5)
the SD is:
calculate the first chunk
($93^2 + 90^2 + 81^2 + 80^2 + 77^2$) = (8649 + 8100 + 6561 + 6400 + 5929) = 35639
calculate the second chunk
$421^2$ divided by 5 = 177241 divided by 5 = 35448.2
subtract the second chunk from the first
35639 - 35448.2 = 190.80
divide the result by 5 (i.e. n):
190.80 divided by 5 = 38.16
and take the square root:
SQRT(38.16) = 6.1774
A Table may help you see what is going on:
Original Score | Deviation from Average | Squared Deviation from Average |
---|---|---|
93 | 8.8 | 77.44 |
90 | 5.8 | 33.64 |
81 | -3.2 | 10.24 |
80 | -4.2 | 17.64 |
77 | -7.2 | 51.84 |
Sum = 421 | Sum = 0 | Sum = 190.80 |
Average = 84.2 | | SD = SQRT(190.80/5) = 6.1774 |
c. Note that the standard deviation is in the same units as the data. In her case, it's points. This is why the standard deviation involves the square root: the result is easier to interpret than points-squared, for example.
d. As with the mean, it is not necessary to know HOW MANY items are in a list when computing a standard deviation, only the relative frequency of the values in the list.
e. The SD measures how close the numbers in the list are to the average. Not all numbers are equal to the mean, and the SD is a measure of the "average" distance between each point and the average.
f. One more thing about the standard deviation: for many datasets, about 68% of the entries in a list will fall within one SD of the average, and about 95% will fall within two SDs. But we'll talk about this some more in Chapter 5.
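The law student's worked example can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and it computes the SD both from the definition and from the shortcut formula to show that the two agree. It also counts the share of grades within one SD of the mean; with only five numbers there is no reason to expect that share to be close to 68%.

```python
import math

grades = [93, 90, 81, 80, 77]
n = len(grades)
mean = sum(grades) / n

# Definition: square root of the average squared deviation from the mean.
sd_definition = math.sqrt(sum((x - mean) ** 2 for x in grades) / n)

# Shortcut: (sum of squares - (sum)^2 / n) / n, then take the square root.
sum_x = sum(grades)
sum_x2 = sum(x * x for x in grades)
sd_shortcut = math.sqrt((sum_x2 - sum_x ** 2 / n) / n)

# Share of the grades lying within one SD of the mean.
within_one_sd = sum(1 for x in grades if abs(x - mean) <= sd_definition) / n

print(f"mean = {mean:.1f}")
print(f"SD (definition) = {sd_definition:.4f}")
print(f"SD (shortcut)   = {sd_shortcut:.4f}")
print(f"share within one SD = {within_one_sd:.0%}")
```

Both formulas divide by n, as in these notes, not by n - 1.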
Suppose I tell you that the last 5 years of returns for FUND A are 9.3, 8.2, 11.5, 12.7, 10.1 and the last 5 years of returns for FUND B are -2.4, 15.3, 66.2, 29.9, -49.7. The standard deviation for A is 1.59% and the standard deviation for B is about 38.17%. FUND B's slightly higher average comes with far larger swings from year to year.
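To see where the two fund figures come from, here is a minimal Python sketch. The helper name `population_sd` is made up for this illustration; it uses the divide-by-n definition from these notes.

```python
import math

def population_sd(values):
    """SD with divisor n, matching the definition used in these notes."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

fund_a = [9.3, 8.2, 11.5, 12.7, 10.1]
fund_b = [-2.4, 15.3, 66.2, 29.9, -49.7]

for name, returns in (("FUND A", fund_a), ("FUND B", fund_b)):
    mean = sum(returns) / len(returns)
    print(f"{name}: mean = {mean:.2f}%, SD = {population_sd(returns):.2f}%")
```

The means are close to each other, but the SDs differ by a factor of more than twenty, which is the information the averages alone hide.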
A. Given the list 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12:
a. n = 12
b. median = 6.5; Q1 = 3.5, Q3 = 9.5
c. mean = 78/12 = 6.5
d. range = 12 - 1 = 11
e. $\sum (x - \bar{x})^2 = 143$; $\sum x = 78$, $\sum x^2 = 650$; SD = $\sqrt{143/12} = \sqrt{(650 - 78^2/12)/12} = 3.452$
B. Given the list 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12:
a. n = 24
b. median = 6.5; Q1 = 3.5, Q3 = 9.5
c. mean = 156/24 = 6.5
d. range = 12 - 1 = 11
e. $\sum (x - \bar{x})^2 = 286$; $\sum x = 156$, $\sum x^2 = 1300$; SD = $\sqrt{286/24} = \sqrt{(1300 - 156^2/24)/24} = 3.452$
NOTE: the median, quartiles, mean, range, and SD all stayed the same, even though n doubled. What matters here is the relative frequency of each value.
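The comparison of the two lists can be reproduced with a short Python sketch. The function name `summarize` is invented for this example, and quartiles are left out because software packages compute them in slightly different ways.

```python
import math
import statistics

def summarize(values):
    """Mean, median, range, and divide-by-n SD of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "sd": round(sd, 3),
    }

list_a = list(range(1, 13))   # 1, 2, ..., 12
list_b = sorted(list_a * 2)   # every value appears twice

print(len(list_a), summarize(list_a))
print(len(list_b), summarize(list_b))
```

Doubling every value's count changes n and the sums, but not the relative frequencies, so the summaries match.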
A. Suppose we are given a list of numbers $x_1, x_2, \ldots, x_n$.
B. Suppose we construct a new list by adding a constant "a" to each number in the old list.
a. Picture: shifted histogram.
b. The median and the mean both go up by a.
c. The range and SD are unchanged!
C. Suppose we construct a new list by multiplying each number $x_i$ by some constant "b".
a. Picture: stretched histogram.
b. The median and the mean are both multiplied by b.
c. The range and SD are also multiplied by b (by |b| if b is negative, since neither the range nor the SD can be negative).
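The shifting and stretching rules can be tried out numerically. The sketch below reuses the law student's grades; the constants a = 10 and b = 2 are hypothetical values chosen only for illustration, and the helper names are made up for this example.

```python
import math

def population_sd(values):
    """SD with divisor n, as defined in these notes."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def value_range(values):
    """Largest value less the smallest value."""
    return max(values) - min(values)

grades = [93, 90, 81, 80, 77]
a, b = 10, 2  # hypothetical constants, chosen only for illustration

shifted = [x + a for x in grades]  # add a to every entry
scaled = [b * x for x in grades]   # multiply every entry by b

for label, values in (("original", grades), ("shifted", shifted), ("scaled", scaled)):
    mean = sum(values) / len(values)
    print(f"{label:>8}: mean = {mean:.2f}, range = {value_range(values)}, "
          f"SD = {population_sd(values):.4f}")
```

Adding a moves the mean and leaves the range and SD alone; multiplying by b scales all three. With a negative multiplier the mean changes sign, while the range and SD are scaled by its absolute value.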
A. Range: the largest value less the smallest value
B. Standard Deviation (SD):

$$ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}} = \sqrt{\frac{\sum x_i^2 - \left(\sum x_i\right)^2 / n}{n}} $$
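For checking homework by computer, Python's standard `statistics` module offers both measures of spread. One caution, offered as a hedged note: `statistics.pstdev` divides by n and so matches the SD defined here, while `statistics.stdev` divides by n - 1 and gives a slightly larger number. The variable names below are illustrative.

```python
import statistics

grades = [93, 90, 81, 80, 77]

value_range = max(grades) - min(grades)   # largest value less the smallest
sd_n = statistics.pstdev(grades)          # divisor n: the SD of these notes
sd_n_minus_1 = statistics.stdev(grades)   # divisor n - 1: a different convention

print(f"range = {value_range}")
print(f"SD (divide by n)     = {sd_n:.4f}")
print(f"SD (divide by n - 1) = {sd_n_minus_1:.4f}")
```

Many calculators and spreadsheets default to the n - 1 version, so a mismatch with a hand calculation is often just a difference in divisor.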
Exercise Set D: 7, 8
Exercise Set E: 4, 5
Review Exercise: 4, 6