Up to now we have only looked at what are called "univariate" statistics. This means we have been studying a single variable in a given population or a given sample. For example, we might talk about mean height, mean weight, or the standard deviation of LSAT scores.
But now, we turn to relationships between two variables.
A scatterplot is a two-dimensional plot of data. The horizontal dimension is called x, and the vertical dimension is called y.
Each point on a scatterplot shows two values, an x value and a y value. Each point represents a single case. A single case could be a single person or object, but it could also be a matched pair (e.g., father-son, twins, husband-wife).
Handout
There is a POSITIVE relationship if above-average values of x are associated with above-average values of y. Conversely, there is a NEGATIVE relationship if above-average values of x are associated with below-average values of y.
In the Social Sciences, X and Y are usually called the INDEPENDENT and DEPENDENT variables respectively. They are given these names because the independent variable is thought to influence the dependent variable.
There is nothing to stop us from reversing the roles of the two variables, however, and treating Y as the independent variable and X as the dependent variable.
The correlation coefficient, r, can take values from -1 to +1. Values near zero mean that the data are not close to a straight line. Values near -1 or +1 mean that the data are very close to a straight line (sloping downward for -1, upward for +1).
Your text gives you a very long formula for calculating the correlation coefficient (pp. 132-134), and I am not certain how useful it is. Instead, read the technical note on p. 134; its formula is reproduced here:
     (average of the products xy) - ((average x) * (average y))
r =  ----------------------------------------------------------
        (Standard Deviation x) * (Standard Deviation y)
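The formula above translates almost word for word into code. The following is a minimal sketch, not the textbook's own procedure: the function names (mean, sd, correlation) are made up for illustration, and the standard deviation is assumed to divide by n rather than n - 1, which is what the Stdev figures in the worked example in these notes imply. The two test lists lie exactly on a rising line and a falling line, so r should come out at essentially +1 and -1.

```python
from math import sqrt


def mean(values):
    """Plain average of a list of numbers."""
    return sum(values) / len(values)


def sd(values):
    """Standard deviation, dividing by n (an assumption, see the note above)."""
    m = mean(values)
    return sqrt(mean([(v - m) ** 2 for v in values]))


def correlation(xs, ys):
    """r = (average of products xy - average x * average y) / (SD x * SD y)."""
    products = [x * y for x, y in zip(xs, ys)]
    return (mean(products) - mean(xs) * mean(ys)) / (sd(xs) * sd(ys))


xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]    # points exactly on a rising line
falling = [10 - 3 * x for x in xs]  # points exactly on a falling line

r_rising = correlation(xs, rising)
r_falling = correlation(xs, falling)
print(round(r_rising, 4), round(r_falling, 4))
```

Because of floating-point rounding, the raw values may differ from +1 and -1 in the last few decimal places, which is why the sketch rounds before printing.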
Answer: r = -0.47.
            x         y      product of x & y
          ----      ----     ----------------
            2         7             14
            3         3              9
            5         1              5
            8         4             32
           13         2             26

Average:    6.2       3.4           17.2
Stdev  :    3.9699    2.0591

     (17.2) - (6.2 x 3.4)
r =  --------------------  =  -0.47
      (3.9699 x 2.0591)
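The arithmetic in the worked example above can be checked step by step. This is only a sketch using Python's standard statistics module; the variable names are illustrative, and pstdev (the divide-by-n standard deviation) is chosen because it is the version that reproduces the Stdev row of the table.

```python
from statistics import mean, pstdev

xs = [2, 3, 5, 8, 13]
ys = [7, 3, 1, 4, 2]

# Third column of the table: the product of x and y for each case.
products = [x * y for x, y in zip(xs, ys)]

# "Average" row of the table.
avg_x, avg_y, avg_xy = mean(xs), mean(ys), mean(products)

# "Stdev" row of the table (divide-by-n standard deviation).
sd_x, sd_y = pstdev(xs), pstdev(ys)

# The formula: (average of products - product of averages) / (SD x * SD y).
r = (avg_xy - avg_x * avg_y) / (sd_x * sd_y)

print(products)
print(avg_x, avg_y, avg_xy)
print(round(sd_x, 4), round(sd_y, 4))
print(round(r, 2))
```

Rounded as in the table, the averages, standard deviations, and r agree with the figures given by hand.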
  Previous list        New list
    x     y            x     y
   ---   ---          ---   ---
    1     2            4    -12
    1     3            6    -12
    2     6           12    -11
    3     5           10    -10
    5     9           18     -8
    7     8           16     -6
   11     8           16     -2
   13     4            8      0
   13     7           14      0
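Since the new list is just a transformation of the old list (i.e., the "new" x = 2 times the old y, and the "new" y = the old x minus 13), the correlation is the same as in the previous list: r = 0.415. Interchanging x and y, multiplying a variable by a positive constant, and adding a constant to a variable all leave r unchanged.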
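Note: If you only modify one of the lists (either x or y) by adding a constant or multiplying by a positive constant, it will not change the correlation. Multiplying one list by a negative constant reverses the sign of r but leaves its size unchanged.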
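These claims about transformed lists are easy to check numerically. The sketch below uses the two lists from the table above; the helper name correlation and the variable names are illustrative only, and the divide-by-n standard deviation (pstdev) is assumed. It also tries one case worth keeping in mind: multiplying a list by a negative constant, which reverses the sign of r.

```python
from statistics import mean, pstdev


def correlation(xs, ys):
    """r = (average of products xy - average x * average y) / (SD x * SD y)."""
    products = [x * y for x, y in zip(xs, ys)]
    return (mean(products) - mean(xs) * mean(ys)) / (pstdev(xs) * pstdev(ys))


# Previous list.
xs = [1, 1, 2, 3, 5, 7, 11, 13, 13]
ys = [2, 3, 6, 5, 9, 8, 8, 4, 7]

# New list: new x = 2 * old y, new y = old x - 13.
new_xs = [2 * y for y in ys]
new_ys = [x - 13 for x in xs]

r_old = correlation(xs, ys)
r_new = correlation(new_xs, new_ys)

# Multiplying one list by a negative constant flips the sign of r.
r_flipped = correlation([-x for x in xs], ys)

print(round(r_old, 3), round(r_new, 3), round(r_flipped, 3))
```

The first two values should agree with each other (and with r = 0.415), while the third should have the same size with the opposite sign.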