In the previous lecture you learned about confidence intervals; here you will learn about tests of significance. Recall that in STATISTICAL INFERENCE the parameters are usually not known, and we draw conclusions from sampling outcomes to make guesses about the underlying parameters. Remember the basic idea: we make assumptions about the parameters, and then test whether those assumptions could have led to the outcome we observed. We then use probability -- say, through a probability calculation -- to express the strength of our conclusions.
You play a game of dice with Professor Lew. If you roll a 1, 2, or 3, she pays you $1. If you roll a 4, 5, or 6, you pay her $1. You roll the die 10 times and get: 6 5 6 4 4 1 2 1 6 6
A test of significance simply asks: Does this die seem fair? That is, does the result of $7 for Professor Lew and $3 for you show evidence of cheating... or could Professor Lew have gotten $7 just by chance?
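Before doing any formal test, you can get a feel for the question by simulation. The sketch below (my own illustration, not part of the lecture) rolls a fair die 10 times over and over, and estimates how often Professor Lew would collect $7 or more just by luck:

```python
import random

random.seed(42)  # fixed seed so the experiment is reproducible

def lew_dollars(n_rolls=10):
    """Dollars Professor Lew collects: $1 for each roll of 4, 5, or 6."""
    return sum(1 for _ in range(n_rolls) if random.randint(1, 6) >= 4)

trials = 100_000
hits = sum(1 for _ in range(trials) if lew_dollars() >= 7)
print(f"P(Lew collects $7 or more on a fair die) ~ {hits / trials:.3f}")
```

The estimate comes out around 0.17, i.e. a fair die hands her $7 or more about one game in six -- so this outcome alone is not very surprising.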
The mean of these 10 rolls is 4.1; the expected mean for a fair die is 3.5; and the s.d. of the rolls is about 1.9.
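You can check these summary numbers directly; a quick sketch in Python:

```python
from statistics import mean, pstdev

rolls = [6, 5, 6, 4, 4, 1, 2, 1, 6, 6]

print(mean(rolls))    # sample mean: 4.1
print(pstdev(rolls))  # spread of the ten rolls: about 1.97, near the rough 1.9 used here

# Expected mean of a fair die: (1+2+3+4+5+6)/6 = 3.5
print(sum(range(1, 7)) / 6)
```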
To answer the question, you rely on how the sample mean x-bar would be expected to behave if the sampling were repeated and if the mean really were 3.5.
If you do the calculation, (4.1 - 3.5) / (1.9/SQRT(10)) = 1. You can find the probability associated with a Z score of 1 from Table A,
where z = (observed - expected)/spread = (xbar - mu) / (sigma/sqrt(n)).
"If the null hypothesis were true, then there would be a p% chance to get these kind of results."
1. Clearly identify the parameter and the outcome.

2. State the null hypothesis. This is what is being tested. A test of significance assesses the strength of the evidence (the outcomes) against the null hypothesis. Usually the null hypothesis is a statement of "no effect" or "no difference".
3. The alternative hypothesis is the claim about the population that we are trying to find evidence in favor of. In the dice example, you are seeking evidence of cheating. The null would say that the mean is 3.5; the alternative would say it is larger than 3.5. Note this is a ONE-SIDED alternative because you are only interested in deviations in one direction. (A two-sided situation occurs when you do not know the direction; you just think the evidence suggests something different from the null.)
4. The test statistic. This is the statistic that estimates the parameter of interest. In the example above, the parameter is mu, the outcome is x-bar, and the test statistic is Z.
The significance test assesses evidence by examining how far the test statistic falls from the proposed null value. In other words, a fair six-sided die should have a mean of 3.5 over a given number of tosses. I get 4.1. Is 4.1 far enough from 3.5 to suggest that something is not right here?
To answer that question, you find the probability of getting an outcome as extreme as, or MORE extreme than, the one you actually observed. So to test this outcome, you would ask "what is the chance of getting 4.1 or greater?"
5. The probability that you observe is called a P-VALUE. The smaller the p-value, the stronger the evidence against the null hypothesis. If instead you had rolled the die with Prof. Lew 100 times, paid out $70, and gotten the same sample mean of 4.1, the Z score would be (4.1 - 3.5) / (1.9/SQRT(100)) = 3.16, with a p-value of .0008.
6. On significance levels. Sometimes, prior to calculating a score and finding its P-value, we state in advance what we believe to be a decisive value of P. This is the significance level. 5% and 1% significance levels are most commonly used. If your P-value is as small as or smaller than the significance level you have chosen, then you would say that "the data are statistically significant at level ---".
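Putting steps 4 through 6 together: instead of reading Table A, you can compute a one-sided p-value for a Z score directly from the normal tail. A sketch using only the standard library (the function name is my own, not part of the lecture):

```python
from math import erfc, sqrt

def one_sided_p(z):
    """P(Z >= z) for a standard normal Z: the upper-tail p-value."""
    return 0.5 * erfc(z / sqrt(2))

# 10 rolls, Z = 1: weak evidence against the null
print(round(one_sided_p(1.0), 3))   # about 0.159

# 100 rolls, same sample mean of 4.1: Z = (4.1 - 3.5) / (1.9 / sqrt(100))
z100 = (4.1 - 3.5) / (1.9 / sqrt(100))
print(round(z100, 2))               # about 3.16
print(round(one_sided_p(z100), 4))  # about 0.0008

# Compare against the usual significance levels
for alpha in (0.05, 0.01):
    verdict = "significant" if one_sided_p(z100) <= alpha else "not significant"
    print(f"at level {alpha}: {verdict}")
```

With 10 rolls the p-value of about 0.16 is unremarkable; with 100 rolls the same sample mean gives a p-value of about 0.0008, which is significant at both the 5% and the 1% levels.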
NOTE:
Significant is not the same as important. All it means is that the outcome you observed would have been unlikely to happen by chance alone.
Last Update: 11 November 1996 by VXL