Statistics 10 Lecture 16

INTRODUCTION TO HYPOTHESIS TESTING (Chapter 26.1-26.4)

A. Overview

In the previous lecture you learned about confidence intervals; here you will learn about tests of significance. Recall that in STATISTICAL INFERENCE the parameters are usually not known, and we use outcomes (i.e. sampling outcomes) to draw conclusions about the underlying parameters.
Remember the basic idea: we make assumptions about the parameters, and then test to see if those assumptions could have led to the outcome we observed. We then use a probability calculation to express the strength of our conclusions.

B. Example

Let's walk through the example at the beginning of Chapter 26 together.
A senator introduces a bill to simplify the tax code. He claims the bill is revenue-neutral: it won't change the amount of taxes the government collects, it just simplifies the law.
A claim like this can be checked. The IRS could SAMPLE from the POPULATION of all tax returns, figure out the effect the bill would have on the taxes owed on these returns, and then check to see if the bill is really revenue-neutral.
In the example, the IRS samples 100 forms. The sample average comes out to -$219, which means that under the bill the government would have collected $219 less per sampled return, on average. The sample standard deviation is $725.
The senator's argument (issued through an aide) is that the SD is so large, $725, that an average of -$219 is inconsequential.
The IRS's argument is what you want to learn. To understand the -219 and the 725, you need to convert the sample SD to an SE for the sample average.
Remember what the SE is: it is the variation associated with a sample statistic. The SD is the variation in a given sample (e.g. the $725 here), in a population, or in a list (see Chapter 4).
The IRS goes on to say that the senator may argue that the population parameter is $0, but the evidence says it is not $0, and that it is in fact negative.
How do they figure that?
First, they calculate an SE for the sample average:
SE for average = (Square Root (100) x $725) / 100 = about $72
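The SE arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name se_of_average is made up for illustration, and it uses the sample SD ($725) as a stand-in for the unknown population SD, as the example does.

```python
import math

def se_of_average(sd, n):
    """SE for the sample average: (sqrt(n) * SD) / n, which equals SD / sqrt(n)."""
    se_of_sum = math.sqrt(n) * sd   # SE for the sum of n draws
    return se_of_sum / n            # dividing by n gives the SE for the average

se = se_of_average(725, 100)
print(se)  # 72.5, which the lecture rounds to about $72
```

Note that the SE shrinks as the sample size grows, even though the SD of the sample stays near $725.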
Second, they set up a "test" and use Z as the "test statistic":
Z = (-$219 - $0) / $72 = about -3
This test statistic says, in a way, that if the true parameter were zero dollars and the SE for the sample average is about $72, then the chance of picking a sample of size 100 with an average of -$219 or lower is about 0.1 of 1%, which is the area to the left of -3 under the normal curve.
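The Z calculation and the area under the normal curve can be reproduced without a table. The sketch below is one way to do it, assuming only Python's standard library; the helper name normal_cdf is ours, and the SE is taken as the rounded $72 used in the lecture.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

observed = -219.0   # sample average, in dollars
expected = 0.0      # population average if the null hypothesis is true
se = 72.0           # SE for the sample average, rounded as in the lecture

z = (observed - expected) / se
p_value = normal_cdf(z)   # chance of a sample average of -$219 or lower

print("z is about", round(z, 2))
print("area to the left of z is about", round(100 * p_value, 2), "percent")
```

The result should land close to the "about -3" and "about 0.1 of 1%" figures quoted above; small differences come from rounding Z to -3 before looking up the area.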
In previous chapters, you learned to work with the formula for a Z score and the normal curve. In Chapter 26, it all comes together.

C. Definitions

The NULL HYPOTHESIS is that the observed results are due to chance alone. That is, any difference between the parameter (the expected value) and the observed (or actual) outcome is due to chance alone. In this case, the null hypothesis is a statement about a parameter: the population average is $0.
The ALTERNATIVE HYPOTHESIS is that the observed results are due to more than just chance. It implies that the NULL is not correct and any observed difference is real, not luck.
Usually, the ALTERNATIVE is what we're setting out to prove. The NULL is like a "straw man."
The TEST STATISTIC measures how different the observed results are from what we would expect to get if the null hypothesis were true. When using the normal curve, the test statistic is z,
where z = (observed value - expected value)/SE
All a Z does is tell you how many SEs the observed value is away from the expected value, when the expected value is calculated by using the NULL HYPOTHESIS.
The OBSERVED SIGNIFICANCE LEVEL (or P-VALUE) is the chance of getting results as extreme as, or more extreme than, what we got, IF the null hypothesis were true. P-VALUE could also be called "probability value"; it is simply the tail area under the normal curve associated with the calculated Z.
P-values are always "if-then" statements:

"If the null hypothesis were true, then there would be a p% chance of getting this kind of result."
If the p-value is less than 5%, we say the results are STATISTICALLY SIGNIFICANT; if p < 1%, the results are HIGHLY STATISTICALLY SIGNIFICANT. A "significant" result means that it would be unlikely to get such extreme observed values by chance alone.
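The 5% and 1% cutoffs above amount to a simple rule. Here is a small sketch of that rule; the function name and the "not statistically significant" label are our own wording for illustration, not terms from the textbook.

```python
def describe_significance(p_value):
    """Label a p-value using the 5% and 1% conventions from this lecture."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must be between 0 and 1")
    if p_value < 0.01:
        return "highly statistically significant"
    if p_value < 0.05:
        return "statistically significant"
    return "not statistically significant"

print(describe_significance(0.001))  # highly statistically significant
print(describe_significance(0.03))   # statistically significant
print(describe_significance(0.08))   # not statistically significant
```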

D. Hypothesis Testing Summarized

1. Clearly identify the parameter and the outcome.
2. State the null hypothesis. This is what is being tested. A test of significance assesses the strength of evidence (outcomes) against the null hypothesis. Usually the null hypothesis is a statement of "no effect" or "no difference."
3. The alternative hypothesis is the claim about the population that we are trying to find evidence in favor of. In the tax law example, you are seeking evidence that the law is not neutral. The null hypothesis would say that the average return will not change (i.e. the average change is $0); the alternative would say it is negative. Note this is a ONE-SIDED alternative because you are only interested in deviations in one direction. (A two-sided situation occurs when you do not know the direction; you just think the evidence suggests something different from the null.)
4. The test statistic. It is built from the statistic that estimates the parameter of interest. In the above example, the parameter is the population average, the outcome is the sample average, and the test statistic is Z.
The significance test assesses the evidence by examining how far the test statistic falls from what the null hypothesis proposes.
To measure that, you find the probability of getting an outcome as extreme as, or MORE extreme than, the one you actually observed. So to test the outcome, you would ask "what is the chance of getting a sample average of -$219 or lower?"
5. The probability you calculate is called a P-VALUE. The smaller the p-value, the stronger the evidence against the null hypothesis. If instead you had gotten a sample average of -$100 with the same SE, the senator might well be right. (A Z of about -1.4 has about 8% of the area under the normal curve to its left, so if the null hypothesis were true there would be an 8% chance of getting a sample with an average of -$100 or lower.)
6. On significance levels. Sometimes, before calculating a test statistic and finding its P-value, we state in advance what we believe to be a decisive value of P. This is the significance level. The 5% and 1% significance levels are most commonly used. If your P-value is as small as or smaller than the significance level you have chosen, then you would say that "the data is statistically significant at level --- "
NOTE:
Significant is not the same as important. All it means is that the outcome you observed probably did not happen by chance.
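The steps summarized in this section can be strung together in one short function. This is a sketch under stated assumptions, not a standard library routine: the name z_test, the alternative labels "less", "greater" and "two-sided", and the default level of 5% are all choices made here for illustration. It compares the -$219 and -$100 sample averages from the tax example, using the rounded SE of $72.

```python
import math

def z_test(observed, expected, se, alternative="less", level=0.05):
    """Return (z, p_value, significant) for a z-test.

    alternative: "less", "greater", or "two-sided" (illustrative labels).
    level: significance level chosen in advance.
    """
    z = (observed - expected) / se
    left = 0.5 * math.erfc(-z / math.sqrt(2))   # area to the left of z
    if alternative == "less":
        p = left
    elif alternative == "greater":
        p = 1.0 - left
    elif alternative == "two-sided":
        p = 2.0 * min(left, 1.0 - left)         # both tails count as extreme
    else:
        raise ValueError("unknown alternative: " + alternative)
    # "as small or smaller than the significance level" means p <= level
    return z, p, p <= level

for avg in (-219, -100):
    z, p, significant = z_test(avg, 0, 72, alternative="less")
    print(avg, "z:", round(z, 1), "P-value (%):", round(100 * p, 1),
          "significant at 5%:", significant)
```

With the one-sided alternative, the -$219 average should come out significant at the 5% level and the -$100 average should not, matching the discussion in step 5. Switching to alternative="two-sided" doubles the tail area, which is why the direction of the alternative should be decided before looking at the data.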

E. Homework Chapter 26 (due 11/20/98)

Exercise Set C: 1, 2, 7

Last Update: 10 November 1998 by VXL