MA401 Notes, Section 8.3

8.3 Hypothesis Testing

Idea: have some suspicion as to the value of an (unknown) population parameter; use a sample to test whether that suspicion is supported

ex: suspect that fewer than 40% of the students are registered Democratic

Setup: use two complementary hypotheses:

H₀ = null hypothesis; what we suspect isn’t true
H₁ = alternate hypothesis; what we suspect is true

ex: let p = proportion of students who are registered Democratic

H₀: p >= .40
H₁: p < .40

Approach: play devil’s advocate:

assume null hypothesis is true
take a sample, and compute a test statistic from the sample whose value will (hopefully) refute the assumption that the null hypothesis is true and allow us to reject it

In hypothesis testing, choose a set of outcomes for the test statistic that will be used to reject the null hypothesis; this set is called the critical region or rejection region

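ex: take a sample of n = 20 students; test statistic X = number of students in the sample who are registered Democratic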

What outcomes would suggest that the assumption p >= .40 is false?

Well, if p >= .40, we expect 8 or more (40% of 20) to be registered Democratic; thus if our sample has fewer than 8 Democrats, the assumption that 40% or more of the students are registered Democratic would seem incorrect.

However: even if 40% of the students are Democrats, we certainly won't always get exactly 8 Democrats in every sample of 20 students; we'd expect the number to vary somewhat from sample to sample. For example, it wouldn't be all that unlikely that just due to random chance, we'd get a sample of 20 students in which only 7 are registered Democratic; thus this result wouldn't give strong evidence that there must be fewer than 40% Democrats in the student population.

Thus we really only get strong evidence that we should reject the null hypothesis if the number of Democrats in the sample is far less than the expected number of 8.
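To see how much the count can vary from sample to sample, here is a minimal simulation sketch in Python. It assumes the true proportion is exactly p = .40 and repeatedly draws samples of 20 students; the function name simulate_counts, the number of trials, and the seed are illustrative choices, not part of the notes.

```python
import random
from collections import Counter

def simulate_counts(n=20, p=0.40, trials=10_000, seed=1):
    """Draw `trials` samples of size n and tally the number of Democrats in each."""
    rng = random.Random(seed)  # fixed seed (arbitrary choice) so runs are repeatable
    counts = Counter()
    for _ in range(trials):
        # each student is a Democrat with probability p, independently
        x = sum(rng.random() < p for _ in range(n))
        counts[x] += 1
    return counts

counts = simulate_counts()
trials = sum(counts.values())
frac_7_or_fewer = sum(c for x, c in counts.items() if x <= 7) / trials
frac_4_or_fewer = sum(c for x, c in counts.items() if x <= 4) / trials
print(f"fraction of samples with X <= 7: {frac_7_or_fewer:.3f}")
print(f"fraction of samples with X <= 4: {frac_4_or_fewer:.3f}")
```

With p = .40, samples with 7 or fewer Democrats should turn up fairly often (roughly 4 in 10), while samples with 4 or fewer should be rare.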

We'll use as our rejection region X <= 4;
if p >= .40, it is unlikely we’d get a sample with 4 or fewer Democrats in it.

In fact, we can quantify just how unlikely this is using probabilities:

Suppose p = .40; then the distribution of X will be the binomial distribution with n = 20, p = .40
Then the probability that X <= 4 is .0510, from the table of cumulative probabilities for the binomial distribution.

If p > .40, there’s an even smaller chance that X <= 4.
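The table lookup can be checked directly. The following is a small sketch that sums the binomial probabilities by hand using only the Python standard library; the helper name binom_cdf is made up for this illustration, and p = .50 is just one sample value larger than .40.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed directly from the pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Boundary case of the null hypothesis: p = .40
print(f"P(X <= 4 | p = .40) = {binom_cdf(4, 20, 0.40):.4f}")

# A larger p (hypothetical value) makes X <= 4 even less likely
print(f"P(X <= 4 | p = .50) = {binom_cdf(4, 20, 0.50):.4f}")
```

The first line should agree with the table value of .0510, and the second should be smaller.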

Thus we conclude that if the null hypothesis is true, there would be at most about a 5% chance we'd get a sample with 4 or fewer Democrats in it due just to sampling variation. While this could happen, it's quite unlikely; so if we do get such a sample, it seems more likely that the null hypothesis is false, and that there are in fact fewer than 40% Democrats in the student population at large.

The value of .0510 is called the significance level α of the test

it’s the probability that the test statistic will fall into the rejection region when the null hypothesis is true (causing us to erroneously reject the null hypothesis)
usually choose a desired value for α, and then find the rejection region corresponding to this.

ex:

Well, assuming the null hypothesis is true (p = .40 or greater), we can see from the table for the binomial distribution with n = 20 and p = .40 that

P(X <= 3) = .0160 and P(X <= 2) = .0036.
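Going from a desired significance level to a rejection region can be sketched as a search over cutoffs: take the largest c for which P(X <= c) does not exceed the target. The code below is one possible way to do this, assuming a rejection region of the form X <= c as in these notes; the names binom_cdf and rejection_cutoff and the target levels tried are hypothetical choices for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed directly from the pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def rejection_cutoff(alpha, n=20, p0=0.40):
    """Largest c with P(X <= c) <= alpha when p = p0; None if no such c exists."""
    cutoff = None
    for c in range(n + 1):
        if binom_cdf(c, n, p0) <= alpha:
            cutoff = c   # X <= c still meets the target level
        else:
            break        # the cdf only grows with c, so we can stop here
    return cutoff

# Illustrative target levels (not taken from the notes)
for alpha in (0.10, 0.02, 0.01):
    print(f"target level {alpha}: reject H0 when X <= {rejection_cutoff(alpha)}")
```

Because X is discrete, the achieved significance level is usually somewhat below the target: a target of .02 gives the region X <= 3 with actual level .0160, and a target of .01 gives X <= 2 with actual level .0036.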

Errors

There are 4 possible outcomes of a hypothesis test:

Null hypothesis is true, and we erroneously reject it; called a Type I error
Null hypothesis is false, and we correctly reject it
Null hypothesis is true, and we correctly refuse to reject it
Null hypothesis is false, and we incorrectly refuse to reject it; called a Type II error

Table:

	H₀ true	H₀ false
reject H₀	Type I error	correct decision
do not reject H₀	correct decision	Type II error

Note:

for a hypothesis test, we specify the desired level of significance α, and use this to determine the rejection region; this specifies the probability of making a Type I error.
a Type I error is usually the worse error to make; thus we want to limit the chance that we'll make it
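Both error probabilities can be computed for the rejection region X <= 4 used earlier. This is a rough sketch only: the Type II probability depends on the actual value of p, which is unknown, so p = .20 below is a hypothetical value picked purely for illustration, and binom_cdf is a made-up helper name.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed directly from the pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, cutoff = 20, 4   # sample size and rejection region X <= 4

# Type I error: H0 is true but X lands in the rejection region.
# p = .40 is the boundary of H0, where this probability is largest.
type_1 = binom_cdf(cutoff, n, 0.40)

# Type II error: H0 is false but X lands outside the rejection region.
# p = .20 is a hypothetical true value, assumed only for this example.
type_2 = 1 - binom_cdf(cutoff, n, 0.20)

print(f"P(Type I error)  = {type_1:.4f}")
print(f"P(Type II error) = {type_2:.4f}")
```

Choosing the rejection region fixes the first probability at the significance level; the second changes with whichever true value of p is assumed.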

