Summary
- One of the major decisions that a statistician has to make when conducting a hypothesis test is choosing a level of significance
for the test.
- The lower the level of significance, the more evidence the statistician requires before agreeing to reject the null hypothesis.
- It is possible to commit two types of errors in a hypothesis test, known as Type I errors and
Type II errors.
- Committing an error does not mean that you have done anything 'wrong' in the test. It simply means that the conclusion from the test
does not match reality, which can happen when the sample drawn happens to be unrepresentative of the population.
- The first type of error occurs when you reject a null hypothesis that happens to be true. This is known as a
Type I error.
- The second type of error occurs when you do not reject a null hypothesis that happens to be false. This is known as a
Type II error.
- The probability of a Type I error occurring is equal to the level of significance, α, for the test. That is, this probability
can be fixed by the tester.
- The probability of a Type II error occurring is denoted β. It cannot be fixed by the tester and cannot even be exactly determined.
- The probability of a Type II error (which occurs when the null hypothesis is false and so the proposed value for the population
parameter is incorrect) will depend on what the population parameter actually is, and how far away it is
from the value proposed in the null hypothesis.
- For a fixed sample size, α and β are inversely related: as one decreases, the other increases. Therefore, if the tester decreases α,
they will increase β (at a fixed sample size). Conversely, if the tester wants β to decrease, they will have to increase the chance of
a Type I error.
- However, using a larger sample size allows the tester to decrease the probability of one type of error without increasing the
probability of the other, or to decrease the probability of both. But larger samples cost more money, time and resources.
- The power of a test is the probability of not committing a Type II error. That is,
it is the probability of rejecting the null hypothesis when it is false, which equals 1 - β.
- The power of a test can be thought of as a measure of the test's ability to detect that the null hypothesis is false when it really is false.
- The power of a test will depend on how different (if at all) the true population parameter is from the value proposed in the null
hypothesis. The greater the difference, the more powerful the test.
- However, it is not necessarily wise to increase the power of a test by deliberately proposing an 'extreme' value for the population
parameter.
- The power of a test can be increased by either of the following approaches (or a combination of both):
  - increasing the level of significance of the test
  - collecting a larger sample
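
As a rough illustration of the points above, the following Python sketch computes β and the power for one specific case: an upper-tailed z-test of H0: μ = μ0 against H1: μ > μ0, with a known population standard deviation. The choice of test, the function names type_ii_error and power, and all the numbers used are assumptions made for this example and are not part of the summary. It uses only NormalDist from the Python standard library (Python 3.8 or later).

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Return beta for an upper-tailed z-test of H0: mu = mu0 vs H1: mu > mu0.

    Assumes a normally distributed sample mean with known sigma
    (an illustrative setting, not the only possible test).
    """
    standard_error = sigma / sqrt(n)
    # Reject H0 when the sample mean exceeds this critical value.
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # beta = P(sample mean <= critical value | true mean is mu_true),
    # that is, the probability of failing to reject H0.
    return NormalDist(mu_true, standard_error).cdf(critical_mean)


def power(mu0, mu_true, sigma, n, alpha):
    """Return the power of the same test, 1 - beta."""
    return 1 - type_ii_error(mu0, mu_true, sigma, n, alpha)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trends.
    mu0, sigma = 100, 15

    print("Effect of alpha (n = 25, true mean = 105):")
    for alpha in (0.01, 0.05, 0.10):
        print(f"  alpha = {alpha:.2f}  power = {power(mu0, 105, sigma, 25, alpha):.3f}")

    print("Effect of sample size (alpha = 0.05, true mean = 105):")
    for n in (25, 50, 100):
        print(f"  n = {n:3d}  power = {power(mu0, 105, sigma, n, 0.05):.3f}")

    print("Effect of the true mean (alpha = 0.05, n = 25):")
    for mu_true in (102, 105, 110):
        print(f"  true mean = {mu_true}  power = {power(mu0, mu_true, sigma, 25, 0.05):.3f}")
```

Running the script should show the power rising as α increases, as the sample size increases, or as the true mean moves further from the value proposed in the null hypothesis. When the true mean equals the proposed value, the function power simply returns α, the probability of a Type I error.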