So how does the test differ if Seller Door can't assume that they know the population standard deviation, σ? As mentioned earlier, they will use the sample standard deviation, s, as an estimate for σ. And, as mentioned before, this will affect the sampling distribution followed by the sample mean, X̄.
This then affects the critical values, which in turn affects the region of rejection. The critical values and region of rejection now fall in the t-distribution, not the standard normal distribution Z. Also, the test statistic for the sample mean calculated by Seller Door is going to be a t-score.
It might sound like a lot of things have changed! However, in terms of the methodology of the test, they actually haven't. Most of what we just mentioned is just 'numbers'. Of course, it is important that you know what sampling distribution is at work when you conduct a test! But it won't have a large impact on exactly how you do the test. For completeness' sake, we'll go through all of the steps in the test without assuming σ is known. However, we will be paying most attention to those steps that differ from the earlier test. Many of the steps don't!
State the hypotheses. The company is still testing whether the average sales routine time has changed from the old value of 620 seconds:
H0: μ = 620
HA: μ ≠ 620
Assume the null hypothesis is true. This assumption is made whenever a test is conducted! So, as before, we assume that μ = 620.
Choose a level of significance, α. So that we can focus on the things that do differ between the two tests for the mean, we will let Seller Door make all of the same decisions (and use the same data) in both tests. So, we'll again suppose that they are using a level of significance of α = 0.05.
Determine the critical value(s). This is the first step that differs between the two tests. As the general step-by-step guide suggests (and as was the case in Seller Door's first test) there will be two critical values because this is a two-sided test. However, when σ is unknown, the critical values are t-scores from the t-distribution with n - 1 degrees of freedom.
Since Seller Door collects a sample of n = 36 time values, the two critical values are t_{α/2} and -t_{α/2} from the t-distribution with n - 1 = 35 degrees of freedom. The level of significance is α = 0.05, so the critical values are t_{0.025} and -t_{0.025}.
Statistical software can be used to show that these critical values are 2.03 and -2.03.
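As a rough illustration of what such software does, the sketch below finds the critical value using only the Python standard library. The helper names (t_pdf, t_cdf, t_critical) are made up for this example, and the numerical method (Simpson's rule plus bisection) is just one simple choice, not necessarily what any particular statistics package does. In practice a library routine such as scipy.stats.t.ppf would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x], plus 0.5 by symmetry."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """Positive critical value for a two-sided test, found by bisection."""
    target = 1 - alpha / 2  # area to the left of the upper critical value
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

crit = t_critical(alpha=0.05, df=35)
print(round(crit, 2), round(-crit, 2))  # prints: 2.03 -2.03
```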
Determine the region of rejection. While the numbers have changed, the region of rejection is specified by the critical values, just as when σ is known. When σ is unknown, the region of rejection is a region of the t-distribution, as shown to the right. Since there are two critical values, the region of rejection is the region outside of them: the set of values greater than 2.03 and values less than -2.03.
Collect a sample and calculate a sample statistic. In order to highlight the differences between the two tests, we will let Seller Door use the same data set from earlier when assuming that σ was known. That is, we'll again suppose they collect a sample of 36 time values and calculate a sample mean of x̄ = 628.8 seconds.
At this stage of the test, if the population standard deviation is not known, Seller Door must calculate one more thing: the sample standard deviation! This will be used as an estimate for σ. So let's suppose that Seller Door calculated a sample standard deviation of s = 29 seconds.
Calculate the test statistic. Since σ is unknown, the test statistic for the sample mean will be a t-score in the t-distribution with 35 degrees of freedom. In particular, assuming the null hypothesis is true, the appropriate transformation formula is:
$$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
This follows the t-distribution with n - 1 degrees of freedom. We use this formula to calculate the test statistic for the sample mean, substituting the sample standard deviation (s = 29) and the sample size (n = 36) into the equation:
$$t = \frac{628.8 - 620}{29/\sqrt{36}} = \frac{8.8}{4.833} \approx 1.82$$
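The same arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation above; the variable names are chosen for this example, and the numbers are the ones Seller Door is supposed to have obtained.

```python
import math

x_bar = 628.8   # sample mean (seconds)
mu_0 = 620      # population mean under the null hypothesis
s = 29          # sample standard deviation (seconds)
n = 36          # sample size

# Estimated standard error of the sample mean: s / sqrt(n)
standard_error = s / math.sqrt(n)

# t-score of the sample mean, assuming the null hypothesis is true
t_stat = (x_bar - mu_0) / standard_error

print(round(standard_error, 3), round(t_stat, 2))  # prints: 4.833 1.82
```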
Conclusion. Seller Door is now in a position to conclude the hypothesis test. As always, the conclusion rests on whether or not the test statistic is in the region of rejection.
The test statistic of 1.82 is between -2.03 and 2.03 and is therefore not in the region of rejection. Therefore, Seller Door does not reject the null hypothesis. There is not enough evidence to conclude that the average sales routine time has changed.
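The decision rule for a two-sided test can be written as a single comparison. The sketch below uses a hypothetical helper, in_rejection_region, together with the rounded values quoted in the text (1.82 and 2.03).

```python
def in_rejection_region(test_statistic, critical_value):
    """Two-sided test: reject when the statistic lies outside
    the interval from -critical_value to +critical_value."""
    return abs(test_statistic) > critical_value

t_stat = 1.82   # test statistic from the sample
t_crit = 2.03   # critical value for alpha = 0.05 with 35 degrees of freedom

reject = in_rejection_region(t_stat, t_crit)
print("reject H0" if reject else "do not reject H0")  # prints: do not reject H0
```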
Notice that this conclusion is different from the conclusion that was drawn when σ was known.
This won't always be the case - sometimes the two tests might agree. But this example demonstrates that the two tests are different. Seller Door used the same level of significance and the same sample size in both cases. (Actually, it used the same sample in both cases!) And yet the company came to different conclusions in the two tests. So, while the methods in the two tests are very similar, the numbers do matter!
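To see the two tests side by side, the sketch below runs both on the same sample. It assumes σ = 24 seconds for the first test (the assumed value quoted in this section), takes the normal critical value from the standard library's statistics.NormalDist, and takes the t critical value of 2.03 directly from the text rather than computing it. The variable names are chosen for this example.

```python
import math
from statistics import NormalDist

x_bar, mu_0, n, alpha = 628.8, 620, 36, 0.05

# Test 1: sigma assumed known (24 seconds), so the statistic is a z-score
sigma = 24
z_stat = (x_bar - mu_0) / (sigma / math.sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

# Test 2: sigma unknown, estimated by s = 29, so the statistic is a t-score
s = 29
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
t_crit = 2.03   # from the t-distribution with 35 degrees of freedom (see text)

reject_z = abs(z_stat) > z_crit
reject_t = abs(t_stat) > t_crit

print(f"sigma known:   z = {z_stat:.2f}, critical = {z_crit:.2f}, reject = {reject_z}")
print(f"sigma unknown: t = {t_stat:.2f}, critical = {t_crit:.2f}, reject = {reject_t}")
```

Two things change between the tests: the larger standard deviation estimate makes the test statistic smaller, and the t-distribution pushes the critical value further out. Both changes make rejection harder here.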
On a practical note, in reality Seller Door would probably be using the latter of these two tests. That is, they would not assume that they know the population standard deviation. The reason for this is pretty straightforward: they probably don't know it, so why assume that they do?
The fact that they calculated a sample standard deviation of 29 seconds, which is noticeably more than the assumed value of 24 seconds, suggests that their assumed value was potentially incorrect. Perhaps their assumed value of 24 seconds for the population standard deviation was based on the old routine? That is, the company had enough data to know that the population mean time taken in the old routine was 620 seconds. Perhaps they knew that σ was 24 seconds in that population, which is why they wanted to use this value for the new routine. But since they changed the routine, it is potentially dangerous to assume that the population standard deviation hasn't changed.
In any case, it is safer not to assume things you don't know in statistics!