Hypothesis
Hypothesis testing begins with an assumption, called a hypothesis, that we make about a population
parameter. Then we collect sample data, produce sample statistics, and use this information to decide
how likely it is that our hypothesized population parameter is correct.
To test the validity of our assumption, we determine the difference between the
hypothesized value and the actual value of the sample mean. Then we judge whether the difference is
significant.
The smaller the difference, the greater the likelihood that our hypothesized value for the mean is correct.
The larger the difference, the smaller the likelihood.
The difference between the hypothesized population parameter and the actual statistic is more often
neither so large that we automatically reject our hypothesis nor so small that we accept it quickly.
So, in hypothesis testing, as in most significant real-life decisions, clear-cut solutions are the exceptions.
We cannot accept or reject a hypothesis about a population parameter simply by intuition. Instead, we
need to learn how to decide objectively, on the basis of sample information, whether to accept or reject
a hunch, guess, or assumption.
Testing Hypothesis
A hypothesis is a statement about a population parameter whose validity is to be tested on the
basis of a random sample drawn from the population.
In hypothesis testing, we must assume a value of the population parameter before we begin
sampling.
The assumption we wish to test is called the null hypothesis and is symbolized H0 (read “H sub zero”).
Example: suppose we intend to test the hypothesis that the population mean is equal to 350.
That is, “the null hypothesis is that the population mean is equal to 350”:
H0: µ = 350
In the problems, this value is written µH0, “the hypothesized value of the population mean”.
We consider three possible alternative hypotheses:
H1: µ ≠ 350; the alternative hypothesis is that the population mean is not equal to 350
H1: µ > 350; the alternative hypothesis is that the population mean is greater than 350
H1: µ < 350; the alternative hypothesis is that the population mean is less than 350
One and Two tailed tests:
If the alternative hypothesis is one-sided (H1: µ > 350 or H1: µ < 350), the test procedure is said to be one-tailed; otherwise (H1: µ ≠ 350) it is two-tailed.
(BY: MUHAMMAD MEMON – IBA) PAGE (1)
Type I and Type II Errors
Rejecting a null hypothesis when it is true is called a Type I error, and its probability (which, we have
seen, is also the significance level of the test) is symbolized α (alpha).
Accepting a null hypothesis when it is false is called a Type II error, and its probability is symbolized
β (beta).
Preference for a Type I Error
Suppose that making a Type I error (rejecting a null hypothesis when it is true) involves the time and
trouble of reworking a batch of chemicals that should have been accepted. At the same time, making a
Type II error (accepting a null hypothesis when it is false) means taking a chance that an entire group
of users of this chemical compound will be poisoned. Obviously, the management of this company
will prefer a Type I error to a Type II error and, as a result, will set a very high level of significance in
its testing to get low βs.
Preference for a Type II Error
Suppose, on the other hand, that making a Type I error involves disassembling an entire engine at the
factory, but making a Type II error involves relatively inexpensive warranty repairs by the dealers.
Then the manufacturer is more likely to prefer a Type II error and will set lower significance levels in
its testing.
Level of Significance & Power of Test
The probability of making a Type I error is called the level of significance of the test and is denoted by α.
The probability of making a Type II error is denoted by β, and (1 – β) is called the power of the test.
Thus, Power of Test is the probability of rejecting a false null hypothesis.
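The relationship between α, β and power can be sketched numerically. The following is a minimal illustration, assuming a right-tailed z-test with a known population standard deviation; the function name `right_tailed_power` and all the numbers (µH0 = 350, a true mean of 360, σ = 40, n = 64) are hypothetical and are not taken from these notes.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_power(mu0, mu_true, sigma, n, alpha):
    """Return (beta, power) for a right-tailed z-test of H0: mu <= mu0."""
    se = sigma / sqrt(n)                        # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha)    # cuts off area alpha in the right tail
    x_crit = mu0 + z_crit * se                  # same cut-off on the sample-mean scale
    # Type II error: the sample mean falls below the cut-off although mu = mu_true
    beta = NormalDist(mu_true, se).cdf(x_crit)
    return beta, 1 - beta


# Hypothetical numbers, chosen only for illustration
beta, power = right_tailed_power(mu0=350, mu_true=360, sigma=40, n=64, alpha=0.05)
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

Raising α in this sketch moves the cut-off toward µH0, which lowers β and raises the power; this is the trade-off described in the two "Preference" paragraphs above.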
Table showing Correct & Incorrect Decisions in Hypothesis Testing
Decision      Null Hypothesis is TRUE                 Null Hypothesis is FALSE
Accept H0     Correct Decision                        Incorrect Decision (Type II Error)
              P(Correct Decision) = 1 – α             P(Type II error) = β
Reject H0     Incorrect Decision (Type I Error)       Correct Decision
              P(Type I error) = α                     P(Correct Decision) = 1 – β
              = Level of Significance of the test     = Power of the test
In hypothesis testing problems, both types of errors are to be minimized.
In practice, the probability of a Type I error, i.e. α, is fixed at a specified value and then the probability of a Type
II error is minimized. A value of 0.01, 0.02, 0.05 or 0.10 is usually fixed for α before taking a sample.
The higher the significance level we use for testing a hypothesis, the higher the probability of rejecting
a null hypothesis when it is true.
- In the following figure, we have illustrated a hypothesis test at three significance levels: 0.01,
0.10, and 0.50.
- We have indicated the location of the sample mean on each distribution.
- In parts a and b, we would accept the null hypothesis but in part c, we would reject this same
null hypothesis.
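The effect of the three significance levels can be checked numerically. This is a small sketch, assuming the test in the figure is two-tailed (so that α is split equally between the tails); the helper name `two_tailed_critical_z` is introduced here only for illustration.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha):
    """Critical z value that leaves area alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.01, 0.10, 0.50):
    z = two_tailed_critical_z(alpha)
    print(f"alpha = {alpha:.2f}: accept H0 if |z| <= {z:.3f}")
```

The acceptance region shrinks as α grows, so the same sample mean can fall inside the region at α = 0.01 and 0.10 yet outside it at α = 0.50.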
Significance Level: (continued . . .)
• The purpose of hypothesis testing is not to question the computed value of sample statistic but
to make judgment about the difference between that sample statistic and hypothesized
population parameter.
• We must choose the criterion to use for deciding whether to accept or reject the null hypothesis.
• In statistical terms, the value 0.50, 0.10, or 0.01 is called the significance level.
• The remaining area, i.e. 0.50, 0.90, or 0.99, is where no significant difference exists.
• If we assume the hypothesis is correct, then the significance level will indicate the percentage
of sample means that is outside certain limits.
• For example, at a significance level of 0.10, a significant difference exists in 0.10 of the area under the
curve, where we reject the null hypothesis; in the remaining 0.90 of the area we would accept the null hypothesis.
Test Statistic & Critical Region:
The numerical values of the test statistic for which the null hypothesis is rejected are called critical
values of the test and these values constitute a region called critical region or rejection region of the
test.
Rejection Rule:
If the absolute value of the test statistic computed using sample data exceeds the absolute critical value
of the test, the null hypothesis is rejected.
When the null hypothesis is rejected, the test is called significant and when the null hypothesis is
not rejected, the test is called insignificant.
Therefore, test of hypothesis is also called test of significance.
Points to Note:
(i) Rejection of H0 indicates that an extremely unlikely sample has been drawn which
implies that H0 is very likely to be false.
(ii) Failing to reject H0 does not prove that H0 is true.
(iii) In testing hypothesis, the assumption is always made that the sample used in the test
process is a random sample.
(iv) It is assumed that the sampling distribution of the test statistic is known.
(v) H0, HA and α are determined before the test is carried out.
Formal Testing Procedure: hypothesis testing involves following six steps:
Step 1: Set up H0 and HA. HA decides whether the test is one or two tailed.
Step 2: Specify the level of significance (α).
Step 3: Select an appropriate test-statistic (z or t-test) and compute the value of the test-statistic using
sample data assuming null hypothesis to be true.
Sample size n          Normal Population            Non-Normal Population
                       σ known     σ unknown        σ known     σ unknown
n > 30 (large sample)  z-test      z-test           z-test      None
n ≤ 30 (small sample)  z-test      t-test           None        None
Thus, t test is used only if:
(i) The population is normal,
(ii) σ is unknown (but s is known or can be computed), and
(iii) n≤30
Step 4: Determine the critical values and the critical region of the test (using z or t table)
Step 5: State a rule to reject the null hypothesis
If |𝒛𝒄𝒂𝒍 | > |𝒛𝒕𝒂𝒃 |, reject H0 and accept HA
OR
If |𝒕𝒄𝒂𝒍 | > |𝒕𝒕𝒂𝒃 |, reject H0 and accept HA
Step 6: Decide whether the null hypothesis is to be rejected and write the conclusion of the test.
If the numerical value of the test-statistic (i.e. 𝒛𝒄𝒂𝒍 or 𝒕𝒄𝒂𝒍) falls in the rejection
region, we reject the null hypothesis; otherwise we do not reject it.
The test is significant if H0 is rejected; otherwise the test is insignificant.
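The selection table in Step 3 can be written as a small lookup. This is only a sketch of that table as given in these notes; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n, population_normal, sigma_known):
    """Return 'z', 't', or None according to the Step 3 selection table."""
    large_sample = n > 30
    if population_normal:
        if large_sample or sigma_known:
            return "z"
        return "t"      # normal population, sigma unknown, n <= 30
    # Non-normal population: the table lists a z-test only for a large
    # sample with known sigma, and no test otherwise.
    if large_sample and sigma_known:
        return "z"
    return None


print(choose_test(n=10, population_normal=True, sigma_known=False))   # t
print(choose_test(n=50, population_normal=True, sigma_known=False))   # z
```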
Question: Given the following hypotheses:
H0: µ ≤ 10
H1: µ > 10
For a random sample of 10 observations, the sample mean was 12 and the
sample standard deviation was 3. Using the 0.05 significance level:
A. State the decision rule.
B. Compute the value of the test statistic.
C. What is your decision regarding the null hypothesis?
Solution:
A.
Decision Rule: A decision rule is a statement of the specific conditions under which the
null hypothesis is rejected.
H0: µ ≤ 10
H1: µ > 10 (Here H1 shows that the test is one-tailed, i.e. a right-tailed test)
The sample size is 10, i.e. not more than 30. The test statistic for a mean, when the population standard deviation is
not known, is t. From the t-distribution table, the value for α = 0.05 in one tail with n − 1 = 9 degrees of freedom is 1.833.
Hence the Decision Rule is: Reject H0 when tcal > 1.833
B. Calculation of t-statistic
$$ t_{cal} = \frac{\bar{x} - \mu_{H_0}}{\hat{\sigma}_{\bar{x}}} = \frac{\bar{x} - \mu_{H_0}}{s/\sqrt{n}} = \frac{12 - 10}{3/\sqrt{10}} = 2.108 $$
C. Decision regarding the null hypothesis
Here tcal = 2.108 > ttab = 1.833, therefore reject the null hypothesis and accept the alternative
hypothesis that the mean is greater than 10.
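The arithmetic of this solution can be reproduced in a few lines. This is a minimal sketch: the critical value 1.833 is simply copied from the t table as quoted in part A (the Python standard library has no t distribution), and the names `t_statistic` and `T_TAB` are introduced only for this illustration.

```python
from math import sqrt


def t_statistic(x_bar, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n)), the statistic used in part B."""
    return (x_bar - mu0) / (s / sqrt(n))


T_TAB = 1.833            # table value for 9 degrees of freedom, alpha = 0.05, one tail
t_cal = t_statistic(x_bar=12, mu0=10, s=3, n=10)
reject_h0 = t_cal > T_TAB    # right-tailed decision rule from part A

print(f"t_cal = {t_cal:.3f}, reject H0: {reject_h0}")
```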
Observed Significance Levels: p-values
The measure of disagreement between the sample data and the null hypothesis is called the observed significance level (or p-value) for the test.
The p-value, for a specific statistical test, is the probability (assuming the null hypothesis is true) of observing a value of the test statistic that
is at least as contradictory to the null hypothesis, and as supportive of the alternative hypothesis, as the actual one
computed from the sample data.
α is the significance level (the size of the rejection or critical region), chosen before the test statistic is computed, whereas the p-value
is computed from the sample data.
For the sewer-pipe example with a sample of n = 50 (worked out in full in the next Question), we calculated z
= 2.12 against a right-tailed alternative. Therefore, the observed significance level (p-value) for this test is
p-value = P(z > 2.12)
p-value = 0.5 − P(0 ≤ z ≤ 2.12) = 0.5 − 0.4830 = 0.0170
Here the p-value is less than the chosen value of α = 0.05 (i.e. the observed significance level is less than the significance
level), so we reject the null hypothesis and accept the alternative hypothesis.
Note: In contrast, if we choose α = 0.01, we would not reject the null hypothesis because the p-value
for the test is larger than 0.01.
Steps for calculating the p-value for a test of Hypothesis:
1. Determine the value of the test statistic z corresponding to the result of the sampling experiment.
2. (a) If the test is one-tailed: the p-value is equal to the tail area beyond z in the same direction
as the alternative hypothesis. Thus, if the alternative hypothesis is of the form >, the p-
value is the area to the right of, or above the observed z value. Conversely, if the
alternative is of the form <, the p-value is the area to the left of, or below the observed z
value.
(b) If the test is two-tailed: the p-value is equal to twice the tail area beyond the observed z
value in the direction of the sign of z. That is, if z is positive, the p-value is twice the area
to the right of, or above the observed z value. Conversely, if z is negative, the p-value is
twice the area to the left of, or below the observed z value.
Decide whether to Reject Null Hypothesis (H0)
1. Choose the maximum value of α that we are willing to tolerate.
2. If the observed significance level (p-value) of the test is less than the chosen value of α, reject
the null hypothesis. Otherwise, do not reject the null hypothesis.
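The p-value steps and the decision rule above can be sketched as two small functions. This is an illustration only: the names `p_value` and `reject_h0` and the labels "greater", "less" and "two-sided" are assumptions made here, and the standard normal areas come from Python's `statistics.NormalDist` rather than from a printed table.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """Observed significance level for a z statistic.

    alternative: "greater" (HA of the form >), "less" (HA of the form <),
    or "two-sided" (HA of the form not-equal).
    """
    phi = NormalDist().cdf
    if alternative == "greater":
        return 1 - phi(z)                 # area to the right of z
    if alternative == "less":
        return phi(z)                     # area to the left of z
    if alternative == "two-sided":
        return 2 * (1 - phi(abs(z)))      # twice the tail area beyond |z|
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


def reject_h0(p, alpha):
    """Reject H0 when the p-value is less than the chosen alpha."""
    return p < alpha


p = p_value(2.12, "greater")
print(f"p-value = {p:.4f}")
print("reject at 0.05:", reject_h0(p, 0.05))
print("reject at 0.01:", reject_h0(p, 0.01))
```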
Question: Suppose building specifications in a certain city require that the average breaking strength
of residential sewer pipe be more than 2400 pounds per foot of length (i.e. per linear foot).
Each manufacturer who wants to sell pipe in the city must demonstrate that its product
meets the specification. To test one manufacturer's claim, a random sample of 50 pipes was selected; the
observed mean and standard deviation were 2460 pounds per linear foot and 200 pounds
per linear foot respectively. Is this result consistent with the manufacturer's claim at the 0.05
level of significance?
Solution:
Step-I:
Ho : µ ≤ 2400 pounds per foot of length
HA : µ > 2400 pounds per foot of length
Here HA shows that the test is one-tailed.
Step-2:
α = P(𝑧 > 1.645) = 0.05
Step-3:
Sample size is more than 30, therefore calculate z statistic:
n = 50, x̄ = 2460, µH0 = 2400, s = 200, then
$$ z = \frac{\bar{x} - \mu_{H_0}}{\hat{\sigma}_{\bar{x}}} = \frac{\bar{x} - \mu_{H_0}}{s/\sqrt{n}} = \frac{2460 - 2400}{200/\sqrt{50}} = 2.12 $$
Step-4: Critical Region:
α = 0.05 and HA shows that the test is one-tailed, therefore the critical region is the right tail,
with area 0.05. From the standard normal probability distribution table we get the
critical value for 0.05:
z0.05 = 1.645
Rejection Region or Critical Region: z > 1.645
Step-5: Rejection Rule
Ho : µ ≤ 2400,
If |𝒛𝒄𝒂𝒍 | > |𝒛𝒕𝒂𝒃 |, reject H0 and accept HA
Step-6: Conclusion:
|𝒛𝒄𝒂𝒍 | = 𝟐. 𝟏𝟐 and |𝒛𝒕𝒂𝒃 | = 𝟏. 𝟔𝟒𝟓
Here |𝒛𝒄𝒂𝒍 | > |𝒛𝒕𝒂𝒃 |, therefore Ho can be rejected.
The null hypothesis is rejected. Hence the test is significant and we conclude that the
company’s pipe has a mean strength that exceeds 2400 pounds per linear foot,
and thus the sample result is consistent with the manufacturer’s claim.
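The six steps of this solution can be checked with a short script. This is a sketch under the same assumption the solution makes (a large sample, so s stands in for σ); the critical value is computed with `statistics.NormalDist` instead of being read from a table, and the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Data from the question
n, x_bar, mu0, s, alpha = 50, 2460, 2400, 200, 0.05

z_cal = (x_bar - mu0) / (s / sqrt(n))        # Step 3: test statistic
z_tab = NormalDist().inv_cdf(1 - alpha)      # Step 4: right-tail critical value
reject_h0 = z_cal > z_tab                    # Steps 5-6: right-tailed rejection rule

print(f"z_cal = {z_cal:.2f}, z_tab = {z_tab:.3f}, reject H0: {reject_h0}")
```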
Question: An insurance company reports that the average annual maintenance cost for a Pakistan-made
Suzuki car is currently Rs. 3675. A random sample of 100 customers has a mean annual
maintenance cost of Rs. 3806 and a standard deviation of Rs. 710. Is this result consistent with
the company’s report at the 0.05 level of significance?
Solution:
Step-I:
Ho : µ = Rs. 3675
HA : µ ≠ Rs. 3675 (Here HA shows that the test is two-tailed. )
Step-2:
Here α = 0.05 and the test is two-tailed, therefore we divide α equally between the lower
and upper tails of the distribution of z, so α/2 = 0.025
Step-3:
Sample size is more than 30, therefore calculate z statistic:
n = 100, x̄ = 3806, µH0 = 3675, s = 710, then
$$ z = \frac{\bar{x} - \mu_{H_0}}{\hat{\sigma}_{\bar{x}}} = \frac{\bar{x} - \mu_{H_0}}{s/\sqrt{n}} = \frac{3806 - 3675}{710/\sqrt{100}} = 1.845 $$
Step-4: Critical Region:
α = 0.05 and HA shows that the test is two-tailed, therefore the critical region on each side
has area 0.025. From the standard normal probability distribution table, we get the
critical value for 0.025:
zα/2 = z0.025 = 1.96
Rejection Region or Critical Region: 𝑧 < −1.96 𝑜𝑟 𝑧 > 1.96
Step-5: Rejection Rule
Ho : µ = Rs. 3675,
If |𝒛𝒄𝒂𝒍 | > |𝒛𝒕𝒂𝒃 |, reject H0 and accept HA
Step-6: Conclusion:
|𝒛𝒄𝒂𝒍 | = 𝟏. 𝟖𝟒𝟓 and |𝒛𝒕𝒂𝒃 | = 𝟏. 𝟗𝟔
Here |𝒛𝒄𝒂𝒍 | ≯ |𝒛𝒕𝒂𝒃 |, therefore Ho cannot be rejected.
H0 is not rejected, so HA is not accepted. Hence the test is insignificant and the sample
result is therefore consistent with the company’s report.
Other approach (p-value)
For the above example, with the sample of n = 100 customers, we
calculated z = 1.845. The observed significance level (p-value) for this test is found as follows.
The test is two-tailed and z is positive, so the p-value is twice the area to the right of, or above, the
observed z value, i.e.
p-value = 2 × P(z > 1.845)
p-value = 2 × [0.5 − P(0 ≤ z ≤ 1.845)] = 2 × [0.5 − 0.46745] = 2 × 0.03255 = 0.0651
Here the p-value is not less than the chosen value of α = 0.05 (i.e. the observed significance level is more than
the significance level), so we cannot reject the null hypothesis, and we do not accept HA.
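The critical-value approach and the p-value approach always lead to the same decision. The sketch below checks this for the insurance example; it is an illustration, not part of the original solution, and it uses `statistics.NormalDist` for the normal areas, so the p-value differs from the table-based 0.0651 only in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Data from the insurance question
n, x_bar, mu0, s, alpha = 100, 3806, 3675, 710, 0.05
std_normal = NormalDist()

z_cal = (x_bar - mu0) / (s / sqrt(n))
z_tab = std_normal.inv_cdf(1 - alpha / 2)           # two-tailed: alpha/2 in each tail
p = 2 * (1 - std_normal.cdf(abs(z_cal)))            # twice the tail area beyond |z_cal|

reject_by_critical_value = abs(z_cal) > z_tab
reject_by_p_value = p < alpha

print(f"z_cal = {z_cal:.3f}, z_tab = {z_tab:.2f}, p-value = {p:.4f}")
print("decisions agree:", reject_by_critical_value == reject_by_p_value)
```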
QUESTION:
The roofing contract for a new sports complex in San Francisco has been awarded to Parkhill Associates,
a large building contractor. Building specifications call for a movable roof covered by approximately
10,000 sheets of 0.04-inch-thick aluminum. The aluminum sheets cannot be appreciably thicker than
0.04 inch because the structure could not support the additional weight. Nor can the sheets be
appreciably thinner than 0.04 inch because the strength of the roof would be inadequate. Because of this
restriction on thickness, Parkhill carefully checks the aluminum sheets from its supplier. Of course,
Parkhill does not want to measure each sheet, so it randomly samples 100.
The sheets in the sample have a mean thickness of 0.0408 inch. From past experience with this supplier,
Parkhill believes that these sheets come from a thickness population with a standard deviation of 0.004
inch.
On the basis of these sample statistics, Parkhill must decide whether to accept the shipment of 10,000
sheets or it may reject the aluminum sheets sent by the supplier. (take α=0.05)
Solution:
Step-I:
Ho : µ = 0.04
HA : µ ≠ 0.04 (Here HA shows that the test is two-tailed. )
Step-2:
Here α = 0.05 and the test is two-tailed, therefore we divide α equally between the lower
and upper tails of the distribution of z, so α/2 = 0.025
Step-3:
Sample size is more than 30, therefore calculate z statistic:
n = 100, x̄ = 0.0408, µH0 = 0.04, σ = 0.004, then
$$ z = \frac{\bar{x} - \mu_{H_0}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_{H_0}}{\sigma/\sqrt{n}} = \frac{0.0408 - 0.04}{0.004/\sqrt{100}} = 2 $$
Figure: Probability that x̄ will differ from the hypothesized µ by 2 standard errors.
Step-4: Critical Region:
α = 0.05 and HA shows that the test is two-tailed, therefore the critical region on each side
has area 0.025. From the standard normal probability distribution table we get the
critical value for 0.025:
zα/2 = z0.025 = 1.96
Rejection Region or Critical Region: 𝑧 < −1.96 𝑜𝑟 𝑧 > 1.96
Step-5: Rejection Rule
Ho : µ = 0.04
If |𝒛𝒄𝒂𝒍 | > |𝒛𝒕𝒂𝒃 |, reject H0 and accept HA
Step-6: Conclusion:
|𝒛𝒄𝒂𝒍 | = 𝟐 and |𝒛𝒕𝒂𝒃 | = 𝟏. 𝟗𝟔
Here |𝒛𝒄𝒂𝒍 | > |𝒛𝒕𝒂𝒃 |, therefore null hypothesis will be rejected.
Hence the test is significant and Parkhill could conclude that a population with a true
mean of 0.04 inch would not produce a sample like this. The project supervisor would
reject the aluminum company’s statement about the mean thickness of the sheets.
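The same decision can be reached on the scale of the raw measurements by converting the critical z values into limits for the sample mean, µH0 ± zα/2 · σ/√n. The following is a small sketch of that conversion for the Parkhill data; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Data from the Parkhill question
n, x_bar, mu0, sigma, alpha = 100, 0.0408, 0.04, 0.004, 0.05

se = sigma / sqrt(n)                              # standard error of the mean, in inches
z_tab = NormalDist().inv_cdf(1 - alpha / 2)       # two-tailed critical value
lower = mu0 - z_tab * se                          # lower acceptance limit
upper = mu0 + z_tab * se                          # upper acceptance limit
reject_h0 = not (lower <= x_bar <= upper)

print(f"accept H0 if {lower:.6f} <= sample mean <= {upper:.6f}")
print(f"sample mean = {x_bar}, reject H0: {reject_h0}")
```

The sample mean of 0.0408 inch lies just above the upper limit, which matches the z-based conclusion that 2 > 1.96.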
Question: A manufacturer of alkaline batteries may want to be reasonably certain that fewer than 5%
of the batteries in its shipments are defective. Suppose 300 batteries are randomly selected
from a very large shipment; each is tested and 10 defective batteries are found. Does this
provide sufficient evidence for the manufacturer to conclude that the fraction defective in the
entire shipment is less than 0.05 at α = 0.01? Also find the observed significance level for
the test.
Solution:
Step-I:
Ho : p = 0.05
HA : p < 0.05 (Here HA shows that the test is one-tailed. )
Step-2:
Here α = 0.01, test is one tailed.
Step-3:
p0 = 10/300 = 0.033
Before conducting the test of hypothesis, we check to determine whether the
sample size is large enough to use the normal approximation for the sampling
distribution of 𝑝0
The criterion is tested by the interval
$$ p_{H_0} \pm 3\sigma_{p} = p_{H_0} \pm 3\sqrt{\frac{p_{H_0}(1 - p_{H_0})}{n}} = 0.05 \pm 3\sqrt{\frac{0.05 \times 0.95}{300}} = 0.05 \pm 0.04, \text{ or } (0.01, 0.09) $$
Since the interval lies within (0, 1), the normal approximation will be adequate.
For pH0 = 0.05, we calculate the z value
$$ z = \frac{p_0 - p_{H_0}}{\sqrt{\dfrac{p_{H_0}(1 - p_{H_0})}{n}}} = \frac{0.033 - 0.05}{\sqrt{\dfrac{0.05 \times 0.95}{300}}} = \frac{-0.017}{0.0126} = -1.35 $$
Step-4: Critical Region:
α = 0.01 and HA shows that the test is one-tailed, therefore the critical region is the left tail,
with area 0.01. From the standard normal probability distribution table we get the
critical value for 0.01:
z0.01 = 2.33, so the critical value is −z0.01 = −2.33 (negative because the region is on the left side)
Rejection Region or Critical Region: z < −z0.01 = −2.33
Step-5: Rejection Rule
Ho : p = 0.05,
If |𝒛𝒄𝒂𝒍 | > |𝒛𝒕𝒂𝒃 |, reject H0 and accept HA
Step-6: Conclusion:
|𝒛𝒄𝒂𝒍 | = 𝟏. 𝟑𝟓 and |𝒛𝒕𝒂𝒃 | = 𝟐. 𝟑𝟑
Here |𝒛𝒄𝒂𝒍 | ≯ |𝒛𝒕𝒂𝒃 |, therefore Ho cannot be rejected.
H0 is not rejected, so HA is not accepted. Hence the test is insignificant and the sample result
does not support the manufacturer's claim.
There is insufficient evidence at the 0.01 level of significance to indicate
that the shipment contains fewer than 5% defective batteries.
Determine observed significance Level (p-value)
For the above example, with the sample of n = 300, we
calculated z = −1.35. The test is left-tailed, so the observed significance level (p-value) for this test is
p-value = P(z < −1.35)
p-value = 0.5 − P(−1.35 ≤ z ≤ 0) = 0.5 − 0.4115 = 0.0885
Here the p-value is greater than the chosen value of α = 0.01 (i.e. the observed significance level is
greater than the significance level), so we do not reject the null hypothesis and do not accept the alternative
hypothesis.
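The whole proportion test, including the check on the normal approximation, can be sketched as follows. This is an illustration rather than the author's own calculation: it keeps the sample proportion unrounded (10/300 = 0.0333...), so it gives z of about −1.32 and a p-value of about 0.093 instead of the −1.35 and 0.0885 obtained above with the rounded value 0.033. The conclusion is the same either way.

```python
from math import sqrt
from statistics import NormalDist

# Data from the battery question
n, defective, p_h0, alpha = 300, 10, 0.05, 0.01
std_normal = NormalDist()

p_hat = defective / n                          # sample proportion, not rounded
se = sqrt(p_h0 * (1 - p_h0) / n)               # standard error under H0

# Normal approximation check: p_h0 +/- 3 standard errors must lie inside (0, 1)
low, high = p_h0 - 3 * se, p_h0 + 3 * se
normal_ok = 0 < low and high < 1

z_cal = (p_hat - p_h0) / se
z_tab = std_normal.inv_cdf(alpha)              # left-tail critical value (negative)
p_val = std_normal.cdf(z_cal)                  # left-tail area below z_cal
reject_h0 = z_cal < z_tab

print(f"interval = ({low:.3f}, {high:.3f}), normal approximation ok: {normal_ok}")
print(f"z_cal = {z_cal:.2f}, z_tab = {z_tab:.2f}, p-value = {p_val:.4f}")
print("reject H0:", reject_h0)
```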