Hypothesis Testing
Agenda
In this session, you will learn to:
• Define sample and population
• Formulate the hypothesis
• Select an appropriate test
• Choose the level of significance
• Calculate the test statistic
• Determine the probability
• Compare the probability and make a decision
Data Statistics
STATISTICS: The branch of mathematics that transforms data into useful information for decision makers.
DESCRIPTIVE STATISTICS
• Collect
• Organize
• Characterize
• Present data
INFERENTIAL STATISTICS
• Make inferences
• Hypothesis testing
• Determine relationships
• Make predictions
Dinesh Babu-Confidential@Copyright 2018
Descriptive Statistics
Descriptive statistics are methods for
organizing and summarizing data.
Example: Tables or graphs are used to organize
data, and descriptive values such as the average
score are used to summarize data.
A descriptive value for a population is called
a parameter and a descriptive value for a
sample is called a statistic.
Inferential Statistics
Inferential statistics are methods for using sample data to make general
conclusions (inferences) about populations.
Because a sample is typically only a part of the whole population, sample data provide only limited information about the population. As a result, sample statistics are generally imperfect representatives of the corresponding population parameters.
[Figure: a SAMPLE drawn from the POPULATION, with INFERENCE running from the sample back to the population]
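The gap between a sample statistic and the population parameter can be seen in a small simulation. This is only a sketch: the population of 10,000 scores, its mean of 70 and spread of 10, the sample size of 50 and the random seed are all made-up illustrative values, not data from this session.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated scores (illustrative data only).
rng = random.Random(42)
population = [rng.gauss(70, 10) for _ in range(10_000)]

# Parameter: a descriptive value computed on the whole population.
population_mean = statistics.fmean(population)

# Statistic: the same descriptive value computed on a sample of 50.
sample = rng.sample(population, 50)
sample_mean = statistics.fmean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
print(f"difference:                  {sample_mean - population_mean:+.2f}")
```

The sample mean will usually be close to, but not equal to, the population mean, which is why inference needs a formal test.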
General Example
What are these statements? How were they proved or disproved?
• Continents do not move
• 10,000 hours of appropriately guided practice is “the magic number of greatness”
• Earth is flat
• Stress causes ulcers
• Men are better drivers than women
• Earth is the center of the universe
Business Examples
No difference in performance of the sales
team across geographies and product lines
Change in gas price will have no impact on
losses in automotive finance
Change in CEO will have no impact on the
stock price
Real estate yields are the same in all metros
Compensation changes will not impact
attrition
What is a Hypothesis?
A hypothesis is a tentative explanation for certain behaviors, phenomena or events that have occurred or will occur.
• A statistical hypothesis is an assertion concerning one or more
populations
– An educated guess
– A claim or statement about a property of a population
• The goal in Hypothesis Testing is to analyze a sample in an attempt to
distinguish between population characteristics that are likely to occur
and population characteristics that are unlikely to occur.
A hypothesis is a claim (assumption) about a population parameter
Process of Hypothesis Testing
1. Formulate the null hypothesis and the alternative hypothesis.
2. Select the appropriate test statistic.
3. Choose the level of significance, α, and the degrees of freedom.
4. Compute the calculated value of the test statistic.
5. Compute the table (critical) value of the test statistic.
6. Compare the calculated value and the table value.
7. Make the statistical decision and state the managerial conclusion.
Step 1: Hypothesis Formulation
The Null Hypothesis, H0
States the claim or assertion to be tested
• Example: The average number of TV sets in U.S. homes is equal to three (H0: μ = 3)
• Is always about a population parameter, not about a sample statistic
Correct: H0: μ = 3. Incorrect: H0: X̄ = 3
• Always contains the “=”, “≤” or “≥” sign
The Alternative Hypothesis, H1
• Is the opposite of the null hypothesis
– E.g.: The average number of TV sets in U.S. homes is not
equal to 3 ( H1: μ ≠ 3 )
• Challenges the status quo
• Never contains the “=”, “≤” or “≥” sign
• May or may not be proven
• Is generally the hypothesis that the researcher is trying to
prove
Hypothesis Testing
Null Hypothesis
• Statement about the value of a population parameter
• Represented by H0
• Always stated with an equality (=, ≤ or ≥)
Alternative Hypothesis
• Statement about the value of a population parameter that must be true if the null hypothesis is false
• Represented by H1
• Stated in one of three forms: >, < or ≠
Hypothesis Tests for the Mean
Choice of test statistic:
• σ known (Z test)
• σ unknown (t test)
Choice of tails:
• Two-tail test
• One-tail test (left-tail test or right-tail test)
In a two-tail test, there is a rejection region in both tails
In a one-tail test, there is a rejection region in either the right tail or the left tail
Two Tail Test
Two-Tailed Tests
Test where the region of
rejection is on both sides of
the sampling distribution.
[Figure: distribution with defect (rejection) regions below 60 and above 80]
Speed limit on a freeway: 60 – 80 mph (acceptable range of values).
Region of rejection would be numbers from both sides of the distribution, that is, both <60 and >80 are defects.
Level of Significance and the Rejection Region
H0: μ = 3
H1: μ ≠ 3
Level of significance = α
[Figure: rejection regions of area α/2 in each tail, separated from the non-rejection region by the critical values]
This is a two-tail test because there is a rejection region in both tails
Z Test of Hypothesis for the Mean (σ Known)
Convert the sample statistic (X̄) to a ZSTAT test statistic.
Hypothesis tests for μ: σ known (Z test) or σ unknown (t test). Here σ is known, so the Z test applies.
The test statistic is:
$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Two-Tail Tests - Mean (σ Known)
There are two cutoff values (critical values), defining the regions of rejection.
H0: μ = 3
H1: μ ≠ 3
[Figure: sampling distribution of X̄ centered at 3, with area α/2 in each tail. Reject H0 below the lower critical value -Zα/2 or above the upper critical value +Zα/2; do not reject H0 in between.]
t Test of Hypothesis for the Mean (σ Unknown)
When the population standard deviation σ is unknown, convert the sample statistic (X̄) to a tSTAT test statistic, using the sample standard deviation S in place of σ.
The test statistic is:
$$t_{STAT} = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$
Example: Two-Tail Test (σ Unknown)
• The average cost of a hotel room in New York is said to be $168 per night.
• To determine if this is true, a random sample of 25 hotels is taken and resulted in an X̄ of $172.50 and an S of $15.40.
• Test the appropriate hypotheses at α = 0.05.
H0: μ = 168
H1: μ ≠ 168
• (Assume the population distribution is normal)
Example Solution: Two-Tail t Test
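H0: μ = 168
H1: μ ≠ 168
• α = 0.05
• n = 25, df = 25 - 1 = 24
• σ is unknown, so use a t statistic
• Critical value: ±t24,0.025 = ±2.0639
• tSTAT = (172.50 - 168) / (15.40 / √25) = 4.50 / 3.08 = 1.46
[Figure: t distribution with rejection regions of area α/2 = 0.025 below -2.0639 and above 2.0639; tSTAT = 1.46 falls in the non-rejection region]
Do not reject H0: insufficient evidence that the true mean cost is different from $168
Table value = 2.0639; calculated value = 1.46. Since the calculated value is smaller in absolute value than the table value, do not reject H0.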
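The hotel example can be checked with a few lines of Python. This is a minimal sketch using only the numbers given in the example; the critical value 2.0639 is copied from the t table rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Values from the hotel example.
x_bar = 172.50    # sample mean
mu_0 = 168        # hypothesized population mean (H0)
s = 15.40         # sample standard deviation
n = 25            # sample size
t_table = 2.0639  # two-tail critical value for df = 24, alpha = 0.05 (from the t table)

std_error = s / sqrt(n)              # standard error of the mean
t_stat = (x_bar - mu_0) / std_error  # calculated value of the test statistic
df = n - 1                           # degrees of freedom

# Two-tail rule: reject H0 only if |t_stat| exceeds the table value.
reject_h0 = abs(t_stat) > t_table

print(f"standard error = {std_error:.2f}, t_stat = {t_stat:.2f}, df = {df}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```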
Two-Tail t Test (Table Value or Critical Value)
[Table: t distribution critical values for two-tail tests]
Example Two-Tail t Test Using A p-value from Excel
• Since this is a t-test we cannot calculate the p-value without some
calculation aid.
• The Excel output below does this:
t Test for the Hypothesis of the Mean
Data
  Null Hypothesis µ =            $168.00
  Level of Significance          0.05
  Sample Size                    25
  Sample Mean                    $172.50
  Sample Standard Deviation      $15.40
Intermediate Calculations
  Standard Error of the Mean     $3.08      =B8/SQRT(B6)
  Degrees of Freedom             24         =B6-1
  t Test Statistic               1.46       =(B7-B4)/B11
Two-Tail Test
  Lower Critical Value           -2.0639    =-TINV(B5,B12)
  Upper Critical Value           2.0639     =TINV(B5,B12)
  p-value                        0.157      =TDIST(ABS(B13),B12,2)
  Decision                       Do not reject null hypothesis    =IF(B18<B5, "Reject null hypothesis", "Do not reject null hypothesis")
Since the p-value (0.157) > α (0.05), do not reject H0.
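If Excel is not at hand, the two-tail p-value can be approximated in plain Python. The sketch below is not the method used on the slide: it integrates the t density numerically with Simpson's rule. The function names `t_pdf` and `two_tail_p_value` and the step count are illustrative choices.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tail_p_value(t_stat, df, steps=2000):
    """Two-tail p-value, integrating the density on [0, |t|] with Simpson's rule.

    steps must be even for Simpson's rule.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3      # P(0 < T < |t|)
    return 2 * (0.5 - central_half)   # area in both tails beyond |t|

p_value = two_tail_p_value(1.461, 24)
alpha = 0.05
print(f"p-value = {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

This plays the same role as the TDIST(ABS(t), df, 2) cell in the worksheet.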
One Tail Test
One-Tailed Tests
Test where the region of rejection is on only one side of the sampling distribution.
[Figure: distribution with an acceptable region up to 10 and a reject/defect region above 10; axis marks at 6, 10 and 12]
Null Hypothesis: Response time to customer query ≤ 10 minutes
Alternative Hypothesis: Response time > 10 minutes
Region of rejection would be the numbers greater than 10 (there is no bound on the lesser time interval)
One-Tail Tests
In many cases, the alternative hypothesis focuses on a particular direction
Left Tail Test
H0: μ ≥ 3
H1: μ < 3
This is a lower-tail test since the alternative hypothesis is focused on the lower tail below the mean of 3
Right Tail Test
H0: μ ≤ 3
H1: μ > 3
This is an upper-tail test since the alternative hypothesis is focused on the upper tail above the mean of 3
In a one-tail test, there is a rejection region in either the right tail or the left tail
Lower-Tail Tests -> One Tail Test -> Left Tail Test
There is only one critical value, since the rejection area is in only one tail.
H0: μ ≥ 3
H1: μ < 3
[Figure: sampling distribution with the rejection region in the lower tail. Reject H0 below the critical value -Zα or -tα; do not reject H0 above it.]
Upper-Tail Tests (One Tail Test) -> Right Tail
There is only one critical value, since the rejection area is in only one tail.
H0: μ ≤ 3
H1: μ > 3
[Figure: sampling distribution with the rejection region in the upper tail. Reject H0 above the critical value Zα or tα; do not reject H0 below it.]
Example: Upper-Tail t Test for Mean (σ unknown)
• A phone industry manager thinks that customer monthly cell phone bills have increased, and now average over $52 per month.
• The company wishes to test this claim.
• Assume a normal population
Form hypothesis test:
H0: μ ≤ 52 (the average is not over $52 per month)
H1: μ > 52 (the average is greater than $52 per month, i.e., sufficient evidence exists to support the manager’s claim)
Example: Find Rejection Region
• Suppose that α = 0.10 is chosen for this test and n = 25.
• Find the rejection region:
[Figure: t distribution with df = 24 and an upper-tail rejection region of area α = 0.10 above the critical value 1.318]
Reject H0 if tSTAT > 1.318
One-Tail t Test (Table Value or Critical Value)
[Table: t distribution critical values for one-tail tests]
Example: Decisions
Reach a decision and interpret the result
[Figure: upper-tail rejection region of area α = 0.10 above the critical value 1.318; tSTAT = 0.55 falls in the non-rejection region]
Do not reject H0 since tSTAT = 0.55 ≤ 1.318; there is not sufficient evidence that the mean bill is over $52
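The one-tail decision rules can be written as a small helper. This is a sketch under stated assumptions: `one_tail_decision` is a hypothetical function name, and the critical value is passed in as a positive table value (1.318 here, read from the t table) rather than computed.

```python
def one_tail_decision(stat, critical, tail):
    """Apply a one-tail decision rule.

    stat     : calculated Z or t statistic
    critical : positive table value (Z_alpha or t_alpha)
    tail     : "upper" rejects when stat > critical,
               "lower" rejects when stat < -critical
    """
    if tail == "upper":
        reject = stat > critical
    elif tail == "lower":
        reject = stat < -critical
    else:
        raise ValueError("tail must be 'upper' or 'lower'")
    return "Reject H0" if reject else "Do not reject H0"

# Phone bill example: tSTAT = 0.55, critical value 1.318, upper-tail test.
print(one_tail_decision(0.55, 1.318, "upper"))
```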
Step 3: Choose a Level of Significance
• The level of significance, α, is also called the error rate: the probability of rejecting a null hypothesis that is actually true
Scenarios:
• If α = 0.01, there is a 1% risk of wrongly rejecting a true H0 (99% confidence level)
• If α = 0.05, there is a 5% risk of wrongly rejecting a true H0 (95% confidence level)
• If α = 0.10, there is a 10% risk of wrongly rejecting a true H0 (90% confidence level)
Key Terms
Two key terms that you need to understand in Hypothesis Testing are:
Confidence Interval: Measure for reliability of an estimate; a sample is used for estimating a population parameter, so we need to know the reliability of that estimate
Degrees of Freedom: Number of values that are free to vary in a study
Confidence Interval
Confidence interval:
• Describes the reliability of an estimate
• Range of values (lower and upper boundary) within which the population parameter is included
• Width of the interval indicates the uncertainty associated with the estimate
Confidence level:
• Probability associated with the confidence interval
Example 1: Confidence Interval
“Mean energy consumption of various houses in a colony is 200 units with a
Standard Deviation of 20 units. ”
What does this
mean?
Discussion (Cont’d)
SOLUTION:
If the mean energy consumption of various houses in a colony is 200 units with a
standard deviation of 20 units, it means that:
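• About 68.2% of households consume between 180 and 220 units (mean ± 1 standard deviation)
• About 99.7% of households consume between 140 and 260 units (mean ± 3 standard deviations)
Thus, assuming consumption is normally distributed, for any given household in the colony there is about a 99.7% probability that its energy consumption is between 140 and 260 units.
[Figure: normal curve with axis marks at 140, 160, 180, 200, 220, 240 and 260, spaced 20 units (one standard deviation) apart]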
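The shares within one and three standard deviations can be verified with Python's standard library. The sketch assumes, as the example implicitly does, that household consumption follows a normal distribution; `share_between` is an illustrative helper name.

```python
from statistics import NormalDist

# Values from the example: mean 200 units, standard deviation 20 units.
# Assumption: household consumption is normally distributed.
consumption = NormalDist(mu=200, sigma=20)

def share_between(low, high):
    """Fraction of households expected to consume between low and high units."""
    return consumption.cdf(high) - consumption.cdf(low)

within_1_sd = share_between(180, 220)   # mean plus or minus 1 standard deviation
within_3_sd = share_between(140, 260)   # mean plus or minus 3 standard deviations

print(f"180 to 220 units: {within_1_sd:.1%}")
print(f"140 to 260 units: {within_3_sd:.1%}")
```

Three standard deviations on either side cover roughly 99.7% of households, slightly more than a round 99%.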
Example 2: Confidence Interval
• Suppose the mean demand for computers during assembly lead time is 350 units. Our operations manager wants to know whether the mean is different from 350 units.
Null hypothesis: H0: μ = 350
Thus, our research hypothesis becomes: H1: μ ≠ 350
• Recall that the standard deviation [σ] was assumed to be 75, the sample size [n] was 25, and the sample mean was calculated to be 370.16
Confidence Interval Example
The testing procedure begins with the assumption that the null hypothesis is
true
Thus, until we have further statistical evidence, we will assume:
H0: μ = 350 (assumed to be TRUE)
The next step will be to determine the sampling distribution of the sample mean assuming the true mean is 350.
X̄ is normal with mean 350 and standard deviation 75/SQRT(25) = 15
Critical Value Approach
• If we define the "guts" as the center 95% of the distribution [this means α = 0.05], then the critical values that define the guts will be 1.96 standard deviations of X̄ on either side of the mean of the sampling distribution [350]:
• Upper Critical Value (UCV) = Mean + (Table Value) × Std Error = 350 + 1.96 × 15 = 350 + 29.4 = 379.4
• Lower Critical Value (LCV) = Mean - (Table Value) × Std Error = 350 - 1.96 × 15 = 350 - 29.4 = 320.6
• Table value (Z table, two-tail, α = 0.05) = 1.96. Because σ is known, the Z table is used and degrees of freedom do not apply.
Unstandardized Test Statistic Approach
Test Statistic: X̄ = 370.16
• Since LCV (320.6) < X̄ (370.16) < UCV (379.4), the sample mean falls inside the non-rejection region, so we do not reject the null hypothesis at a 5% level of significance.
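The critical values for the computer demand example can be reproduced as follows. This is a sketch using the numbers stated in the example; the table value is taken from the standard normal distribution because σ is assumed known, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 350       # hypothesized mean demand (H0)
sigma = 75       # population standard deviation, assumed known
n = 25           # sample size
x_bar = 370.16   # observed sample mean
alpha = 0.05     # level of significance

std_error = sigma / sqrt(n)                    # 75 / 5 = 15
z_table = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a two-tail test

lcv = mu_0 - z_table * std_error   # lower critical value
ucv = mu_0 + z_table * std_error   # upper critical value

# Reject H0 only if the sample mean falls outside [LCV, UCV].
reject_h0 = x_bar < lcv or x_bar > ucv

print(f"LCV = {lcv:.1f}, UCV = {ucv:.1f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The sample mean of 370.16 lies between the two critical values, so the data do not contradict a mean demand of 350 units at the 5% level.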
Degrees of Freedom
Degrees of Freedom is the measure of number of values in a study that are free to vary.
For example, if you have to take ten different courses to graduate, and only ten
different courses are offered, then you have nine degrees of freedom.
In nine semesters, you will be able to choose which class to take. In the tenth semester, there will only be one class left to take, so there is no choice.
[Figure: ten semesters, SEM 1 to SEM 10, arranged around the label "Nine Degrees of Freedom"]
Degrees of freedom = number of values - 1 = 10 - 1 = 9
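The same idea applies to numbers: once the mean of n values is fixed, only n - 1 of them can be chosen freely. The sketch below uses made-up values and a hypothetical helper `last_value` to show that the tenth value is forced.

```python
def last_value(known_values, mean, n):
    """Return the one value that is forced once n - 1 values and the mean are fixed."""
    if len(known_values) != n - 1:
        raise ValueError("expected exactly n - 1 freely chosen values")
    return mean * n - sum(known_values)

# Hypothetical data: nine freely chosen values, and a required mean of 5 over ten values.
free_values = [4, 7, 5, 3, 6, 5, 8, 2, 5]
forced = last_value(free_values, mean=5, n=10)

degrees_of_freedom = len(free_values)   # n - 1 = 9
print(f"forced tenth value = {forced}, degrees of freedom = {degrees_of_freedom}")
```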
Critical Value Approach to Testing
Test the claim that the true mean # of TV sets in US homes is equal to 3.
(Assume σ = 0.8)
1. State the appropriate null and alternative hypotheses
• H0: μ = 3 H1: μ ≠ 3 (This is a two-tail test)
2. Specify the desired level of significance and the sample size
• Suppose that α = 0.05 and n = 100 are chosen for this test.
• σ is assumed known so this is a Z test.
3. Collect the data and compute the test statistic
• Suppose the sample results are n = 100, X̄ = 2.84 (σ = 0.8 is assumed known)
So the test statistic is:
$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$$
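The arithmetic for the TV sets example can be checked with a short function. This is a minimal sketch; `z_stat` is an illustrative name, and the inputs are the values stated in the example.

```python
from math import sqrt

def z_stat(x_bar, mu_0, sigma, n):
    """Z test statistic for a mean when the population sigma is known."""
    return (x_bar - mu_0) / (sigma / sqrt(n))

# TV sets example: sample mean 2.84, hypothesized mean 3, sigma 0.8, n = 100.
z = z_stat(x_bar=2.84, mu_0=3, sigma=0.8, n=100)
print(f"Z_STAT = {z:.2f}")
```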
Hypothesis Comparison
Is the test statistic in the rejection region?
Reject H0 if ZSTAT < -1.96 or ZSTAT > 1.96; otherwise do not reject H0.
[Figure: standard normal curve with rejection regions of area α/2 = 0.025 below -Zα/2 = -1.96 and above +Zα/2 = +1.96]
Here, ZSTAT = -2.0 < -1.96, so the test statistic is in the rejection region.
Hypothesis Conclusion
• In a two-tail test, if the absolute value of the calculated test statistic is greater than the critical (table) value, then the null hypothesis is rejected.
Reach a decision and interpret the result
[Figure: rejection regions of area α/2 = 0.05/2 = 0.025 beyond -Zα/2 = -1.96 and +Zα/2 = +1.96; ZSTAT = -2.0 falls in the lower rejection region]
Since ZSTAT = -2.0 < -1.96, reject the null hypothesis and conclude there is sufficient evidence that the mean number of TVs in US homes is not equal to 3
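The two-tail critical value rule can be sketched in code as below. The function name `two_tail_decision` is illustrative, and the critical value is computed from the standard normal distribution instead of being read from a table. The second call uses a made-up statistic of 1.5 to show the other outcome.

```python
from statistics import NormalDist

def two_tail_decision(z_stat, alpha):
    """Compare |Z_STAT| with the two-tail critical value Z_(alpha/2)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    if abs(z_stat) > z_crit:
        return "Reject H0"
    return "Do not reject H0"

print(two_tail_decision(-2.0, 0.05))   # TV sets example
print(two_tail_decision(1.5, 0.05))    # hypothetical statistic inside the critical values
```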
P-Value Approach to Testing
• P-value is the probability of obtaining a test statistic at least as extreme as the
one that was actually observed, assuming that the Null Hypothesis is true
• When the P-value is less than the chosen significance level (often 0.05), you "reject the null hypothesis". This indicates that the observed result is unlikely to be due to chance alone and is attributed to a true difference.
P-value < 0.05 → reject the null hypothesis → result is attributed to a true difference
• Compare the p-value with α = 0.05
– If p-value < 0.05, reject H0
– If p-value ≥ 0.05, do not reject H0
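For a Z test, the two-tail p-value follows directly from the standard normal distribution. The sketch below applies it to the TV sets statistic of -2.0; `two_tail_p_value` is an illustrative name.

```python
from statistics import NormalDist

def two_tail_p_value(z_stat):
    """Probability of a Z statistic at least as extreme as z_stat, in either tail."""
    return 2 * NormalDist().cdf(-abs(z_stat))

alpha = 0.05
p_value = two_tail_p_value(-2.0)   # TV sets example

print(f"p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

The p-value approach and the critical value approach lead to the same decision for the same α.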
Errors in Hypothesis Testing
Possible Errors in Hypothesis Test Decision Making
Type I Error
• Reject a true null hypothesis
• Considered a serious type of error
• The probability of a Type I Error is α
• Called level of significance of the test
• Set by researcher in advance
Type II Error
• Failure to reject a false null hypothesis
• The probability of a Type II Error is β
Possible Scenarios in Hypothesis Testing
Four possible scenarios:
                               Truth about the population
Decision based on sample       H0 true             Ha true
Reject H0                      Type I error        Correct decision
Accept H0                      Correct decision    Type II error
Type I Error (α): Reject the Null Hypothesis when it is true
Type II Error (β): Accept the Null Hypothesis when it is false
DINESH BABU - COPYRIGHT - EMAIL:RRRDINESH88@[Link]
Example
• A criminal trial is an example of hypothesis testing without the statistics.
• In a trial a jury must decide between two hypotheses. The null hypothesis is:
H0: The defendant is innocent
• The alternative hypothesis or research hypothesis is:
H1: The defendant is guilty
• The jury does not know which hypothesis is true. They must make a
decision on the basis of evidence presented.
Justice
Null Hypothesis = “Person is innocent”
                  Decision
True State        Prison              Set free
Innocent          Type I error        Correct decision
Guilty            Correct decision    Type II error
Testing of hypotheses
Type I and Type II Errors. Example
Suppose there is a test for a particular disease.
If the disease really exists and is diagnosed early, it can be
successfully treated
If it is not diagnosed and treated, the person will become
severely disabled
If a person is erroneously diagnosed as having the disease and
treated, no physical damage is done.
Which type of error are you willing to risk?
Testing of hypotheses
Type I and Type II Errors. Example.
Null hypothesis: the person does not have the disease
• Not diagnosed, no disease: OK
• Not diagnosed, disease: Type II error (irreparable damage would be done)
• Diagnosed, no disease: Type I error (treated but not harmed by the treatment)
• Diagnosed, disease: OK
Decision: to avoid a Type II error, use a high level of significance
How Do We Control Type I Errors?
• The Type I error rate is controlled by the researcher.
• It is called the alpha rate, and corresponds to the probability cut-off that one uses in a significance test.
• By convention, researchers use an alpha rate of .05. In other words,
they will only reject the null hypothesis when a statistic is likely to
occur 5% of the time or less when the null hypothesis is true.
• In principle, any probability value could be chosen for making the
accept/reject decision. 5% is used by convention.
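A simulation can show what an alpha rate of .05 means in practice: when the null hypothesis is true, about 5% of tests still reject it. This is only a sketch. The population values (mean 3.0, σ 0.8), the sample size, the number of trials and the seed are illustrative assumptions, and `type_i_error_rate` is a hypothetical helper.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(alpha, trials=2000, n=25, mu_0=3.0, sigma=0.8, seed=1):
    """Share of two-tail Z tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population whose true mean equals mu_0.
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu_0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1   # a Type I error: a true H0 was rejected
    return rejections / trials

rate = type_i_error_rate(0.05)
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rate should land near the chosen α, which is the sense in which the researcher controls it.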
How Do We Control Type II Errors?
All else equal,
– β increases when the difference between the hypothesized parameter and its true value decreases
Type I & II Error Relationship
Type I and Type II errors cannot happen at the same time
• A Type I error can only occur if H0 is true
• A Type II error can only occur if H0 is false
All else equal, if the Type I error probability (α) increases, then the Type II error probability (β) decreases
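The trade-off between α and β can be made concrete for a two-tail Z test. The sketch below computes β analytically from the standard normal distribution. σ = 0.8 and n = 100 are borrowed from the TV sets example, while the true differences of 0.2 and 0.1 are made-up values; `type_ii_error` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, true_diff, sigma, n):
    """Beta for a two-tail Z test when the true mean is mu_0 + true_diff."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = true_diff / (sigma / sqrt(n))   # where Z_STAT is centered under the true mean
    # Probability that Z_STAT still lands between the two critical values.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

# Larger alpha gives smaller beta (true difference fixed at a hypothetical 0.2).
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}  beta = {type_ii_error(alpha, 0.2, 0.8, 100):.3f}")

# Smaller true difference gives larger beta (alpha fixed at 0.05).
for diff in (0.2, 0.1):
    print(f"difference = {diff}  beta = {type_ii_error(0.05, diff, 0.8, 100):.3f}")
```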
Thank You
For Your
Attention