Hypothesis Testing
Null and Alternative Hypothesis
• A hypothesis is a proposed explanation for a phenomenon. For a
hypothesis to be a scientific hypothesis, the scientific method
requires that one can test it.
• Researchers want to support their hypotheses, but the techniques
available to them are better for showing that something is false.
• The logical route is to propose exactly the opposite of what they want
to demonstrate to be true and then disprove or falsify that
hypothesis. What is left (the initial hypothesis) must then be true
(Kranzler & Moursund, 1995).
• Null hypothesis (H0)
The hypothesis predicting that no difference exists between the
groups being compared.
• Alternative hypothesis (Ha), or research hypothesis
The hypothesis that the researcher wants to support, predicting that
a significant difference exists between the groups being compared.
Examples
A researcher wants to examine the relationship between the type of
after-school program attended by a child and the child’s intelligence
level. The researcher is interested in whether students who attend
after-school programs that are academically oriented (math, writing,
computer use) score higher on an intelligence test than students who
do not attend such programs.
H0 : Children who attend academic after-school programs have the
same intelligence level as other children
Ha : Children who attend academic after-school programs have
different IQs than other children
• One-tailed hypothesis (directional hypothesis): An alternative hypothesis in which
the researcher predicts the direction of the expected difference between the
groups.
• Two-tailed Hypothesis (non-directional hypothesis): An alternative hypothesis in
which the researcher predicts that the groups being compared differ but does not
predict the direction of the difference.
• Mathematical Symbols Used in H0 and Ha:
When we use inferential statistics, we are
trying to reject H0, which means that Ha is
supported.
Examples
• We want to test whether the mean GPA of students in Philippine
universities is different from 2.0 (out of 5.0). The null and alternative
hypotheses are:
H0: μ = 2.0
Ha: μ ≠ 2.0
• We want to test if college students take less than four years to
graduate from college, on the average. The null and alternative
hypotheses are:
H0: μ ≥ 4
Ha : μ < 4
Remember
• In a hypothesis test, sample data is evaluated in order to arrive at a
decision about some type of claim. If certain conditions about the sample
are satisfied, then the claim can be evaluated for a population. In a
hypothesis test, we:
• Evaluate the null hypothesis, typically denoted with H0. The null is not
rejected unless the hypothesis test shows otherwise. The null statement
must always contain some form of equality (=, ≤ or ≥)
• Always write the alternative hypothesis, typically denoted with Ha or H1,
using less than, greater than, or not equals symbols, i.e., (≠, >, or <).
• If we reject the null hypothesis, then we can assume there is enough
evidence to support the alternative hypothesis.
• Never state that a claim is proven true or false. Keep in mind the
underlying fact that hypothesis testing is based on probability laws;
therefore, we can talk only in terms of non-absolute certainties.
Type I and Type II Errors in Hypothesis Testing
When you perform a hypothesis test, there are four possible outcomes depending on the actual
truth (or falseness) of the null hypothesis H0 and the decision to reject or not.
The four possible outcomes in the table are:
• The decision is not to reject H0 when H0 is true (correct decision).
• The decision is to reject H0 when H0 is true (incorrect decision known
as a Type I error).
• The decision is not to reject H0 when, in fact, H0 is false (incorrect
decision known as a Type II error).
• The decision is to reject H0 when H0 is false (correct decision whose
probability is called the Power of the Test).
• Each of the errors occurs with a particular probability. The Greek
letters α and β represent the probabilities.
• α = probability of a Type I error = P(Type I error) = probability of
rejecting the null hypothesis when the null hypothesis is true.
• β = probability of a Type II error = P(Type II error) = probability of not
rejecting the null hypothesis when the null hypothesis is false.
• α and β should be as small as possible because they are probabilities
of errors. They are rarely zero.
• The Power of the Test is 1 – β. Ideally, we want a high power that is
as close to one as possible. Increasing the sample size can increase
the Power of the Test.
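The meaning of α can be illustrated with a small simulation. The sketch below assumes an upper-tailed z-test with a known population standard deviation; the function name `type_i_error_rate` and the values μ = 100, σ = 15, n = 30 are illustrative assumptions. Every sample is drawn from a population where H0 is true, so any rejection is a Type I error, and the rejection rate should land near α.

```python
import random
from statistics import NormalDist

def type_i_error_rate(mu=100.0, sigma=15.0, n=30, alpha=0.05,
                      trials=2000, seed=1):
    """Estimate how often a true H0 is rejected (the Type I error rate)."""
    rng = random.Random(seed)
    z_cv = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
    se = sigma / n ** 0.5                   # standard error of the mean
    rejections = 0
    for _ in range(trials):
        # H0 is true here: every sample comes from the null population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu) / se
        if z > z_cv:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimate varies with the seed and the number of trials, but it should stay close to the chosen α of .05.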
Statistical Significance and Errors
Statistical significance: An observed difference between two descriptive
statistics (such as means) that is unlikely to have occurred by chance.
• E.g. statistical significance at the .05 level (also known as the .05 alpha
level)
• To say that a result is statistically significant at the .05 level means that
a difference as large as or larger than what we observed between the sample
and the population would occur by chance 5 times or fewer out of 100 if the
null hypothesis were true. In other words, chance alone is an unlikely
explanation for this result. If chance is an unlikely explanation, then the
result most likely reflects a true or real difference between the groups. If
our result is statistically significant, we can reject the null hypothesis
and conclude that we have observed a significant difference in IQ scores
between the sample and the population.
Remember that when we reject the null
hypothesis:
• Either we are correct
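• Or we are making a Type I error
• Thus, when we adopt a .05 alpha level, we could make a Type I error as
often as 5 times out of 100 when the null hypothesis is true
• The .05 alpha level is then the probability of making a Type I error that
we are willing to accept. It is the cutoff against which the probability
value (p value) is compared: the p value is the probability of obtaining a
result at least as extreme as the one observed if H0 is true, and we reject
H0 when p ≤ .05
• Which type of error, Type I or Type II, do you think researchers consider
more serious? Type I error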
Steps in hypothesis testing
• Specify the Null Hypothesis.
• Specify the Alternative Hypothesis.
• Set the Significance Level.
• Calculate the Test Statistic and Corresponding P-Value.
• Draw a Conclusion.
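The five steps can be sketched in code for the simplest case, a one-sample z-test with a known population standard deviation. The function name `z_test`, its `tail` argument, and the GPA figures in the usage lines (sample mean 2.1, σ = 0.5, n = 100) are hypothetical values chosen for illustration, not data from these notes.

```python
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """One-sample z-test. tail is 'two', 'greater', or 'less'."""
    std_normal = NormalDist()
    se = sigma / n ** 0.5            # standard error of the mean
    z = (sample_mean - mu0) / se     # step 4: test statistic
    if tail == "greater":            # Ha: mu > mu0
        p = 1 - std_normal.cdf(z)
    elif tail == "less":             # Ha: mu < mu0
        p = std_normal.cdf(z)
    elif tail == "two":              # Ha: mu != mu0
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'two', 'greater', or 'less'")
    return z, p, p <= alpha          # step 5: reject H0 when p <= alpha

# Steps 1-3: H0: mu = 2.0, Ha: mu != 2.0, alpha = .05 (hypothetical data).
z, p, reject = z_test(sample_mean=2.1, mu0=2.0, sigma=0.5, n=100)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

Returning the p value alongside the decision keeps the conclusion tied to the chosen significance level instead of hiding it inside the function.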
Inferential Statistics
• Inferential statistics: Procedures for drawing conclusions about a
population based on data collected from a sample.
Inferential Statistical Tests
Key Ingredients
• Z Scores
• The Normal Curve
• Sample and Population
• Probability
• Central Limit Theorem
• The Standard Error
Central Limit Theorem
• The Central Limit Theorem states that the sampling distribution of
the sample means approaches a normal distribution as the sample
size gets larger — no matter what the shape of the population
distribution. This fact holds especially true for sample sizes over 30.
• A theorem which states that for any population with mean μ and
standard deviation σ, the distribution of sample means for
sample size N will have a mean of μ and a standard deviation of
σ/√N and will approach a normal distribution as N approaches
infinity.
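The mean and standard deviation claims in the theorem can be checked with a short simulation. This sketch assumes a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1); the function name `sample_means`, the sample sizes, and the number of draws are arbitrary illustrative choices.

```python
import random
from statistics import mean, pstdev

def sample_means(n, draws=3000, seed=0):
    """Means of `draws` random samples of size n from a skewed population."""
    rng = random.Random(seed)
    # Exponential population with rate 1: mean 1, SD 1, right-skewed.
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(draws)]

results = {}
for n in (5, 30, 100):
    means = sample_means(n)
    results[n] = (mean(means), pstdev(means))
    print(f"N={n:3d}  mean of sample means={results[n][0]:.3f}  "
          f"SD of sample means={results[n][1]:.3f}  "
          f"sigma/sqrt(N)={1 / n ** 0.5:.3f}")
```

The printed columns only compare the mean and spread of the sample means with μ and σ/√N; checking the approach to a normal shape would additionally need a histogram or a normality check.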
Sampling distribution
• Sampling Distribution: A distribution of sample means based on
random samples of a fixed size from a population.
• Standard Error of the Mean: The standard deviation of the sampling
distribution of sample means.
Z test
• Remember that a z-score tells us how many standard deviations
above or below the mean of the distribution an individual score falls.
In a z-test we compare a sample mean with the population mean, so z tells
us how many standard errors the sample mean falls above or below the
population mean.
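This comparison is usually written as a test statistic. In the standard form below, X̄ is the sample mean, μ the population mean, σ the population standard deviation, and N the sample size; the denominator is the standard error of the mean.

```latex
z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}},
\qquad
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}
```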
Calculating a One-tailed z-test
We want to determine whether the sample of children in academic after-school programs represents a
population with a mean IQ higher than the mean IQ of the general population of children. We already
know 𝜇 (100) and 𝜎 (15) for the general population of children. The null and alternative hypotheses for a
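one-tailed test are:
H0: μ_academic ≤ 100
Ha: μ_academic > 100
where μ_academic is the mean IQ of the population of children in academic after-school programs.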
Interpreting one-tailed z test
z critical value, or zcv: the value of a test statistic that marks the
edge of the region of rejection in a sampling distribution. The
region of rejection is the area of a sampling distribution that
lies beyond the test statistic's critical value; when a score falls
within this region, H0 is rejected.
• z (N = 75) = 2.02, p < .05 (one-tailed)
• zobt > zcv ; +2.02 > +1.645
• Therefore, H0 is rejected, which supports our alternative
hypothesis that the sample mean represents a population of
children in academic after-school programs whose mean IQ
is higher than 100.
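The critical value and the p value behind this decision can be reproduced with Python's standard library. This is a minimal sketch that takes the obtained z of +2.02 from the comparison above and assumes α = .05; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal: mean 0, SD 1
alpha = 0.05
z_obt = 2.02                # obtained z from the comparison above

# One-tailed (upper) test: the whole rejection region sits in the right tail.
z_cv = std_normal.inv_cdf(1 - alpha)   # about 1.645
p_value = 1 - std_normal.cdf(z_obt)    # about 0.0217
reject_h0 = z_obt > z_cv

print(f"z_cv = {z_cv:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing z_obt with z_cv and comparing the p value with α are two views of the same decision, so they always agree.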
Two-tailed z test
For example, imagine you are conducting a study to see whether children
in athletic after-school programs weigh less than children in the general
population. What are H0 and Ha for this example?
Assume that the mean weight of children in the general population (𝜇) is 90
pounds, with a standard deviation 𝜎 of 17 pounds. You take a random
sample (N = 50) of children in athletic after-school programs and find a mean
weight (X̄) of 86 pounds. Given this information, you can test the hypothesis
that the sample of children in athletic after-school programs represents a
population with a mean weight that is lower than the mean weight for the
general population of children.
This should get us: a standard error of 17/√50 ≈ 2.40 pounds and z = (86 − 90)/2.40 ≈ −1.66.
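The arithmetic for the weight example can be checked with a short sketch. It uses only the figures given above (μ = 90, σ = 17, N = 50, X̄ = 86) and assumes α = .05; the variable names are illustrative. It also shows how the decision depends on whether the test is treated as one-tailed or two-tailed.

```python
from statistics import NormalDist

mu, sigma = 90.0, 17.0       # general population of children (pounds)
n, sample_mean = 50, 86.0    # athletic after-school sample

se = sigma / n ** 0.5                 # standard error, about 2.404
z_obt = (sample_mean - mu) / se       # about -1.664

std_normal = NormalDist()
p_one = std_normal.cdf(z_obt)         # lower-tail p, about 0.048
p_two = 2 * p_one                     # two-tailed p, about 0.096

z_cv_one = std_normal.inv_cdf(0.05)   # about -1.645 (lower tail)
z_cv_two = std_normal.inv_cdf(0.975)  # about 1.960 (used as +/-)

reject_one = z_obt < z_cv_one
reject_two = abs(z_obt) > z_cv_two
print(f"z = {z_obt:.3f}, one-tailed p = {p_one:.3f}, two-tailed p = {p_two:.3f}")
print(f"reject H0 (one-tailed): {reject_one}, (two-tailed): {reject_two}")
```

With these numbers the directional test rejects H0 at the .05 level, while a two-tailed test on the same data would not, because the two-tailed critical value of ±1.96 is further from zero.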
Statistical Power
• Statistical power refers to the probability of correctly rejecting a false H0.
• How to increase statistical power? Reduce the standard error of the mean.
• Note: as the sample size increases, the standard error of the mean decreases,
so a larger sample makes a true difference easier to detect.
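The link between sample size, standard error, and power can be made concrete for an upper-tailed z-test. This sketch assumes a known σ and a specific true population mean; the function name `power_one_tailed` and the values (μ0 = 100, true mean 105, σ = 15) are hypothetical choices for illustration.

```python
from statistics import NormalDist

def power_one_tailed(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / n ** 0.5                  # shrinks as n grows
    z_cv = std_normal.inv_cdf(1 - alpha)   # critical value under H0
    # H0 is rejected when z > z_cv. Under the true mean, z is normal
    # with mean (mu_true - mu0) / se and standard deviation 1.
    return 1 - std_normal.cdf(z_cv - (mu_true - mu0) / se)

powers = {n: power_one_tailed(100, 105, 15, n) for n in (10, 25, 50, 100)}
for n, pw in powers.items():
    print(f"N={n:3d}  SE={15 / n ** 0.5:.2f}  power={pw:.3f}")
```

When the true mean equals μ0, the same function returns α, which matches the definition of a Type I error as rejecting a true H0.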