CHAPTER 6

NORMAL DISTRIBUTION
Also called the Gaussian distribution.
A continuous probability distribution that describes data that cluster around a mean.

GAUSSIAN FUNCTION / BELL CURVE
The graph of the associated probability density function is bell-shaped, with a peak at the mean.

ABRAHAM DE MOIVRE (1733)
A French mathematician who was the first person to describe what would later be recognized as the normal distribution.
De Moivre introduced the normal distribution in his 1733 paper "The Doctrine of Chances", as part of his work on approximating binomial distributions.
o He showed that as the number of trials in a binomial distribution increased, the shape of the distribution approached a smooth curve.
o This was one of the earliest instances of what later became the Central Limit Theorem (CLT).
He derived the formula for the normal distribution to approximate the probability of the number of heads or tails in a large number of coin tosses, describing the bell-shaped curve.

CARL FRIEDRICH GAUSS (1809)
The normal distribution is often called the Gaussian distribution due to the significant contributions of Carl Friedrich Gauss, a German mathematician and physicist, in the early 19th century.
Gauss developed the normal distribution in his work on the method of least squares in 1809. He used it to analyze astronomical data and measurement errors, discovering that errors in measurements tended to cluster around a mean value with a symmetric, bell-shaped curve, which is now known as the normal distribution.
He showed that this distribution was the best fit for various types of measurement errors, explaining that small errors were more frequent than large ones, and that positive and negative errors were equally probable.

ADRIEN-MARIE LEGENDRE (1805)
Around the same time as Gauss, Adrien-Marie Legendre, another French mathematician, also worked on the method of least squares. His 1805 publication provided the earliest published use of the least squares method.
Though Legendre didn't develop the normal distribution explicitly, his work contributed to the statistical framework that Gauss later expanded.

PIERRE-SIMON LAPLACE (1812)
In the early 19th century, Pierre-Simon Laplace refined de Moivre's ideas, contributing significantly to the theory of probability.
He extended de Moivre's work and provided a general proof of the Central Limit Theorem in 1812, showing that the sum of many independent random variables, no matter the original distribution, tends to form a normal distribution as the number of variables increases.

FRANCIS GALTON (19TH CENTURY)
In the late 19th century, Francis Galton, a British polymath, extended the use of the normal distribution to biological data. He applied it to human characteristics such as height and weight, discovering that many natural phenomena followed a normal distribution pattern.
Galton's work helped popularize the use of the normal distribution in areas beyond physical measurement, and he introduced the concept of regression toward the mean, which also relates to the properties of the normal distribution.

The development of the Central Limit Theorem proved that, under broad conditions, sums or averages of random variables independently drawn from any distribution with finite variance will tend toward a normal distribution, further solidifying its importance.

PROPERTIES OF A NORMAL DISTRIBUTION
A normal distribution is a continuous, symmetric, bell-shaped distribution of a variable.
The known characteristics of the normal curve make it possible to estimate the probability of occurrence of any value of a normally distributed variable.
The properties of the normal distribution are as follows:

1. The distribution is BELL-SHAPED.
2. The mean, median, and mode are EQUAL and are located at the center of the distribution.
3. The normal distribution is UNIMODAL.
4. The normal distribution curve is SYMMETRIC about the mean (the shape is the same on both sides).
5. The normal distribution is CONTINUOUS.
6. The normal curve is ASYMPTOTIC (it never touches the x-axis).
7. The total area under the normal distribution curve is 1.0 or 100%.
8. The area under the part of a normal curve that lies within 1 standard deviation of the mean is about 68%; within 2 standard deviations, about 95%; and within 3 standard deviations, about 99.7%.

STANDARD NORMAL DISTRIBUTION
A normal distribution can be converted into a standard normal distribution by obtaining the z value.
A z-value is the signed distance between a selected value, designated x, and the mean μ, divided by the standard deviation σ.
z-values are also called z scores, z statistics, standard normal deviates, or standard normal values.

In terms of a formula:
z = (x − μ) / σ
where:
z = z value
x = the value of any particular observation or measurement
μ = the mean of the distribution
σ = the standard deviation of the distribution

a. General Procedure.
As you might suspect from the formula for the normal density function, it would be difficult and tedious to do the calculus every time we had a new set of parameters for μ and σ. So instead, we usually work with the standardized normal distribution, where μ = 0 and σ = 1, i.e. N(0, 1).
That is, rather than directly solve a problem involving a normally distributed variable X with mean μ and standard deviation σ, an indirect approach is used.
1. We first convert the problem into an equivalent one dealing with a normal variable measured in standardized deviation units, called a standardized normal variable. To do this, if X ~ N(μ, σ²), then z = (x − μ) / σ ~ N(0, 1).
2. A table of standardized normal values can then be used to obtain an answer in terms of the converted problem.
3. If necessary, we can then convert back to the original units of measurement. To do this, simply note that, if we take the formula for z, multiply both sides by σ, and then add μ to both sides, we get x = zσ + μ.
4. The interpretation of z values is straightforward. Since σ = 1, if z = 2, the corresponding X value is exactly 2 standard deviations above the mean. If z = −1, the corresponding X value is one standard deviation below the mean. If z = 0, X = the mean, i.e. μ.

DETERMINING NORMALITY
The easiest way to determine whether a distribution is approximately normal (bell-shaped) is to draw a histogram for the data and check its shape. If the histogram is not approximately bell-shaped, then the data are not normally distributed.
A histogram, in the context of a normal distribution, is a graphical representation of data that helps visualize the distribution of a dataset. It displays the frequency of data points across different ranges or intervals (bins), and the shape of the histogram helps indicate how the data are distributed.
Skewness can be tested by applying Pearson's index (PI) of skewness:
PI = 3(x̄ − median) / s
If −1 ≤ PI ≤ +1, the data are not significantly skewed; if PI falls outside this range, it can be concluded that the data are significantly skewed.
The data should also be tested for outliers, because even one or two outliers can have a big effect on normality. A common rule flags values below Q1 − 1.5(IQR) or above Q3 + 1.5(IQR) as outliers.

CENTRAL LIMIT THEOREM
The sampling distribution of the mean of a random sample drawn from any population is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of x̄ will look like a normal distribution.
The central limit theorem is used to gain information about the sample mean when the variable is normally distributed or when the sample size is 30 or more.
z = (x̄ − μ) / (σ / √n)
where:
z = z score
x̄ = sample mean
μ = population mean
σ = population standard deviation
n = sample size
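The standardization steps and the Central Limit Theorem above can be sketched in Python. The parameters below (mean 100, standard deviation 15, the value x = 130) are made up for illustration, and the standard library's statistics.NormalDist stands in for a printed z table:

```python
import random
import statistics
from statistics import NormalDist

# Illustrative parameters (not from the text): mu = 100, sigma = 15.
mu, sigma = 100, 15

# Step 1: standardize x = 130 with z = (x - mu) / sigma.
x = 130
z = (x - mu) / sigma            # 2.0: two standard deviations above the mean

# Step 2: NormalDist().cdf plays the role of the standard normal table.
p_below = NormalDist().cdf(z)   # P(X < 130), roughly 0.977

# Step 3: convert back to the original units with x = z*sigma + mu.
x_back = z * sigma + mu         # recovers 130.0

# CLT sketch: means of n = 30 draws from a decidedly non-normal
# (uniform) population still cluster around the population mean 0.5.
random.seed(1)
sample_means = [
    statistics.mean(random.uniform(0, 1) for _ in range(30))
    for _ in range(2000)
]

print(round(z, 2), round(p_below, 3), round(statistics.mean(sample_means), 2))
```

With 2,000 simulated samples, the sample means center on 0.5 with a spread close to σ/√n, as the theorem predicts.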
NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION
The Central Limit Theorem says that as n increases, the binomial distribution with n trials and probability p of success gets closer and closer to a normal distribution.
That is, the binomial probability of any event gets closer and closer to the normal probability of the same event.
The approximating normal distribution has the same mean, μ = np, and standard deviation σ = √(npq).
In addition to the condition that np ≥ 5 and nq ≥ 5, a correction for continuity may be used in the normal approximation.
A correction for continuity is a correction applied when a continuous distribution is used to approximate a discrete distribution. The continuity correction is summarized in Table 6.1:

Binomial Distribution | Normal Distribution Use
1. P(x = a) | P(a − 0.5 < x < a + 0.5)
2. P(x ≥ a) | P(x > a − 0.5)
3. P(x > a) | P(x > a + 0.5)
4. P(x ≤ a) | P(x < a + 0.5)
5. P(x < a) | P(x < a − 0.5)

where μ = np, σ = √(npq), and q = 1 − p.

CHAPTER 7

CONFIDENCE INTERVALS and SAMPLE SIZE

ESTIMATION
One aspect of inferential statistics. It is the process of estimating the value of a parameter from information drawn from a sample.
The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.
We refer to the sample statistic as the ESTIMATOR of the population parameter. The computed sample statistic is called the ESTIMATE.

CONFIDENCE INTERVALS FOR THE MEAN AND SAMPLE SIZE: σ KNOWN

A. Confidence Intervals
Point Estimate
Is the value of a sample statistic that is used to estimate a population parameter.
A good estimator should satisfy these properties:
1) It should be an UNBIASED ESTIMATOR – an estimator of a population parameter whose expected value is equal to that parameter;
2) It should be a CONSISTENT ESTIMATOR – the difference between the estimator and the parameter grows smaller as the sample size grows larger;
3) It should be a RELATIVELY EFFICIENT ESTIMATOR – of two unbiased estimators of a parameter, the one whose variance is smaller.

Interval Estimate of a Parameter
Is an interval or a range of values used to estimate the parameter. This estimate may or may not contain the value of the parameter being estimated.
A degree of confidence (generally a percent) can be assigned before an interval estimate is prepared. For example, a researcher may wish to be 99% confident that the interval contains the true population mean. Another question then arises: Why 99%? Why not 95% or 90%?

Confidence Interval
Each interval is constructed with regard to a given confidence level.
The confidence level associated with a confidence interval states how much confidence we have that the interval contains the true population parameter.
The confidence level is denoted as 1 − α. Remember that α represents the significance level.

Formula for the confidence interval of the mean for a specific α:
x̄ − z_{α/2}(σ/√n) < μ < x̄ + z_{α/2}(σ/√n)
x̄ − z_{α/2}(σ/√n) is called the lower confidence limit (LCL).
x̄ + z_{α/2}(σ/√n) is called the upper confidence limit (UCL).

B. Sample Size Determination
Sample size determination is very much related to estimation.
To get an accurate estimate we need three things: the maximum error of estimate, the population standard deviation, and the degree of confidence.
The formula for the sample size is derived from the maximum error of estimate formula:
E = z_{α/2}(σ/√n)
Formula for n:
n = (z_{α/2} · σ / E)²

CONFIDENCE INTERVALS FOR THE MEAN AND SAMPLE SIZE: σ UNKNOWN
When σ is known, or when σ is unknown and n ≥ 30, the standard normal distribution is used to determine the confidence interval of the mean. In many cases, however, the population standard deviation is unknown and the sample size is less than 30 (n < 30).
In this case, the sample standard deviation can be used in place of the population standard deviation for confidence intervals. The t distribution is then the most appropriate, provided the variable is normally distributed.
Formula for determining a confidence interval about the mean by using the t distribution:
x̄ − t_{α/2}(s/√n) < μ < x̄ + t_{α/2}(s/√n)
Degrees of freedom: n − 1

CONFIDENCE INTERVALS AND SAMPLE SIZE FOR PROPORTION
A proportion can be computed for a population or for a sample.
The following symbols will be used to determine the confidence intervals and sample size for proportions:
p = population proportion
p̂ = sample proportion
X = number of sample units that possess the characteristic of interest
n = sample size

Sample Proportion:
p̂ = X / n
q̂ = (n − X) / n, i.e. q̂ = 1 − p̂

A. Confidence Intervals
To construct a confidence interval about a proportion, one must use the maximum error of estimate, which is:
E = z_{α/2} √(p̂q̂/n)
Confidence intervals about proportions must meet the criteria that np ≥ 5 and nq ≥ 5:
p̂ − z_{α/2} √(p̂q̂/n) < p < p̂ + z_{α/2} √(p̂q̂/n)

B. Sample Size for Proportions
By using the maximum error part of the confidence interval formula, it is possible to determine the size of the sample that must be taken in order to estimate p with a desired accuracy.
The maximum error of estimate for a proportion can be expressed as:
E = z_{α/2} √(p̂q̂/n)  →  E² = (z_{α/2})² (p̂q̂/n)  →  n = p̂q̂ (z_{α/2}/E)²
When applying the formula, we must decide how accurate our answer must be, and we need to set the level of confidence we wish to work with. If there is an indication of the value of p, we apply it for p and take q = 1 − p. Otherwise we assign 0.5 to p, which yields the largest possible sample size.

CHAPTER 8

Hypothesis Testing
was introduced by Sir Ronald A. Fisher, Jerzy Neyman, Karl Pearson and Egon Pearson (Karl Pearson's son).

Ronald A. Fisher
Fisher is often credited with pioneering modern statistical hypothesis testing in the early 20th century.
He introduced the null hypothesis H0 and the idea of significance testing in his 1925 book Statistical Methods for Research Workers.
His approach involved testing a null hypothesis to see whether the observed data could be due to chance, typically by calculating a p-value. If the p-value was below a certain threshold (usually 0.05), the null hypothesis would be rejected.
Fisher's method focused on testing whether experimental results were statistically significant.

Jerzy Neyman and Egon Pearson
Neyman and Pearson extended Fisher's work by formulating the Neyman-Pearson Lemma in the 1930s.
They introduced the concept of alternative hypotheses H1 and emphasized the need to consider both Type I errors (false positives) and Type II errors (false negatives).
Their work laid the foundation for the Neyman-Pearson framework, which includes the concepts of critical regions, power of a test, and accepting or rejecting hypotheses based on error probabilities.
The Neyman-Pearson approach provided a more formal structure for hypothesis testing, focusing on making decisions and balancing error rates.

Karl Pearson
Made significant contributions to statistics, but he is not directly credited with introducing hypothesis testing.
His work laid the groundwork for modern statistical techniques, which influenced later developments in hypothesis testing.

PEARSON'S CHI-SQUARE (χ²) TEST
Karl Pearson is best known for introducing the chi-square (χ²) test in 1900, one of the first formal tests of statistical significance.
This test is used to determine whether there is a significant difference between the expected and observed frequencies in categorical data.
The chi-square test is one of the earliest examples of hypothesis testing, though Pearson's focus was more on goodness-of-fit rather than on formalizing the broader concept of hypothesis testing.

CORRELATION AND REGRESSION
Pearson developed the Pearson correlation coefficient, a measure of the linear relationship between two variables, which is still widely used today.
He also worked extensively on regression analysis, helping to develop methods for predicting and modeling relationships between variables.

BIOMETRIKA
Pearson founded the journal Biometrika, which became a major platform for disseminating new statistical theories and techniques, contributing indirectly to the development of hypothesis testing by fostering a community for statisticians.
While Pearson's work focused on statistical theory and methods, it was his successors, like Ronald A. Fisher, who would go on to formalize the concepts of hypothesis testing that are widely used today.

PROCEDURE IN HYPOTHESIS TESTING
All hypothesis testing situations start with stating the statistical hypothesis. A statistical hypothesis is a conjecture about a population parameter. This conjecture may or may not be true.

A. TWO TYPES OF STATISTICAL HYPOTHESES
1. Null Hypothesis
Symbolized by H0.
Is a statistical hypothesis that assumes that the observation is due to a chance factor.
In hypothesis testing, the null hypothesis is denoted by H0: μ1 = μ2, which states that there is no difference between the two population means (or parameters).
2. Alternative Hypothesis
Symbolized by H1, it is the opposite of the null hypothesis; it shows that the observations are the result of a real effect.
It states that there is a difference between two population means (or parameters). It is denoted by H1: μ1 ≠ μ2.

B. LEVEL OF SIGNIFICANCE
In hypothesis testing, the level of significance refers to the degree of significance at which we accept or reject the null hypothesis.
In hypothesis testing, 100% accuracy is not possible for accepting or rejecting a null hypothesis, so we select a level of significance, usually 1% or 5%.
The level of significance is the maximum probability of committing a Type I error. That is, P(Type I error) = α. This probability is symbolized by α (the Greek letter alpha).

Critical Value
Threshold separating the critical and noncritical regions.
Cutoff that determines whether to reject the null hypothesis.

Critical Region
Range where the test statistic leads to rejecting H0.
If the test statistic falls here, reject the null hypothesis.

Noncritical Region
Range where the test statistic supports retaining H0.
If the test statistic falls here, fail to reject the null hypothesis.

C. ONE-TAILED VERSUS TWO-TAILED TEST
A one-tailed test indicates that the null hypothesis should be rejected when the test value is in the critical region on one side of the mean. It may be either a right-tailed or a left-tailed test, depending on the direction of the inequality in the alternative hypothesis.
In a two-tailed test, on the other hand, the null hypothesis should be rejected when the test value is in either of the two critical regions.

Concept | Two-Tailed Test | Left-Tailed Test | Right-Tailed Test
Signs in the H0 | H0: μ = k | H0: μ = k or H0: μ ≥ k | H0: μ = k or H0: μ ≤ k
Signs in the H1 | H1: μ ≠ k | H1: μ < k | H1: μ > k
Rejection region | In both tails | In the left tail | In the right tail

Common phrasings for each sign:
= : Is equal to; Is the same as; Is exactly the same as
≠ : Is not equal to; Is not the same as; Is different from
< : Is less than; Is lower than; Is decreased
> : Is greater than; Is higher than; Is increased
≥ : Is greater than or equal to; Is at least; Is not less than
≤ : Is less than or equal to; Is at most; Is not more than
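The tail placement for the three test types can be sketched with Python's standard library; α = 0.05 below is an arbitrary choice for illustration:

```python
from statistics import NormalDist

# Illustrative significance level (not prescribed by the text).
alpha = 0.05
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Two-tailed: alpha is split between the two tails -> +/- z_{alpha/2}.
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96

# Left-tailed: all of alpha sits in the left tail -> -z_alpha.
z_left_tailed = std_normal.inv_cdf(alpha)          # about -1.645

# Right-tailed: all of alpha sits in the right tail -> +z_alpha.
z_right_tailed = std_normal.inv_cdf(1 - alpha)     # about +1.645

print(round(z_two_tailed, 3), round(z_left_tailed, 3), round(z_right_tailed, 3))
```

Note that the one-tailed cutoffs (±1.645) are closer to the mean than the two-tailed cutoffs (±1.96), because the whole rejection area is concentrated in a single tail.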
D. THE CRITICAL VALUE APPROACH TO HYPOTHESIS TESTING
The observed value of the statistic (sample observation) is compared to critical values (population observation).
o These critical values are expressed as standard z values.
o For instance, if we use a level of significance of 0.05, the size of the rejection region is 0.05.
If the test is two-tailed, the rejection region is divided into two equal parts (i.e. we divide 0.05 into two equal parts of 0.025 each).
o A rejection region of 0.025 in each tail of the normal distribution results in a cumulative area of 0.025 below the lower critical value in the left tail and a cumulative area of 0.025 above the upper critical value in the right tail.

TYPE I ERROR
Occurs if one rejects the null hypothesis when it is true. In hypothesis testing, a Type I error is denoted by alpha (α). The region of the normal curve that shows the critical region is called the alpha region.

TYPE II ERROR
Occurs if one does not reject the null hypothesis when it is false. In hypothesis testing, Type II errors are denoted by beta (β). The region of the normal curve that shows the acceptance region is called the beta region.

HYPOTHESIS TESTING USING P-VALUE
P-value (or probability value)
is the probability of getting a sample statistic, or a more extreme sample statistic, in the direction of H1 when H0 is true.
We can also say that the p-value is the actual area under the standard normal distribution curve representing the probability of a particular sample statistic or a more extreme sample statistic occurring if H0 is true.

The steps for the p-value method are:
1. State the null hypothesis H0 and the alternative hypothesis H1.
2. Choose the level of significance, α, and the sample size.
3. Determine the test statistic and sampling distribution.
4. Compute the test value.
5. Determine the p-value.
6. Make a statistical decision.
7. State the conclusion.

There is a different decision rule when using the p-value method:
If p-value ≤ α, reject H0; if p-value > α, do not reject H0.

There are also some important guidelines for p-values:
If p-value ≤ 0.01, reject H0; the difference is highly significant.
If p-value > 0.01 and p-value ≤ 0.05, reject H0; the difference is significant.
If p-value > 0.05 and p-value ≤ 0.10, consider the consequences of a Type I error before rejecting H0.
If p-value > 0.10, do not reject H0; the difference is not significant.

ONE SAMPLE Z-TEST
The one sample z test
is a statistical test for the mean of a population.
It is used when n ≥ 30, or when the population is normally distributed and the population standard deviation is known.
The formula for the z test is:
Test value = (observed value − expected value) / standard error
z = (x̄ − μ) / (σ/√n)  or  z = (x̄ − μ) / (s/√n)
where:
z = one sample z test statistic
μ = population mean
x̄ = sample mean
σ = population standard deviation
s = sample standard deviation
n = number of observations in the sample
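A worked sketch of the one sample z test with invented numbers (μ, σ, n, and x̄ below are not from the text), showing that the critical value approach and the p-value approach reach the same decision:

```python
from math import sqrt
from statistics import NormalDist

# Illustrative claim: mu = 100 with known sigma = 12; a sample of
# n = 36 yields x_bar = 103.5. Two-tailed test at alpha = 0.05.
mu, sigma = 100, 12
n, x_bar = 36, 103.5
alpha = 0.05

# Test value: z = (x_bar - mu) / (sigma / sqrt(n)).
z = (x_bar - mu) / (sigma / sqrt(n))   # 3.5 / 2 = 1.75

# Critical value approach: compare |z| with z_{alpha/2}.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
reject_by_critical_value = abs(z) >= z_crit    # 1.75 < 1.96 -> do not reject

# p-value approach: two-tailed p-value = 2 * P(Z > |z|).
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # about 0.080
reject_by_p_value = p_value <= alpha           # 0.080 > 0.05 -> do not reject

print(round(z, 2), round(p_value, 3), reject_by_critical_value)
```

Both approaches fail to reject H0 here, as they must: the p-value falls below α exactly when the test value falls in the critical region.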
ASSUMPTIONS IN ONE SAMPLE Z-TEST
1. Subjects are randomly selected.
2. The population distribution is normal.
3. The population standard deviation should be known.
4. Cases of the samples should be independent.
5. The sample size should be greater than or equal to 30.

PROCEDURE OF ONE SAMPLE Z-TEST:
1. Set up the hypotheses:
H0: μ = specified value
H1: μ ≠, <, > specified value
2. Set the level of significance and determine the critical value of z.
3. Calculate the sample mean for the one sample z test by using the formula:
x̄ = Σx / n
where:
x̄ = sample mean
n = number of observations in the sample
4. Calculate the value of the one sample z test, using the first formula if σ is known, otherwise the second:
z = (x̄ − μ) / (σ/√n)  or  z = (x̄ − μ) / (s/√n)
5. Statistical decision for hypothesis testing:
If z_computed < z_critical, do not reject H0.
If z_computed ≥ z_critical, reject H0.
6. State the conclusion.

ONE SAMPLE T-TEST
The one sample t-test
is a statistical procedure that is used to test the mean difference between a sample and a known value of the population mean.
We draw a random sample from the population and then compare the sample mean with the population mean and make a statistical decision as to whether or not the sample mean is different from the population mean. The sample size should be less than 30.

ASSUMPTIONS IN ONE SAMPLE T-TEST
1. The population must be approximately normally distributed.
2. Samples drawn from the population should be random.
3. Cases of the samples should be independent.
4. The sample size should be less than 30.
5. The population mean should be known.

PROCEDURE OF ONE SAMPLE T-TEST:
1. Set up the hypotheses:
H0: μ = specified value
H1: μ ≠, <, > specified value
2. Set the level of significance, calculate the degrees of freedom, and determine the critical value of t.
3. Calculate the sample mean and sample standard deviation for the one sample t test:
x̄ = Σx / n
s = √[(Σx² − (Σx)²/n) / (n − 1)]
where:
s = sample standard deviation
x̄ = sample mean
n = number of observations in the sample
4. Calculate the value of the one sample t-test, by using the formula:
t = (x̄ − μ) / (s/√n)
where:
t = one sample t-test statistic
μ = population mean
x̄ = sample mean
s = sample standard deviation
n = number of observations in the sample
5. Statistical decision for hypothesis testing:
If t_computed < t_critical, do not reject H0.
If t_computed ≥ t_critical, reject H0.
6. State the conclusion.
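The t-test procedure above can be sketched with invented data; the critical value 2.262 is the tabled t_{α/2} for 9 degrees of freedom at α = 0.05, two-tailed:

```python
from math import sqrt

# Illustrative claim: mu = 50, tested two-tailed at alpha = 0.05
# against made-up data with n = 10 (< 30, so a t test applies).
mu = 50
data = [52, 49, 55, 51, 48, 53, 54, 50, 47, 56]
n = len(data)

# Step 3: sample mean and sample standard deviation (shortcut formula
# s = sqrt((sum(x^2) - (sum x)^2 / n) / (n - 1)) from the text).
x_bar = sum(data) / n
sum_x, sum_x2 = sum(data), sum(x * x for x in data)
s = sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Step 4: t = (x_bar - mu) / (s / sqrt(n)), with df = n - 1 = 9.
t = (x_bar - mu) / (s / sqrt(n))

# Step 5: compare |t| with the tabled critical value t_{0.025, 9} = 2.262.
t_critical = 2.262
reject_h0 = abs(t) >= t_critical

print(round(x_bar, 2), round(s, 3), round(t, 3), reject_h0)
```

Here x̄ = 51.5 and t ≈ 1.567, which is below 2.262, so H0 is not rejected at the 5% level.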