Topic four:
SAMPLING DISTRIBUTIONS
January 22, 2025 Prepared by D. Ringo 1
Introduction
A sampling distribution is a theoretical or hypothetical distribution that
describes the likelihood of obtaining various sample statistics from a
population.
It is derived from repeatedly sampling from the same population and
calculating a particular statistic for each sample.
In statistics, one common example is the sampling distribution of the
mean.
If you were to take multiple random samples from a population and
calculate the mean of each sample, the sampling distribution of the mean
would show the distribution of those sample means.
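As a small illustration of the idea above, the sketch below builds an exact sampling distribution of the mean by listing every possible sample. The population values `[2, 4, 6, 8]` and the sample size of 2 are assumptions chosen only to keep the enumeration short; they do not come from the slides.

```python
from collections import Counter
from itertools import product
from statistics import mean

population = [2, 4, 6, 8]  # hypothetical population, for illustration only
n = 2                      # sample size

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

# The sampling distribution: each possible sample mean with its probability.
counts = Counter(sample_means)
distribution = {m: c / len(samples) for m, c in sorted(counts.items())}

for m, prob in distribution.items():
    print(f"sample mean = {m}: probability = {prob:.4f}")

# The mean of all sample means coincides with the population mean.
print("mean of sample means:", mean(sample_means))
print("population mean:", mean(population))
```

With a real population the full enumeration is rarely feasible, which is why the sampling distribution is described as theoretical.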
Sampling/ Standard Error
Sampling error is the difference between a sample statistic and the
corresponding population parameter.
It occurs because we are often unable to collect data from an entire
population and must rely on a sample.
Since a sample is only a subset of the population, the characteristics of
the sample may not perfectly represent the characteristics of the entire
population.
Sampling error can be influenced by various factors, including the size of
the sample and the variability within the population.
It is a natural part of the sampling process and is expected to some extent
in all research.
It's important to note that sampling error is not the result of mistakes or
errors in the sampling process but rather reflects the inherent variability
between samples and populations.
To mitigate sampling error, researchers often use techniques like random
sampling and larger sample sizes, which can help improve the accuracy of
estimates and reduce the impact of sampling error on statistical
inferences.
The sampling error at a given confidence level (also called the margin of
error) is calculated by dividing the standard deviation of the population
(σ) by the square root of the sample size (n) and then multiplying the
result by the Z-score that corresponds to the chosen confidence level.
The quantity σ/√n on its own is the standard error of the mean.
Solved Example
Suppose that the population standard deviation is 0.40 and the sample size
is 2500. Find the sampling error at the 95% confidence level.
Solution:
From the given data,
σ = 0.40
Sample size = n = 2500
Value of z at the 95% confidence level = 1.96
Sampling error = z × σ/√n
= 1.96 × 0.40/√(2500)
= 1.96 × 0.40/50
= 0.01568
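The arithmetic in the solved example can be checked with a few lines of Python. This is only a sketch; the function name `sampling_error` is a hypothetical helper introduced here, not something defined in the slides.

```python
import math


def sampling_error(z, sigma, n):
    """Sampling error at a given confidence level: z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)


# Values taken from the solved example above.
error = sampling_error(z=1.96, sigma=0.40, n=2500)
print(round(error, 5))
```

The result agrees with the hand calculation, 0.01568.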
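Sampling Proportions
The sample proportion is the number of observations of interest, $x$,
divided by the sample size, $n$:
$\hat{p} = \frac{x}{n}$
Standard Error of Sampling Proportions
$SE(\hat{p}) = \sqrt{\frac{P(1 - P)}{n}}$
where $P$ is the population proportion.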
Central Limit Theorem
The Central Limit Theorem is a fundamental concept in statistics that
describes the shape of the sampling distribution of the sample mean (or
other sample statistics) when drawing repeated random samples from a
population, regardless of the shape of the population distribution.
The Central Limit Theorem states that, as the sample size increases, the
sampling distribution of the sample mean becomes approximately
normally distributed, regardless of the shape of the original population
distribution. This holds true under certain conditions, even if the
population distribution is not normal.
This means that when the sample size (n) ≥ 30, the sampling distribution
of the mean is approximately
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
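The theorem can be explored with a short simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (strongly right-skewed), a sample size of 30, and 5000 repeated samples; all of these are illustrative choices, not values from the slides.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so that the run is repeatable

n = 30              # size of each sample
num_samples = 5000  # number of repeated samples

# Draw repeated samples from a skewed (exponential) population
# and record the mean of each sample.
sample_means = []
for _ in range(num_samples):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = 1.0 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1

print("mean of sample means:", round(mean_of_means, 3))
print("std. dev. of sample means:", round(sd_of_means, 3))
print("sigma / sqrt(n):", round(theoretical_se, 3))
```

The mean of the sample means should sit close to the population mean, and their standard deviation close to σ/√n, even though the population itself is far from normal.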
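Converting Sampling Distributions to Standard Normal Distribution
From
$Z = \frac{X - \mu}{\sigma}$
Then $X$ is replaced by the sample mean $\bar{X}$ and $\sigma$ by the
standard error $\sigma/\sqrt{n}$.
That is,
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
Make μ the subject of the formula;
$\mu = \bar{X} - Z \frac{\sigma}{\sqrt{n}}$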
Reading Task
Discuss the two basic types of sampling techniques and, for each type,
mention and explain the methods available.
Topic five:
ESTIMATION THEORY
Introduction
In statistical inference, we are interested in knowing the values of
population parameters like population mean, variance and standard
deviation.
Since conducting a full enumeration to determine the values of the
parameters we need is difficult, we typically estimate these values using
sample statistics.
A sample statistic, or simply a statistic, is a measurement computed from
the sample, which is a small part of the population.
We prefer selecting a sample because it is more cost-effective, faster, and
offers a broader scope. It is also statistically valid to estimate parameters
using statistics.
What is important is to ensure randomness of the samples, so that every
individual in the population is given an equal chance of being included in the
sample.
When a statistic is used to estimate a population parameter it is then
called an estimator.
Thus, an estimator is a statistical method or rule that is used to make an
educated guess or estimate about a parameter in a statistical model.
Parameters are numerical characteristics of a population, and the goal of
estimation is to infer these parameters based on information collected
from a sample.
However, not every statistic can be used as an estimator for a given
population parameter. There are criteria that are used to judge a good
estimator.
Criteria for a good estimator
1. Efficiency in estimation theory refers to an estimator having a relatively
small variance or standard deviation. When comparing two estimators, it
is advisable to choose the one with a smaller standard deviation. This
criterion ensures that the selected estimator is more precise and less
likely to deviate from the true value.
2. Sufficiency in estimation theory refers to an estimator that effectively
utilizes all the information available in the sample to arrive at an
estimate. For example, when estimating a population mean, a sample
mean is preferred over a sample median or mode. The latter two
estimators utilize only partial information, whereas a sample mean
incorporates the entire dataset for a more comprehensive estimate.
3. Consistency in estimation theory refers to an estimator whose standard
deviation decreases as the sample size increases. This implies that,
with a consistent estimator, a larger sample is expected to give an
estimate that lies closer to the true parameter value.
4. Unbiasedness in estimation theory refers to an estimator whose
expected value (mean) precisely equals the parameter being estimated.
In other words, an estimator is unbiased if, on average, it provides an
estimate equal to the true parameter value.
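Unbiasedness can be checked exactly on a tiny example. The sketch below assumes a hypothetical population `[2, 4, 6, 8]` and samples of size 2 drawn with replacement, and compares two estimators of the population variance: the sample variance with divisor n − 1 and the one with divisor n.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [2, 4, 6, 8]  # hypothetical population, for illustration only

# All 16 equally likely ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# Average value of each estimator over every possible sample,
# i.e. its expected value.
avg_divisor_n_minus_1 = mean([variance(s) for s in samples])
avg_divisor_n = mean([pvariance(s) for s in samples])

print("population variance:", pvariance(population))
print("expected value, divisor n - 1:", avg_divisor_n_minus_1)
print("expected value, divisor n:", avg_divisor_n)
```

In this setting the divisor n − 1 version averages out to the population variance, so it is unbiased, while the divisor n version underestimates it on average.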
Important Symbols and Terms
$\mu$ = population mean
$\bar{X}$ = sample mean
$\sigma$ = population standard deviation
$s$ = sample standard deviation
$\hat{p}$ = sample proportion
$P$ = population proportion
$x$ = number of observations of interest
Interval Estimation
In interval estimation, we estimate a range believed to contain the value
of the parameter we seek. Along with this range, we specify the level of
confidence in the belief that the parameter indeed falls within the
estimated interval.
To understand interval estimation, the knowledge of sampling distribution
is crucial.
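Recall,
$\mu = \bar{X} - Z \frac{\sigma}{\sqrt{n}}$
However, the Z-score must be read at the chosen confidence level, and the
sample mean may fall on either side of μ.
Thus,
$\mu = \bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$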
It should be noted that:
$\alpha$ = level of significance, or error probability
$1 - \alpha$ = level of confidence
Therefore, $\alpha/2$ is the error probability that remains in each tail
of the normal distribution.
We normally place 90%, 95% and 99% confidence intervals for the population
mean as well as the population proportion. The values of $Z_{\alpha/2}$ at
these confidence levels are:
Confidence level    $Z_{\alpha/2}$
90%                 1.645
95%                 1.96
99%                 2.58
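The tabulated values can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    # The upper tail holds alpha/2, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: {z_critical(level):.3f}")
```

To three decimals the 99% value is 2.576; the table rounds it to 2.58.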
Worked Example
A mining company in Zambia needs to estimate the average amount of
copper ore per ton mined. A random sample of 50 tons gives a sample
mean of 146.75 kg. The population standard deviation is assumed to be
35.2 kg.
REQUIRED:
a) Provide a 90% CI for the average amount of copper in the
population of tons mined.
b) Provide 95% and 99% CI for the average amount of copper per ton.
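One way to work through both parts is sketched below, using the interval $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ with the Z values from the table in this topic. The function name `z_interval` is a hypothetical helper, not part of the slides.

```python
import math


def z_interval(xbar, sigma, n, z):
    """Confidence interval for the mean when sigma is known."""
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


# Data from the worked example: n = 50, sample mean 146.75 kg, sigma = 35.2 kg.
z_values = {"90%": 1.645, "95%": 1.96, "99%": 2.58}
intervals = {
    level: z_interval(146.75, 35.2, 50, z) for level, z in z_values.items()
}

for level, (lower, upper) in intervals.items():
    print(f"{level} CI: ({lower:.2f}, {upper:.2f}) kg")
```

The interval widens as the confidence level rises, since a larger $Z_{\alpha/2}$ multiplies the same standard error.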
CI for μ when σ is unknown and the sample is large
In practice, the population standard deviation σ is usually unknown when
the population mean is being estimated.
Therefore, the first step is to estimate the standard deviation from the
sample (s), and then use it in place of σ to estimate the confidence
interval for the population mean, μ, using the formula:
$\mu = \bar{X} \pm Z_{\alpha/2} \frac{s}{\sqrt{n}}$
Worked Example
MTANDAO wants to estimate the average length of long-distance calls
during weekends. A random sample of 50 calls gives a mean of 14.5
minutes and standard deviation of 5.6 minutes. Provide a 95% CI for
the average length of a long-distance phone call during weekends.
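A possible solution is sketched below, treating the sample of 50 calls as large and using the sample standard deviation in place of σ with $Z_{\alpha/2} = 1.96$. The function name `large_sample_interval` is a hypothetical helper.

```python
import math


def large_sample_interval(xbar, s, n, z):
    """CI for the mean using the sample standard deviation (large sample)."""
    margin = z * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Data from the worked example: n = 50, mean 14.5 min, s = 5.6 min.
lower, upper = large_sample_interval(xbar=14.5, s=5.6, n=50, z=1.96)
print(f"95% CI: ({lower:.2f}, {upper:.2f}) minutes")
```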
Small Sample CI for the Mean, when σ is unknown
When σ is unknown and is replaced by the sample standard deviation s, it
can be mathematically demonstrated that the conversion formula for the
sample mean shown previously no longer follows a normal distribution.
In fact, when σ is unknown and the population is normally distributed, the
standardized sample mean follows a new distribution called Student's
t-distribution.
This distribution shares properties with the normal distribution, except
that it is not characterized by the mean and standard deviation, as the
normal distribution is.
Instead, the t-distribution is described by a parameter called
degrees of freedom (df).
The degrees of freedom are the number of observations in a sample that
are free to assume any value.
For a sample of size n, the degrees of freedom (df) are given by n − 1.
This distribution therefore caters for small samples when σ is not known.
In such a situation we define the CI for μ as:
$\mu = \bar{X} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$
Example
A management consulting firm needs to estimate the average number of
years of experience of executives in a given branch of management. A
random sample of 25 executives gives a mean of 6.7 years and a standard
deviation of 2.4 years. Give a 95% CI for the average number of years of
experience for all executives in this branch.
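A sketch of the calculation follows. With n = 25 there are 24 degrees of freedom, and the critical value $t_{0.025,\,24} = 2.064$ is taken from a standard t-table (it is entered by hand here because the Python standard library has no t-distribution quantile function). The function name `t_interval` is a hypothetical helper.

```python
import math


def t_interval(xbar, s, n, t_crit):
    """Small-sample CI for the mean: xbar +/- t * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Data from the example: n = 25, mean 6.7 years, s = 2.4 years.
# t_crit is the two-sided 95% t-table value for df = n - 1 = 24.
lower, upper = t_interval(xbar=6.7, s=2.4, n=25, t_crit=2.064)
print(f"95% CI: ({lower:.2f}, {upper:.2f}) years")
```

Using 1.96 instead of 2.064 would give a slightly narrower interval, which understates the uncertainty for a sample this small.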
Large Sample CI for the Population Proportion
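Recall, the standard error of the sample proportion is
$\sqrt{\frac{P(1 - P)}{n}}$
Then the CI for the population proportion given a large sample, with
$\hat{p}$ used in place of the unknown $P$, will be:
$P = \hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$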
Example
The makers of a medicated facial skin cream are interested in
determining the percentage of people in a given age group who may
benefit from the ointment. A random sample of 68 people results in 42
successful treatments. Construct a 95% CI for the proportion of people
in the given age group who may be successfully treated with the facial
cream.
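The example can be worked through as sketched below, with the sample proportion $\hat{p} = x/n$ standing in for the unknown population proportion in the standard error. The function name `proportion_interval` is a hypothetical helper.

```python
import math


def proportion_interval(x, n, z):
    """Large-sample CI for a population proportion."""
    p_hat = x / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Data from the example: 42 successful treatments out of 68 people.
lower, upper = proportion_interval(x=42, n=68, z=1.96)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```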
Finite Population Correction Factor
The formulas for the standard error given previously, both for the sample
mean and for the sample proportion, do not take into account what we call
consistency.
They indicate that a sampling error still exists even if a full
enumeration or census takes place.
To take care of consistency, we have to introduce what is known as a finite
population correction factor (FPCF).
The factor keeps reducing the standard error as the sample size
increases, and brings it to zero when the sample covers the entire
population (n = N).
The factor is given by:
$\sqrt{\frac{N - n}{N - 1}}$
where N is the population size and n is the sample size.
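However, we use this factor only when the sample is large relative to the
population; otherwise there will not be a significant change in the value
of the standard error.
The rule is to use the FPCF when the ratio of sample size to population
size (n/N) is at least 5%.
Thus, the CI for μ with the FPCF is:
$\mu = \bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
And the CI for P with the FPCF is:
$P = \hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \sqrt{\frac{N - n}{N - 1}}$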
Worked Examples
Example
A certain branch of a large bank has 1253 checking accounts. A random
sample of 200 of these accounts was selected, and the average balance in
the sample was found to be TZS 64,832. The sample standard deviation was
found to be TZS 21,000. Place a 99% CI for the average balance in a
checking account at the branch.
Example
The marketing department of National Bicycle Company conducts research
to estimate the share that foreign-made bicycles have in the Tanzanian
market. Out of an estimated 1000 bicycle users in Chamwino district, a
random sample of 100 was obtained. From the sample, it was found that 34
people are users of foreign-made bicycles; the rest are users of locally
made ones. Construct a 95% CI for the share of foreign-made bicycles in
our market.
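Both examples sample more than 5% of the population (200/1253 and 100/1000), so the finite population correction factor applies. The sketch below works through them using the sample standard deviation in place of σ for the bank example and the Z values 2.58 (99%) and 1.96 (95%). The function names are hypothetical helpers.

```python
import math


def fpc(N, n):
    """Finite population correction factor: sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))


def mean_interval_fpc(xbar, s, n, N, z):
    """CI for the mean with the finite population correction applied."""
    margin = z * (s / math.sqrt(n)) * fpc(N, n)
    return xbar - margin, xbar + margin


def proportion_interval_fpc(x, n, N, z):
    """CI for a proportion with the finite population correction applied."""
    p_hat = x / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n) * fpc(N, n)
    return p_hat - margin, p_hat + margin


# Bank example: N = 1253 accounts, n = 200, mean TZS 64,832, s = TZS 21,000.
bank_ci = mean_interval_fpc(xbar=64832, s=21000, n=200, N=1253, z=2.58)
print(f"99% CI for average balance: TZS ({bank_ci[0]:,.0f}, {bank_ci[1]:,.0f})")

# Bicycle example: N = 1000 users, n = 100, 34 use foreign-made bicycles.
bike_ci = proportion_interval_fpc(x=34, n=100, N=1000, z=1.96)
print(f"95% CI for foreign share: ({bike_ci[0]:.3f}, {bike_ci[1]:.3f})")
```

Without the correction factor both intervals would be somewhat wider, because the uncorrected standard error ignores how much of the population has already been observed.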
Estimation of Sample Size
One of the crucial problems when conducting a study is to determine
the size of the sample.
This is important because if an inadequate sample size is used, the
results obtained will not be acceptable, since they are based on a sample
that is not truly representative of the population.
In order to have valid results, we have to scientifically estimate the
sample size (n).
From the margin of error (E) formula:
$E = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Thus,
$n = \left(\frac{Z_{\alpha/2}\, \sigma}{E}\right)^2$
Worked Example
A marketing research firm aims to conduct a survey to estimate the average
amount spent on entertainment by each person visiting a popular resort.
The survey planners seek to determine the average expenditure by all
visitors to the resort within a margin of TZS 2,000, with 95% confidence.
Based on the past operation of the resort, an estimate of the population
standard deviation is TZS 4,000. What is the minimum required sample
size?
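A sketch of the calculation, solving the margin of error formula $E = Z_{\alpha/2}\,\sigma/\sqrt{n}$ for n and rounding up to the next whole observation. The function name `sample_size_for_mean` is a hypothetical helper.

```python
import math


def sample_size_for_mean(z, sigma, margin):
    """Minimum n so that z * sigma / sqrt(n) does not exceed the margin."""
    return math.ceil((z * sigma / margin) ** 2)


# Data from the worked example: E = TZS 2,000, sigma = TZS 4,000, 95% confidence.
n_required = sample_size_for_mean(z=1.96, sigma=4000, margin=2000)
print("minimum sample size:", n_required)
```

The unrounded value is 15.3664, so the sample size is always rounded up, never to the nearest integer, to keep the margin within the target.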
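Estimating the Sample Size when Placing C.I for P
From the margin of error for a proportion,
$E = Z_{\alpha/2} \sqrt{\frac{P(1 - P)}{n}}$, the required sample size is
$n = \frac{Z_{\alpha/2}^2 \, P(1 - P)}{E^2}$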
Example
Find the minimum required sample size of accounts if the proportion of
accounts in error is to be estimated within 0.02 with 95% confidence. A
rough estimate of the proportion of accounts in error is 0.10.
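A sketch of the calculation, assuming the margin of error for a proportion $E = Z_{\alpha/2}\sqrt{P(1 - P)/n}$ solved for n, with the rough estimate 0.10 used for P. The function name `sample_size_for_proportion` is a hypothetical helper.

```python
import math


def sample_size_for_proportion(z, p, margin):
    """Minimum n so that z * sqrt(p * (1 - p) / n) does not exceed the margin."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Data from the example: E = 0.02, rough estimate P = 0.10, 95% confidence.
n_required = sample_size_for_proportion(z=1.96, p=0.10, margin=0.02)
print("minimum sample size:", n_required)
```

The unrounded value is 864.36, which is rounded up to the next whole account.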
January 22, 2025 Prepared by D. Ringo 31
January 22, 2025 Prepared by D. Ringo 32