CHAPTER 6
ESTIMATION OF THE MEAN
AND PROPORTION IN ONE
POPULATION
6.1 ESTIMATION
• Introduction
• Point estimation
• Interval estimation
Introduction
Definition
The assignment of value(s) to a population parameter
based on a value of the corresponding sample statistic is
called estimation.
The value(s) assigned to a population parameter based on
the value of a sample statistic is called an estimate.
The sample statistic used to estimate a population
parameter is called an estimator.
Introduction
The estimation procedure involves the
following steps.
1. Draw a sample.
2. Collect the required information from the
members of the sample.
3. Calculate the value of the sample statistic.
4. Assign value(s) to the corresponding
population parameter.
Point Estimation
Definition
Based on the sample data, a single number is calculated to estimate the
population parameter. The rule or formula that describes this calculation is
called the point estimator, and the resulting number is called a point
estimate.
An estimator is said to be unbiased if the mean of its distribution is equal to
the true value of the parameter being estimated. Otherwise, the estimator is
said to be biased.
Error of estimation
• Definition:
The distance between an estimate and the true value of the
parameter is called the error of estimation.
Note: For an unbiased estimator whose sampling distribution is
approximately normal, about 95% of all point estimates lie within
1.96 standard deviations of the true value of the parameter. The
standard deviation of an estimator is called its standard error
(SE). The quantity 1.96 SE, called the 95% margin of error (or
simply the “margin of error”), provides a practical upper bound
for the error of estimation.
95% margin of error = 1.96 × standard error
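As a quick numerical check of where the multiplier 1.96 comes from, the following Python sketch computes the standard normal value that leaves 2.5% in each tail and multiplies it by a standard error. The function name `margin_of_error` and the standard error of 2.0 are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist

def margin_of_error(standard_error, confidence=0.95):
    """Return z * SE, assuming a normal sampling distribution."""
    alpha = 1 - confidence
    # z value with upper-tail area alpha/2 under the standard normal curve
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_error

z95 = NormalDist().inv_cdf(0.975)
print(round(z95, 2))                    # 1.96
print(round(margin_of_error(2.0), 2))   # 3.92 (hypothetical SE of 2.0)
```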
Figure: Sampling distribution of an unbiased estimator
Interval Estimation
In interval estimation, an interval is constructed around the point estimate,
and it is stated that this interval is likely to contain the corresponding
population parameter.
Confidence Interval
• Definition
• Each interval is constructed with regard to a given
confidence level and is called a confidence
interval. The confidence interval is given as
Point estimate ± Margin of error
• The confidence level associated with a confidence
interval states how much confidence we have that
this interval contains the true population parameter.
The confidence level is denoted by (1 – α)100%.
6.2 Estimation of a Population Mean: σ known
• Three Possible Cases with σ known
• Confidence Interval for a Population Mean µ: σ known
• Interpretation of the CI
• Control width of CI
• Determining n given width of CI
Three Possible Cases with σ known
Confidence Interval for a Population Mean μ : σ
known
Confidence Interval for μ
The (1 – α)100% confidence interval for μ
under Cases I and II is
\[ \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}, \qquad \text{where } \sigma_{\bar{x}} = \sigma/\sqrt{n} \]
The value of \( z_{\alpha/2} \) used here is obtained from the
standard normal distribution table with upper tail
area α/2 (or lower tail area 1 – α/2).
Confidence Interval for a Population Mean μ : σ
known
Definition
The margin of error for the estimate
of μ, denoted by E, is the quantity that is
subtracted from and added to the value of
\( \bar{x} \) to obtain a confidence interval for μ.
Thus,
\[ E = z_{\alpha/2}\,\sigma_{\bar{x}} \]
Figure: Area in the tails.
Table: z Values for Commonly Used Confidence
Levels
Example
A publishing company has just published a new college
textbook. Before the company decides the price at which to
sell this textbook, it wants to know the average price of all
such textbooks in the market. The research department at
the company took a sample of 40 comparable textbooks and
collected information on their prices. This information
produces a mean price of $145 for this sample. It is known
that the standard deviation of the prices of all such textbooks
is $35 and the population of such prices is normal.
(a) What is the point estimate of the mean price of all such
textbooks?
(b) Construct a 90% confidence interval for the mean price of
all such college textbooks.
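Solution
Here n = 40, \( \bar{x} \) = $145, and σ = $35, so
\( \sigma_{\bar{x}} = \sigma/\sqrt{n} = 35/\sqrt{40} = 5.534 \).
a) Point estimate of μ = \( \bar{x} \) = $145
b) Confidence level is 90% or .90. Here, the area in
each tail of the normal distribution curve is
α/2 = (1 – .90)/2 = .05. Hence, z = 1.65.
\( \bar{x} \pm z\,\sigma_{\bar{x}} = 145 \pm 1.65(5.534) = 145 \pm 9.131 \)
= (145 – 9.131) to (145 + 9.131)
We can say that we are 90% confident that the mean
price of all such college textbooks is between
$135.869 and $154.131.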
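The interval in part (b) can be reproduced with a few lines of Python. This is only a sketch: the helper name `z_interval` is hypothetical, and the z value 1.65 is the rounded table value used in the example (a more precise value is about 1.645).

```python
import math

def z_interval(sample_mean, sigma, n, z):
    """Confidence interval for a mean when sigma is known."""
    standard_error = sigma / math.sqrt(n)   # sigma_xbar = sigma / sqrt(n)
    margin = z * standard_error             # E = z * sigma_xbar
    return sample_mean - margin, sample_mean + margin

# Textbook example: n = 40, sample mean = 145, sigma = 35, z = 1.65
lower, upper = z_interval(sample_mean=145, sigma=35, n=40, z=1.65)
print(round(lower, 3), round(upper, 3))  # 135.869 154.131
```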
Interpretation of the CI
If we repeatedly drew new samples and constructed a CI
from each one using exactly the same method, about
90% of those CIs would cover the true mean price of
such textbooks.
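The repeated-sampling interpretation can be illustrated with a small simulation. The Python sketch below is not part of the original example: it assumes a hypothetical true mean of $150 (unknown in practice), reuses σ = 35, n = 40, and z = 1.65 from the textbook example, and counts how often the interval covers the assumed true mean.

```python
import math
import random

def coverage_rate(mu, sigma, n, z, repetitions, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)               # fixed seed for reproducibility
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n from a normal population
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / repetitions

# mu = 150 is an assumed value chosen only for the simulation
rate = coverage_rate(mu=150, sigma=35, n=40, z=1.65, repetitions=2000)
print(rate)  # should be close to 0.90
```

Any single interval either contains μ or it does not; the 90% describes the long-run behavior of the method.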
Example
The weight of grade A large eggs sold in Canada is
Normally distributed with a standard deviation of 5
grams. A package of 12 grade A large eggs is purchased
and has a total weight of 771 grams. Assuming the 12
eggs are from a Simple Random Sample, find a 95%
confidence interval for the true mean weight of grade
A large eggs sold in Canada.
Remarks on CI
• The confidence level is the coverage rate
under repeated sampling.
• The confidence level is not our posterior belief.
• A confidence interval is not a prediction interval
for a single data point.
• A CI for a mean remains (approximately) valid
even when the population distribution is
not normal, as long as the sample size is
sufficiently large.
Control width of CI
The width of a confidence interval
depends on the size of the margin of
error, \( z_{\alpha/2}\,\sigma_{\bar{x}} = z_{\alpha/2}\,\sigma/\sqrt{n} \). Hence, the width of a
confidence interval can be controlled using
1. The value of z, which depends on the
confidence level
2. The sample size, n
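Both effects can be seen numerically. The following Python sketch computes the full width 2·z·σ/√n for several confidence levels and sample sizes; σ = 35 is borrowed from the textbook example, while the sample sizes 40 and 160 and the function name `interval_width` are illustrative assumptions.

```python
import math
from statistics import NormalDist

def interval_width(sigma, n, confidence):
    """Full width of the z-interval: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / math.sqrt(n)

for confidence in (0.90, 0.95, 0.99):
    for n in (40, 160):
        width = interval_width(sigma=35, n=n, confidence=confidence)
        print(confidence, n, round(width, 2))
```

Raising the confidence level widens the interval, while quadrupling n halves it.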
Determining n given width of CI
Given the confidence level and the standard
deviation of the population, the sample size that
will produce a predetermined margin of error E of
the confidence interval estimate of µ is
\[ n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}} \]
Example:
An alumni association wants to estimate
the mean debt of this year’s college
graduates. It is known that the population
standard deviation of the debts of this
year’s college graduates is $11,800. How
large a sample should be selected so that
the estimate with a 99% confidence level is
within $800 of the population mean?
Solution
• The maximum size of the margin of error of
estimate is to be $800; that is, E = $800.
• The value of z for a 99% confidence level is z =
2.58.
• The value of σ is $11,800.
\[ n = \frac{z^{2}\sigma^{2}}{E^{2}} = \frac{(2.58)^{2}(11{,}800)^{2}}{(800)^{2}} = 1448.18 \approx 1449 \]
• Thus, rounding up to the next whole number, the required sample size is 1449.
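The same calculation can be sketched in Python. The helper name `sample_size_for_mean` is hypothetical, and z = 2.58 is the rounded table value used in the solution; the result is always rounded up so that the margin of error does not exceed E.

```python
import math

def sample_size_for_mean(sigma, margin, z):
    """Smallest n with z * sigma / sqrt(n) <= margin."""
    raw = (z * sigma / margin) ** 2
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(raw, 8))

# Alumni debt example: sigma = 11,800, E = 800, z = 2.58
print(sample_size_for_mean(sigma=11800, margin=800, z=2.58))  # 1449
```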
Example:
How many female students should be sampled in order
to estimate the true mean height of all U of S female
students within 0.5 inches of its true value with a 96%
confidence interval? The population standard deviation
of the height of female students in the U of S is 4
inches.
6.3 Confidence Interval for a Population
Mean μ: σ unknown
• Three Possible Cases with σ unknown
• The t Distribution
• Confidence Interval for µ Using the t
Distribution
Three Possible Cases with σ unknown
The t Distribution
The t distribution is a specific type of bell-shaped
distribution with a lower height and
a wider spread than the standard normal
distribution. As the sample size becomes
larger, the t distribution approaches the
standard normal distribution. The t
distribution has only one parameter, called
the degrees of freedom (df). The mean of
the t distribution is equal to 0 and its
standard deviation is \( \sqrt{df/(df-2)} \) (for df > 2),
which is always greater than 1.
Figure: The t distribution for df = 9 and the
standard normal distribution
Example
Find the value of t for 16 degrees of
freedom and .05 area in the right tail of a t
distribution curve.
Figure: The value of t for 16 df and .05 area
in the right and left tail.
Confidence Interval for μ Using the t
Distribution
Example
Dr. Moore wanted to estimate the mean cholesterol level for
all adult men living in Hartford. He took a sample of 25 adult
men from Hartford and found that the mean cholesterol level
for this sample is 186 mg/dL with a standard deviation of 12
mg/dL. Assume that the cholesterol levels for all adult men in
Hartford are (approximately) normally distributed. Construct a
95% confidence interval for the population mean µ.
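Solution
Here n = 25, \( \bar{x} \) = 186, and s = 12, so
\( s_{\bar{x}} = s/\sqrt{n} = 12/\sqrt{25} = 2.40 \) and df = n – 1 = 24.
For a 95% confidence level, the area in each tail is .025;
the t table gives t = 2.064 for 24 df.
\( \bar{x} \pm t\,s_{\bar{x}} = 186 \pm 2.064(2.40) = 186 \pm 4.95 = 181.05 \text{ to } 190.95 \)
• Thus, we can state with 95% confidence that the mean
cholesterol level for all adult men living in Hartford lies
between 181.05 and 190.95 mg/dL.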
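The stated interval can be checked with a short Python sketch. It assumes the t value 2.064 (24 df, .025 area in the right tail) read from a standard t table, since the standard library has no t quantile function; the helper name `t_interval` is hypothetical.

```python
import math

def t_interval(sample_mean, s, n, t):
    """Confidence interval for a mean when sigma is unknown.

    t is the table value for n - 1 degrees of freedom.
    """
    standard_error = s / math.sqrt(n)   # s_xbar = s / sqrt(n)
    margin = t * standard_error
    return sample_mean - margin, sample_mean + margin

# Cholesterol example: n = 25, mean = 186, s = 12, t = 2.064 (table value, 24 df)
lower, upper = t_interval(sample_mean=186, s=12, n=25, t=2.064)
print(round(lower, 2), round(upper, 2))  # 181.05 190.95
```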
Example
Sixty-four randomly selected adults who buy books for general
reading were asked how much they usually spend on books
per year. The sample produced a mean of $1450 and a
standard deviation of $300 for such annual expenses.
Determine a 99% confidence interval for the corresponding
population mean.
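Solution
Here n = 64, \( \bar{x} \) = $1450, and s = $300, so
\( s_{\bar{x}} = s/\sqrt{n} = 300/\sqrt{64} = 37.50 \) and df = n – 1 = 63.
For a 99% confidence level, the area in each tail is .005;
the t table gives t = 2.656 for 63 df.
\( \bar{x} \pm t\,s_{\bar{x}} = 1450 \pm 2.656(37.50) = 1450 \pm 99.60 \)
Thus, we can state with 99% confidence that based on this
sample the mean annual expenditure on books by all adults
who buy books for general reading is between $1350.40 and
$1549.60.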
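What If the Sample Size Is Too Large
If the sample size is so large that the required degrees of
freedom are not listed in the t table, there are two options:
1. Use the t value from the last row (the
row of ∞) in the t table.
2. Use the normal distribution as an
approximation to the t distribution.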
6.4 Estimation of a Population Proportion:
Large Sample
• Estimator of the Standard Deviation
• Confidence Interval for a Population Proportion
• Determining the Sample Size for the Estimation of
Proportion
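Estimator of the Standard Deviation of \( \hat{p} \)
The value of \( s_{\hat{p}} \), which gives a point
estimate of \( \sigma_{\hat{p}} \), is calculated as follows.
Here, \( s_{\hat{p}} \) is an estimator of \( \sigma_{\hat{p}} \).
\[ s_{\hat{p}} = \sqrt{\frac{\hat{p}\,\hat{q}}{n}}, \qquad \text{where } \hat{q} = 1 - \hat{p} \]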
Confidence Interval for a Population Proportion
The (1 – α)100% confidence interval for the
population proportion, p, is
\[ \hat{p} \pm z\,s_{\hat{p}} \]
The value of z used here is obtained from the
standard normal distribution table for the given
confidence level, and \( s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} \). The term \( z\,s_{\hat{p}} \)
is called the margin of error, E.
Example:
According to a survey by Pew Research Center in
June 2009, 44% of people aged 18 to 29 years said
that religion is very important to them. Suppose
this result is based on a sample of 1000 people
aged 18 to 29 years.
a) What is the point estimate of the population
proportion?
b) Find, with a 99% confidence level, the percentage
of all people aged 18 to 29 years who will say that
religion is very important to them. What is the
margin of error of this estimate?
Solution
• n = 1000, \( \hat{p} \) = .44, and \( \hat{q} \) = .56
• \( s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} = \sqrt{(.44)(.56)/1000} = .01569713 \)
• Note that \( n\hat{p} \) and \( n\hat{q} \) are both greater
than 5.
a) Point estimate of p = \( \hat{p} \) = .44
b) The confidence level is 99%, or .99. z = 2.58.
\( \hat{p} \pm z\,s_{\hat{p}} = .44 \pm 2.58(.01569713) = .44 \pm .04 \)
= .40 to .48 or 40% to 48%
Margin of error = \( \pm z\,s_{\hat{p}} \)
= ±2.58(.01569713)
= ±.04 or ±4%
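The interval and its margin of error for the survey example (n = 1000, sample proportion .44, 99% confidence) can be verified with the Python sketch below. The helper name `proportion_interval` is hypothetical, and z = 2.58 is the rounded table value for a 99% confidence level.

```python
import math

def proportion_interval(p_hat, n, z):
    """Large-sample confidence interval for a population proportion."""
    q_hat = 1 - p_hat
    standard_error = math.sqrt(p_hat * q_hat / n)   # s_phat
    margin = z * standard_error                     # E = z * s_phat
    return p_hat - margin, p_hat + margin, margin

lower, upper, margin = proportion_interval(p_hat=0.44, n=1000, z=2.58)
print(round(lower, 2), round(upper, 2), round(margin, 2))  # 0.4 0.48 0.04
```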
Example
If an SRS of 𝑛=200 students is obtained and, of these,
𝑥=168 own their own computers, what is a 96%
confidence interval for the true percentage of
students who own their own computer?
Determining The Sample Size For The Estimation
of Proportion
Given the confidence level and the values
of \( \hat{p} \) and \( \hat{q} \), the sample size that will
produce a predetermined margin of
error E of the confidence interval
estimate of p is
\[ n = \frac{z^{2}\,\hat{p}\,\hat{q}}{E^{2}} \]
Determining The Sample Size For The Estimation
of Proportion
In case the values of \( \hat{p} \) and \( \hat{q} \) are not known:
1. We make the most conservative estimate
of the sample size n by using \( \hat{p} \) = .50 and
\( \hat{q} \) = .50.
2. We take a preliminary sample (of
arbitrarily determined size) and calculate
\( \hat{p} \) and \( \hat{q} \) from this sample. Then use
these values to find n.
Example
Lombard Electronics Company has just installed a
new machine that makes a part that is used in
clocks. The company wants to estimate the
proportion of these parts produced by this
machine that are defective. The company
manager wants this estimate to be within .02 of
the population proportion for a 95% confidence
level. What is the most conservative estimate of
the sample size that will limit the maximum error
to within .02 of the population proportion?
Solution
• The value of z for a 95% confidence level
is 1.96.
• \( \hat{p} \) = .50 and \( \hat{q} \) = .50
• \[ n = \frac{z^{2}\,\hat{p}\,\hat{q}}{E^{2}} = \frac{(1.96)^{2}(.50)(.50)}{(.02)^{2}} = 2401 \]
• Thus, if the company takes a sample of
2401 parts, there is a 95% chance that
the estimate of p will be within .02 of the
population proportion.
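The conservative sample size can be computed with a short Python sketch. The helper name `sample_size_for_proportion` is hypothetical; the default of .50 for the sample proportion reflects the most conservative choice, because the product of the proportion and its complement is largest at .50.

```python
import math

def sample_size_for_proportion(margin, z, p_hat=0.5):
    """Smallest n with z * sqrt(p_hat * q_hat / n) <= margin."""
    q_hat = 1 - p_hat
    raw = z ** 2 * p_hat * q_hat / margin ** 2
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(raw, 8))

# Lombard Electronics example: E = .02, z = 1.96, conservative p_hat = .50
print(sample_size_for_proportion(margin=0.02, z=1.96))  # 2401
```

If a preliminary sample is available, its proportion can be passed as `p_hat` instead, which never increases the required n.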
Example:
Suppose an SRS of 𝑛=200 students is obtained and, of these,
𝑥=168 own their own computers.
How many male students should be sampled to be
98% confident that the estimate of the proportion of
male students who own their own computer is within
0.025 of the true value?