
Estimation & Confidence Intervals Guide

The document provides an overview of estimation and confidence intervals in statistics, focusing on point and interval estimation methods. It explains the concepts of target parameters, confidence levels, and the conditions for constructing confidence intervals for both known and unknown population standard deviations. Examples illustrate the application of these concepts in real-world scenarios, such as estimating student spending and analyzing pulse rates.

Uploaded by

kamranhanif3111

ST11: Statistics & Probability

Estimation and Confidence Intervals (chap 6)
This document belongs to ESCP Business School.
It cannot be modified nor distributed without the
author’s consent.
Prof. Lynn FARAH
Statistical Methods

• Descriptive Statistics
• Inferential Statistics
   - Estimation
   - Hypothesis Testing
Thinking Challenge

Suppose you’re interested in estimating the average amount of money that students at ESCP (the population) spend on food per month.
How would you find out?
Estimation Methods

Estimation
• Point Estimation
• Interval Estimation
Target Parameter

The unknown population parameter (e.g., mean or proportion) that we are interested in estimating is called the target parameter.

Parameter   Key Words or Phrase                        Type of Data
µ           Mean; average                              Quantitative
p           Proportion; percentage; fraction; rate     Qualitative
Point Estimator
A point estimator of a population parameter is a rule or formula
that tells us how to use the sample data to calculate a single number
that can be used as an estimate of the population parameter.

1. Provides a single value based on observations from one random sample
2. Gives no information about how close the value is to the unknown population parameter

Example: The sample mean for the amount of money spent on food
per month by a random sample of 200 ESCP students is 364€. This is a
point estimate of the unknown population mean.
Interval Estimator
An interval estimator (or confidence interval) is a formula that
tells us how to use the sample data to calculate an interval that estimates
the target parameter.
1. Provides a range of values based on observations from one random
sample
2. Takes into consideration variation in sample statistics from sample
to sample

3. Stated in terms of a level of confidence, e.g. 95% confident or 99% confident; we can never be 100% confident.

Example: Based on a random sample of 200 ESCP students, the population mean amount of money spent on food per month lies between 359€ and 369€, with 95% confidence.
Estimation Process

[Figure: a random sample is drawn from a population whose mean µ is unknown; the sample mean is $\bar{x}$ = 364, leading to the statement “I am 95% confident that µ is between 359 & 369.”]

Key Elements of Interval Estimation

[Figure: an interval running from the lower confidence limit to the upper confidence limit, with the point estimate between them; the distance between the two limits is the width of the confidence interval]

Interval estimate of the population mean amount of money spent on food per month, based on a random sample of 200 ESCP students:
Point estimate = 364
Lower limit = 359
Upper limit = 369
Width of interval = 10 = 2*5
General Formula

The general formula for all confidence intervals is:

Point Estimate ± (Critical Value)*(Standard Error)

Where:

• Point Estimate is the sample statistic estimating the population parameter of interest

• Critical Value is a table value based on the sampling distribution of the point estimate and the desired confidence level

• Standard Error is the standard deviation of the point estimate (variation from sample to sample)

Sampling Theory

[Figure: several different samples drawn from the same population]

The value of a point estimate varies from sample to sample, and so does the interval estimate built around it.
Confidence Level
Confidence Level:

• the probability that a randomly selected confidence interval encloses the population parameter

• a percentage (less than 100%)

Suppose confidence level = 95%
(also written (1 − α) = 0.95, called the confidence coefficient, so α = 0.05).
A relative frequency interpretation:
95% of all the confidence intervals that can be constructed using all possible random samples will contain the unknown true parameter, and 5% will be “duds”.
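The relative frequency interpretation can be checked by simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be Normal with a hypothetical mean of 364 and a hypothetical standard deviation of 36 (neither value is given in the slides), and each interval is built with the general formula using a Normal critical value and a known standard deviation.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(mu, sigma, n, level, trials, seed=0):
    """Fraction of simulated confidence intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # critical value
    half_width = z * sigma / n ** 0.5                # critical value * standard error
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)                         # point estimate for this sample
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

# Hypothetical population values, chosen for illustration only.
rate = coverage_rate(mu=364, sigma=36, n=200, level=0.95, trials=2000)
print(f"{rate:.1%} of the simulated intervals contain mu")
```

The printed share should land close to 95%, with the remainder being the “duds”; it will not be exactly 95% because only a finite number of samples is simulated.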
Confidence Level

Sample #   $\bar{x}$   Lower Limit   Upper Limit   Contains µ?
1          362.30      356.42        368.18        Yes
2          369.50      363.62        375.38        Yes
3          360.00      354.12        365.88        No
4          362.12      356.24        368.02        Yes
5          373.88      368.00        379.76        Yes
…          …           …             …             …
Confidence Level

[Figure: confidence intervals built around the sample means $\bar{x}_1$, $\bar{x}_2$, ... compared with the true mean µ]
Confidence Level
In practice, you only take one sample of size n.
In practice, you do not know the population parameter, so you do not know if the interval actually contains it.
However, you do know that 95% of all the possible intervals formed in this manner will contain the population parameter.
Thus, based on the one sample you actually selected, you can be 95% confident that your interval contains the population parameter.

What about the confidence interval you construct?
You will never be able to check whether it captures the population parameter or not.
Confidence Intervals

• Population Mean
   - σ Known
   - σ Unknown
• Population Proportion
Confidence Interval for µ (σ Known)

Conditions:
• Population standard deviation σ is known
• Random sample
• Large sample size (n≥30), to ensure CLT works and we have an approximately
Normal sampling distribution for the mean
Confidence interval estimate (z-interval):
$\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$

where
$\bar{x}$ is the point estimate (sample mean)
$z_{\alpha/2}$ is the Normal distribution critical value for a probability of α/2 in each tail
$\frac{\sigma}{\sqrt{n}}$ is the standard error
Finding the Critical Value, zα/2
Consider a 95% confidence interval:

1 − α = 0.95, so α = 0.05 and α/2 = 0.025 in each tail.

$z_{\alpha/2} = \pm 1.96$

[Figure: standard Normal curve with an area of α/2 = 0.025 in each tail. Z units: −1.96, 0, 1.96. X units: lower confidence limit, point estimate, upper confidence limit.]
Common Levels of Confidence
Most commonly used confidence levels are 90%, 95%, and 99%:

Confidence Level   Confidence Coefficient, 1 − α   zα/2 value
80%                0.80                            1.28
90%                0.90                            1.645
95%                0.95                            1.96
98%                0.98                            2.33
99%                0.99                            2.576
99.8%              0.998                           3.09
99.9%              0.999                           3.29
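Critical values like these do not have to be read from a table. As a minimal sketch, Python's standard library can compute them from the inverse of the standard Normal cumulative distribution; the helper name `z_critical` is an illustrative choice, not something defined in the slides.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-sided Normal critical value: leaves alpha/2 in each tail."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  z = {z_critical(level):.3f}")
```

Comparing the printed values with the table is a quick way to catch rounding or transcription slips.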
Example: Electric Circuits
A sample of 41 circuits from a large normal population has a mean
resistance of 2.20 ohms. We know from past testing that the
population standard deviation is 0.35 ohms.

Determine a 95% confidence interval for the true mean resistance of the population and interpret it.

$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \times \frac{0.35}{\sqrt{41}} = 2.20 \pm 0.11 = [2.09; 2.31]$

We are 95% confident that the true mean resistance is between 2.09 ohms and 2.31 ohms.
Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean.
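The same calculation can be reproduced in a few lines of code. This is a sketch rather than a reference implementation: the function name `z_interval` is an assumption made for illustration, and the critical value is computed rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence_level=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)          # critical value * standard error
    return x_bar - margin, x_bar + margin

# Electric circuits example: n = 41, sample mean 2.20 ohms, sigma = 0.35 ohms.
lower, upper = z_interval(x_bar=2.20, sigma=0.35, n=41)
print(f"[{lower:.2f}; {upper:.2f}] ohms")
```

Changing `confidence_level` shows directly how the interval widens or narrows for the same sample.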
Thinking Challenge

You’re a Q/C inspector for Gallo winery. The σ for 2-liter bottles is 0.05 liters.
A random sample of 100 bottles showed a mean of 1.99 liters.
What is the 90% confidence interval estimate of the true mean amount in 2-liter bottles?
Do You Ever Truly Know σ?

Probably not!
In virtually all real world (business) situations, σ is not known.

If there is a situation where σ is known, then µ is also known (since to calculate σ you need to know µ).
If you truly know µ, there would be no need to gather a sample to estimate it.
Confidence Interval for µ (σ Unknown)
If the population standard deviation σ is unknown, we can substitute the sample standard deviation, s.

This introduces extra uncertainty, since s is also variable from sample to sample.

So we use the Student t-distribution instead of the Normal distribution.
Confidence Interval for µ (σ Unknown)

Conditions:
• Population standard deviation σ is unknown
• Random sample
• The variable has an approximately Normal distribution
Confidence interval estimate (t-interval):
$\bar{x} \pm t_{\alpha/2} \times \frac{s}{\sqrt{n}}$

where
$\bar{x}$ is the point estimate (sample mean)
$t_{\alpha/2}$ is the Student distribution critical value for a probability of α/2 in each tail
$\frac{s}{\sqrt{n}}$ is the standard error
Student’s t Distribution
The t is a family of distributions.
The $t_{\alpha/2}$ value depends on degrees of freedom (d.f. = n − 1 in this case).
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the Normal distribution.

[Figure: density curves of the standard Normal (t with df = ∞), t (df = 13) and t (df = 5), all centred at 0]

Note: t → Z as n increases
Student’s t Table
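Let n = 5, so df = n − 1 = 4.
Let α = 0.05, so α/2 = 0.025.

[Figure: t-distribution with an upper-tail area of α/2 = 0.025 to the right of t = 2.776]

The table value is $t_{\alpha/2}$ = 2.776.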
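A table lookup like this one can also be approximated numerically. Python's standard library has no built-in t quantile (statistical packages such as SciPy provide one), so the sketch below integrates the t density with Simpson's rule and finds the critical value by bisection. The function names and the search range of 0 to 100 are illustrative assumptions; the range is adequate for the confidence levels and degrees of freedom used in these slides, not for extreme cases.

```python
from math import exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(t, df, steps_per_unit=400):
    """P(T <= t) for t >= 0: 0.5 (by symmetry) plus Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    n = max(2, 2 * int(t * steps_per_unit / 2 + 1))   # even number of sub-intervals
    h = t / n
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence_level, df):
    """Critical value with an upper-tail area of alpha/2, found by bisection."""
    target = 1 - (1 - confidence_level) / 2
    lo, hi = 0.0, 100.0                               # assumed search range
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"t critical value for 95% confidence, 4 d.f.: {t_critical(0.95, 4):.3f}")
```

The result can be compared with the table value 2.776, and calling `t_critical` with larger `df` shows the values moving towards the Normal critical value.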

Example: Pulse rates


A medical researcher measured the pulse rates (in beats per minute,
written bpm) of a sample of 52 randomly selected adults.
a) Are the necessary conditions for a t-interval satisfied? Explain.
b) Find a 95% confidence interval for the mean pulse rate.
c) Explain the meaning of that interval.
d) What does “95% confidence” mean in this context?

Example: Pulse Rates - Solution


a) Are the necessary conditions for a t-interval satisfied? Explain.

- The population standard deviation is not given ✓


- The sample was randomly obtained ✓
- Looking at the histogram of pulse rates, we can see that the distribution of
the variable is approximately Normal ✓

b) Find a 95% confidence interval for the mean pulse rate.

The 95% CI will be
• Centred at 72.7 bpm
• With standard error 6.482/√52 ≈ 0.89 bpm
• Based on a t-distribution with 52 − 1 = 51 degrees of freedom, so using a table of critical values for t-distributions: $t_{\alpha/2}$ ≈ 2.009

So the 95% CI for the mean pulse rate is 72.7 ± (2.009)*(0.89)
⇒ from 70.91 to 74.48 bpm
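The arithmetic of part b) can be checked with a short script. This is a sketch under the assumption that the critical value 2.009 is supplied from a t table, as in the solution; the function name `t_interval` is illustrative. Note that the code keeps the standard error at full precision (about 0.899 bpm), whereas the slide works with 0.89, so the endpoints differ by a few hundredths of a bpm.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """CI for the mean when sigma is unknown; t_crit is read from a t table."""
    standard_error = s / sqrt(n)
    margin = t_crit * standard_error
    return x_bar - margin, x_bar + margin, standard_error

# Pulse rate example: mean 72.7 bpm, s = 6.482 bpm, n = 52, table value 2.009.
lower, upper, se = t_interval(x_bar=72.7, s=6.482, n=52, t_crit=2.009)
print(f"SE = {se:.3f} bpm, CI = [{lower:.2f}; {upper:.2f}] bpm")
```

The small gap between the two sets of endpoints comes only from rounding the standard error before multiplying, not from a different method.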
Example: Pulse Rates - Solution
Answers to question parts c) and d):

c) Based on the data, we are 95% confident that the true mean pulse
rate of adults is between 70.91 and 74.48 bpm.

d) “95% confidence” means that 95% of all such random samples of size
n=52 will give an interval which does contain the true population mean
pulse rate.
What happens when n is large enough?

Confidence Level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   z (∞ d.f.)
0.80               1.372         1.325         1.310         1.28
0.90               1.812         1.725         1.697         1.645
0.95               2.228         2.086         2.042         1.96
0.99               3.169         2.845         2.750         2.58

We notice that t tends to z as n increases.

Hence, when σ is unknown but n is large enough (n≥30), we can replace t by z in the formula.
The confidence interval will approximately be equal to:
$\bar{x} \pm z_{\alpha/2} \left( \frac{s}{\sqrt{n}} \right)$
Example: back to Pulse Rates - Solution

Since n>30 in this example, we could have directly used a z-interval instead of a t-interval (so replace $t_{\alpha/2}$ by $z_{\alpha/2}$).
The 95% CI will be
• Centred at 72.7 bpm
• With standard error 6.482/√52 ≈ 0.89 bpm
• Based on a z-distribution: $z_{\alpha/2}$ ≈ 1.96

So the 95% CI for the mean pulse rate is 72.7 ± (1.96)(0.89)
⇒ from 70.96 to 74.44 bpm (small difference compared to values with t-distribution)

As compared to the 95% CI for the mean pulse rate using the exact t-value:
72.7 ± (2.009)*(0.89) ⇒ from 70.91 to 74.48 bpm
Thinking Challenge

You’re a time study analyst in manufacturing.
You’ve recorded the following task times (min.):
3.6, 4.2, 4.0, 3.5, 3.8, 3.1.
What is the 90% confidence interval estimate of the population mean task time?
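One way to check an answer to this challenge is sketched below. It assumes the task times come from an approximately Normal population (the t-interval condition, which six observations cannot really confirm) and uses 2.015 as the critical value, which is the standard t-table entry for 5 degrees of freedom with 0.05 in each tail; that value is not stated in the slides.

```python
from math import sqrt
from statistics import mean, stdev

task_times = [3.6, 4.2, 4.0, 3.5, 3.8, 3.1]   # minutes, from the challenge

n = len(task_times)
x_bar = mean(task_times)
s = stdev(task_times)        # sample standard deviation (divides by n - 1)
t_crit = 2.015               # assumed table value: 5 d.f., 0.05 in each tail

margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"mean = {x_bar:.2f}, s = {s:.3f}, 90% CI = [{lower:.2f}; {upper:.2f}] min")
```

Because s is estimated from only six values, the t critical value is noticeably larger than the Normal value 1.645, which is what makes the interval wider than a z-interval would be.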
Confidence Interval for p
Conditions:
• Random sample
• Large sample size ($n\hat{p} \geq 15$ and $n(1 - \hat{p}) \geq 15$, to ensure we have an approximately Normal sampling distribution for the proportion)
Confidence interval estimate (z-interval):

$\hat{p} \pm z_{\alpha/2} \times \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

where
$\hat{p}$ is the point estimate (sample proportion)
$z_{\alpha/2}$ is the Normal distribution critical value for a probability of α/2 in each tail
$\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ is the standard error

Example: Delanoë
In April 2010, a survey was made by IPSOS (a French market research company)
about the satisfaction rating of Bertrand Delanoë, mayor of Paris at the time.
A representative sample of 1011 voters responded to whether they were
satisfied or not with the job he was doing as mayor of Paris; 53% responded that
they were satisfied.
Check that the conditions are met then find and interpret a 95% confidence
interval for Delanoë’s satisfaction rate.
Conditions:
The sample has been constructed in a representative way (replaces randomness).
$n\hat{p}$ = 1011 × 0.53 ≈ 536 ≥ 15 and $n(1 - \hat{p})$ = 1011 × 0.47 ≈ 475 ≥ 15, so the sample is “big enough”.
Using the formula we get CI = [49.9%; 56.1%]
⇒ “With 95% confidence, the true proportion of voters satisfied with Delanoë’s performance is between 49.9% and 56.1%.”
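The condition check and the interval can be reproduced as follows. This is a minimal sketch: the function name `proportion_interval` and the choice to raise an error when the sample-size condition fails are illustrative assumptions, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence_level=0.95):
    """z-interval for a population proportion, with the n*p >= 15 checks."""
    if n * p_hat < 15 or n * (1 - p_hat) < 15:
        raise ValueError("sample too small for the Normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)   # critical value * standard error
    return p_hat - margin, p_hat + margin

# Survey example: 53% satisfied among 1011 respondents.
lower, upper = proportion_interval(p_hat=0.53, n=1011)
print(f"[{lower:.1%}; {upper:.1%}]")
```

Since the lower limit sits just below 50%, the survey alone does not establish that a majority of voters were satisfied.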
Thinking Challenge

You’re a production manager for a newspaper.
You want to find the % of defective newspapers.
Of 200 newspapers, 35 had defects.
What is the 90% confidence interval estimate of the population proportion defective?
The Margin of Error

The margin of error (ME) is also called the margin of sampling error.
It represents:
• the amount added and subtracted to the point estimate to form the confidence interval
• the amount of imprecision in the estimate of the population parameter

Our general formula for confidence intervals then becomes:

Point Estimate ± ME

Point Estimate ± (Critical Value)(Standard Error)

Margin of Error: Certainty vs. Precision
To be more confident, we need more values in our confidence interval to be more certain it does contain the population parameter. So we need a larger interval, and we end up being less precise.

Because of this, every confidence interval is a balance between certainty and precision. The tension between certainty and precision is always there.

Fortunately, in most cases we can be both sufficiently certain and sufficiently precise to make useful statements.
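The trade-off can be made concrete by holding the sample fixed and changing only the confidence level. The sketch below reuses the numbers from the electric circuits example (σ = 0.35 ohms, n = 41) and computes the margin of error for three levels; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 0.35, 41                  # values from the electric circuits example
standard_error = sigma / sqrt(n)

margins = {}
for level in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margins[level] = z * standard_error          # ME = critical value * SE
    print(f"{level:.0%} confidence: margin of error = {margins[level]:.3f} ohms")
```

The standard error never changes here, so any gain in certainty is paid for entirely through a larger critical value and therefore a wider interval.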
Determining Sample Size
The required sample size can be found to reach a desired margin of error (ME) with a specified level of confidence (1 − α).

Example 1
You work in Human Resources at a company. You plan to survey employees to find their average medical expenses in order to negotiate a new health plan with an insurance company. You want to be 95% confident that the sample mean is within ± $50 of the true mean.
A pilot study showed that σ was about $400. What sample size do you use?

Example 2
As a quality control supervisor in a production chain, you are responsible for controlling the number of defective items produced on a daily basis.
How large a sample would be necessary to estimate the true proportion of defective items in a large population within ±3%, with 95% confidence? Assume a pilot sample yields $\hat{p}$ = 0.12.
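Both examples can be approached by rearranging the margin-of-error formulas used earlier. Solving ME = z·σ/√n for n gives n = (z·σ/ME)², and solving ME = z·√(p̂(1 − p̂)/n) gives n = z²·p̂(1 − p̂)/ME², rounded up to the next whole unit in each case. The sketch below applies these rearrangements; the function names are illustrative, and the slides' own solutions are not shown in this excerpt, so treat the results as a check rather than the official answers.

```python
from math import ceil
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-sided Normal critical value for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

def sample_size_for_mean(sigma, margin, confidence_level=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin."""
    z = z_critical(confidence_level)
    return ceil((z * sigma / margin) ** 2)

def sample_size_for_proportion(p_hat, margin, confidence_level=0.95):
    """Smallest n with z * sqrt(p_hat * (1 - p_hat) / n) <= margin."""
    z = z_critical(confidence_level)
    return ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)

n1 = sample_size_for_mean(sigma=400, margin=50)             # Example 1
n2 = sample_size_for_proportion(p_hat=0.12, margin=0.03)    # Example 2
print(f"Example 1: n = {n1}; Example 2: n = {n2}")
```

Rounding up rather than to the nearest integer guarantees that the achieved margin of error does not exceed the target.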
