### Chapter 10

Estimating with Confidence
10-1 Estimating an unknown parameter


Ex: The admissions director at a university
proposes using the IQ scores of current
students as a marketing tool. The university
provides him with enough money to administer IQ
tests to 50 students. So, he gives the IQ test to
an SRS of 50 of the university’s 5000 freshmen.
The mean IQ score for the sample is x̄ = 112.
What can the director say about the mean
score μ of the population of all 5000
freshmen?
Is the mean IQ score μ of all the freshmen
exactly 112? Probably not, but…



The law of large numbers tells us that the
sample mean from a large SRS will be close
to the unknown population mean μ.
Because x̄ = 112, we guess that μ is
“somewhere around 112”.
How close to 112 is μ likely to be?

To answer that, ask: how would the sample mean x̄
vary if we took many samples of 50 freshmen from
this same population?
Recall…
From last chapter, the means of all samples of size
50 are approximately normally distributed around the
true population mean μ, with a standard deviation of
σ/√50.
Remember from our 68-95-99.7 rule that about 95% of all
samples of size 50 will have a mean that falls within
2 of these standard deviations of μ.
Suppose we know σ
Suppose we know σ is 15 (this is unrealistic,
but just go with it). That means the standard
deviation of x̄ is σ/√50 = 15/√50 ≈ 2.1.
So, in about 95% of all samples of size 50, the
mean IQ score x̄ will be within 4.2 of the
true μ (up or down… that’s 2
standard deviations above or below).
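The arithmetic on this slide can be checked in a few lines of Python. This is a minimal sketch: the variable names are illustrative, and it uses the rule-of-thumb multiplier 2 from the 68-95-99.7 rule.

```python
import math

sigma = 15  # population standard deviation, assumed known as on the slide
n = 50      # sample size

# Standard deviation of the sample mean: sigma / sqrt(n)
sd_xbar = sigma / math.sqrt(n)

# About 95% of sample means fall within 2 of these standard deviations of mu
two_sd = 2 * sd_xbar

print(round(sd_xbar, 1))  # 2.1
print(round(two_sd, 1))   # 4.2
```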
Here are all our samples…
How confident are we?

Statistical inference uses this fact about what
would happen in MANY samples, to express
our confidence in being able to capture the
true μ in our own ONE sample.

Conclusion: Our sample of 50 freshmen gave
x̄ = 112. The resulting interval is
112 ± 4.2, or (107.8, 116.2).

We say that we are 95% confident that the
unknown mean IQ μ for all Big City University
freshmen is between 107.8 and 116.2.
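What "95% confident" means can be illustrated with a small simulation. The sketch below assumes, purely for illustration, a Normal population with true mean μ = 110 (a made-up value; in practice μ is unknown) and σ = 15. It draws many samples of size 50 and counts how often the interval x̄ ± 4.2 captures μ.

```python
import random
import statistics

def capture_rate(mu=110.0, sigma=15.0, n=50, margin=4.2, trials=2000, seed=1):
    """Fraction of simulated samples whose interval xbar +/- margin contains mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

rate = capture_rate()
print(rate)  # expect a value near 0.95
```

Any single interval either contains μ or it does not; the 95% describes how often the method succeeds over many samples.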
Confidence Interval for Population
Mean μ when σ is known

That example was our first scenario for calculating a CI.
The calculation depends on 3 important conditions:

1. SRS: the data come from a simple random sample of the
population of interest.
2. Normality: the construction of the interval depends on
the fact that the sampling distribution of the sample mean is
approximately normal (which it will be, according to the
CLT, as long as our sample size is sufficiently large… 30
is a usual cutoff).
3. Independence: to keep calculations reasonably
accurate when we sample without replacement from a finite
population, we should sample no more than 10% of the
population (our rule of thumb).
Different Confidence Levels
and Critical Values

We call our confidence level C. While a
95% confidence interval (or confidence level) is
most typical, sometimes you are asked for a 99%
or 90% interval.

Note that for the 95% CI in the example, we went out
a critical Z score of about 2 standard
deviations above and below the sample mean (Z* = 1.96
to be precise).

For 90%, the Z score marking off our ‘cutoff’
regions is ±1.645.
For 99% it’s ±2.576.
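These critical values can be recovered from the standard Normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_star` is illustrative.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value with area `confidence` between -z* and +z* under N(0, 1)."""
    upper_tail = (1 - confidence) / 2  # area left over in each tail
    return NormalDist().inv_cdf(1 - upper_tail)

for c in (0.90, 0.95, 0.99):
    print(c, round(z_star(c), 3))
```

The middle area C leaves (1 - C)/2 in each tail, which is why the 95% value is the 97.5th percentile of the standard Normal curve.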
Formula
*Our confidence interval for the estimate of
the unknown μ is x̄ ± Z*(σ/√n), and our margin of error is
Z*(σ/√n).
On the calc: If you have raw data, enter your sample data into L1.
Press STAT, choose TESTS, and choose ZInterval. Set the input method to
Data and type in σ.
If you have x̄ and σ, select ‘Stats’ as your input method and type
those in along with n (sample size) and the C-level you want (such as .95
or .99). Then choose Calculate and hit ENTER. You get the interval
(lower and upper bound) and the sample mean.
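For readers without the calculator, the same computation can be sketched in Python. The function name `z_interval` is illustrative, not a library call; it implements x̄ ± Z*(σ/√n) with Z* taken from the standard Normal distribution.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """One-sample z interval for mu when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# IQ example: xbar = 112, sigma = 15, n = 50, 95% confidence
low, high = z_interval(112, 15, 50, 0.95)
print(round(low, 1), round(high, 1))  # 107.8 116.2
```

This uses Z* = 1.96 rather than the rounded value 2, so the margin is slightly under 4.2, but the interval rounds to the same endpoints.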
Margin of Error

There is a tradeoff between margin of error and level
of confidence. The margin of error gets smaller as Z*
gets smaller (but this also lowers our confidence).

MOE also gets smaller as σ gets smaller (σ is hard to reduce in
reality, but this is important conceptually). Think of σ and
variability as ‘noise’: it’s easier to pin down the true μ
when σ is small.

MOE gets smaller as n gets larger. Because we take
the square root of n, we must take four times as many
observations in order to cut the margin of error in
half.
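The three effects can be seen numerically. This sketch reuses σ = 15 and n = 50 from the IQ example as a baseline; the comparison values (n = 200, σ = 7.5) are chosen only for illustration.

```python
import math

def margin_of_error(z_star, sigma, n):
    """Margin of error of a z interval: z* * sigma / sqrt(n)."""
    return z_star * sigma / math.sqrt(n)

base = margin_of_error(1.96, 15, 50)
print(round(base, 2))                             # baseline
print(round(margin_of_error(1.645, 15, 50), 2))   # lower confidence: smaller margin
print(round(margin_of_error(1.96, 7.5, 50), 2))   # half the sigma: half the margin
print(round(margin_of_error(1.96, 15, 200), 2))   # four times the n: half the margin
```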
Example and Steps to solving
Suppose the manufacturer of video
terminals wants to test screen tension. We
know that when the process is operating
properly, σ = 43. Here are the tension
readings from an SRS of 20 screens:
269.5 297 269.6 283.3 304.8 280.4 233.5 257.4 317.5 327.4
264.7 310 307.7 343.3 328.1 342.6 338.8 340.1 374.6 336.1
Steps

1. Parameter: identify the population of interest and the
parameter you want to draw conclusions about. The
population here is “all video terminals”. We want to estimate μ,
the mean tension for all these screens.

2. Conditions: choose the appropriate inference procedure and
verify the conditions for using it. Since we know σ, we should
use a one-sample z interval. Now check the requirements:
   1. SRS: yes, the readings come from an SRS of 20 screens.
   2. Normality: is the sampling distribution approximately normal? (Yes.)
   The sample size (n = 20) is too small to rely on the central limit
   theorem (n > 30 is our cutoff), so we look at a boxplot of the sample
   tension readings (calc): no outliers or strong skewness. The normal
   probability plot tells us that the sample data are approximately
   normally distributed. These data give us no reason to doubt the
   normality of the population from which they came.
   3. Independence: since we are sampling without replacement, we
   must assume that at least (10)(20) = 200 video terminals were
   produced that day.

See P. 631 for summary!
Step 3: Calculations. If conditions are
met, carry out the CI inference procedure
for a 90% CI.

Enter the data in the calc: x̄ = 306.3
306.3 + 1.645(43/√20) = 322.1
306.3 - 1.645(43/√20) = 290.5

Step 4: Interpretation. We are 90%
confident that the true mean tension μ lies between
290.5 and 322.1. Always state this part
IN CONTEXT! If you wanted to change
the confidence level (say to 99%),
change your Z* (2.576) and you widen
the interval.
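The calculation step can be reproduced from the 20 readings. This is a sketch with illustrative names; it takes σ = 43 as known, as the example does.

```python
import math
from statistics import NormalDist, fmean

readings = [
    269.5, 297, 269.6, 283.3, 304.8, 280.4, 233.5, 257.4, 317.5, 327.4,
    264.7, 310, 307.7, 343.3, 328.1, 342.6, 338.8, 340.1, 374.6, 336.1,
]
sigma = 43  # known process standard deviation from the example

xbar = fmean(readings)
n = len(readings)

def interval(confidence):
    """z interval for the mean tension at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

print(round(xbar, 1))                          # 306.3
print([round(v, 1) for v in interval(0.90)])   # [290.5, 322.1]
print([round(v, 1) for v in interval(0.99)])   # wider than the 90% interval
```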
Sample size for a desired
margin of error
Note: it’s the size of the sample that determines the
margin of error. The size of the population does not
influence the sample size we need (this is true as
long as the population is much larger than the
sample).
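Setting the margin of error Z*(σ/√n) to be at most a target m and solving for n gives n ≥ (Z*σ/m)², always rounded up. A minimal sketch follows; the function name is illustrative, and the target margin of 10 (with σ = 43 borrowed from the tension example) is a hypothetical value chosen only to show the arithmetic.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= margin (always round up)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical target: margin of error 10 at 95% confidence, sigma = 43
n_needed = sample_size_for_mean(sigma=43, margin=10, confidence=0.95)
print(n_needed)  # 72
```

The population size never appears in the calculation, which is the point of the note above.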
What if we don’t know σ?

We previously made the unrealistic assumption that we
knew the value of σ. In practice, σ is usually unknown so
the one sample z interval is rarely used in real life.

So, we use our sample standard deviation Sx as an
estimate for σ. But we must be punished/penalized for
this!

We divide Sx by √n to estimate the standard deviation of
the sample mean, so this estimate now changes from sample
to sample and with the size of our sample. We call this
‘estimated’ standard deviation the ‘standard error’.

Because of this extra uncertainty, we can’t use a normal “Z”
distribution for our critical values… instead we use “t”.
Critical T’s
As our n gets bigger, the t distribution gets
closer and closer to the normal Z distribution.
The t distribution is based on degrees of
freedom, which is (n-1) instead of n.
As our sample size gets bigger, n-1 differs less
and less from n.
Table C gives us critical values for t based on
the degrees of freedom (n-1); so does the calc
(calc is preferable).
Formula
So the only things that change when we don’t
know our population standard deviation are that our
critical value is now a critical t (we can use the
table or the calc… calc recommended) and
the standard deviation we are using is the standard
error Sx/√n. The interval is x̄ ± t*(Sx/√n).
On the calc, same as the Z interval, just choose
TInterval.
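A minimal sketch of the t interval in Python follows. The standard library has no t distribution, so the critical value is passed in from Table C. For illustration only, the sketch reuses the 20 screen tension readings and pretends σ is unknown; t* = 1.729 is the table value for 90% confidence with df = 19.

```python
import math
from statistics import fmean, stdev

def t_interval(data, t_star):
    """One-sample t interval: xbar +/- t* * s / sqrt(n), t* looked up for df = n - 1."""
    n = len(data)
    xbar = fmean(data)
    s = stdev(data)                     # sample standard deviation (divides by n - 1)
    standard_error = s / math.sqrt(n)
    margin = t_star * standard_error
    return xbar - margin, xbar + margin

readings = [
    269.5, 297, 269.6, 283.3, 304.8, 280.4, 233.5, 257.4, 317.5, 327.4,
    264.7, 310, 307.7, 343.3, 328.1, 342.6, 338.8, 340.1, 374.6, 336.1,
]

low, high = t_interval(readings, t_star=1.729)
print(round(low, 1), round(high, 1))
```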
Paired t procedures
Comparative studies are more convincing
than single-sample investigations
(for example, a matched pairs design).
We use paired t procedures to compare two treatments
on a pair of similar subjects, or before-and-after
observations on the same subject.
Important distinction




There are 2 types of studies we learned about earlier: matched pairs designs
(which include before-after studies on each individual in our sample, and
comparisons between the two members of a pair of similar individuals that we
split and assigned to 2 treatments), and comparative studies of 2
INDEPENDENT groups.
When calculating the t interval on a matched pairs design, we are interested
in the DIFFERENCE between the 2 conditions (whether this is a before/after on
one individual, or 2 similar individuals being compared). You will always have
an equal number in both groups if you are doing matched pairs. For this you
define L3 as L1-L2 and do a 1-sample t interval on L3.
In a comparative independent samples design, the 2 samples are
INDEPENDENT groups (and therefore may even have different numbers in
each). They are not matched up in any way; this is the situation that calls
for a 2-sample t interval based on L1 and L2.
*For this chapter, 99% of examples will be of the first variety where you take
the differences and do a one-sample interval on L3. In later chapters we
deal with situation 2 more, but it still helps to recognize the difference now.
Example
Caffeine dependence/depression.
The population is all people dependent on
caffeine. We want to estimate the mean
difference μ_diff = placebo - caffeine in
depression scores.

11 people were tested and their scores on a
depression test measured under each condition
(placebo vs. caffeine) (P. 652).
Calc: each subject is measured under both conditions, so
this is matched pairs data. Put the placebo scores in L1 and
the caffeine scores in L2, define L3 as L1 – L2, and do a
1-sample t interval on L3 (not a 2-sample t interval).
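Depression data

| Subject | Caffeine | Placebo |
|---------|----------|---------|
| 1       | 5        | 16      |
| 2       | 5        | 23      |
| 3       | 4        | 5       |
| 4       | 3        | 7       |
| 5       | 8        | 14      |
| 6       | 5        | 24      |
| 7       | 0        | 6       |
| 8       | 0        | 3       |
| 9       | 2        | 15      |
| 10      | 11       | 12      |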
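The matched pairs calculation can be sketched as follows. Two assumptions are worth flagging: the sketch uses only the ten subject rows listed in the data above, and the 95% confidence level is chosen for illustration because the slide does not state one. The critical value t* = 2.262 is the Table C value for 95% confidence with df = 9.

```python
import math
from statistics import fmean, stdev

# The ten subject rows listed in the depression data
caffeine = [5, 5, 4, 3, 8, 5, 0, 0, 2, 11]
placebo = [16, 23, 5, 7, 14, 24, 6, 3, 15, 12]

# Matched pairs: reduce to one list of differences (the calculator's L3 = L1 - L2)
diffs = [p - c for p, c in zip(placebo, caffeine)]

n = len(diffs)
mean_diff = fmean(diffs)
s_diff = stdev(diffs)
t_star = 2.262  # Table C: 95% confidence, df = n - 1 = 9 (assumed level)
margin = t_star * s_diff / math.sqrt(n)

print(diffs)
print(round(mean_diff, 2), round(s_diff, 2))
print(round(mean_diff - margin, 2), round(mean_diff + margin, 2))
```

Once the differences are formed, the two original columns are no longer needed: everything after that line is an ordinary one-sample t interval.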
Robustness
If outliers are present in the sample data,
then the population may not be Normal.
The t procedures are NOT robust against
outliers because x̄ and s are not resistant to
outliers.
CI’s for proportions

As always, inference is based on the sampling
distribution of a statistic.

Center: the mean of p-hat is rho (ρ), the population
proportion p. So the sample proportion (p-hat) is an
unbiased estimator of the population proportion p.

Spread: the standard deviation of p-hat is √[ρ(1-ρ)/n],
provided that the population is at least 10 times as large
as the sample.

Shape: If the sample size is large enough that both np
and n(1-p) are at least 10, the distribution of p-hat is
approximately normal.



In reality, we don’t know the value of rho (if
we did, we wouldn’t need to construct a CI
for it!)
So we cannot check whether (n)(rho) and
n(1-rho) are at least 10.
In large samples, p-hat will be close to rho, so
we replace rho by p-hat when checking the
values of (n)(rho) and n(1-rho), and our
Standard Error (the estimated standard
deviation of p-hat) is √[p-hat(1 - p-hat)/n].
Remember: p-hat (sample proportion) is
the number of successes in your sample
divided by the total number of individuals in
the sample.
Calculator- CI for a proportion
Press STAT, choose TESTS and 1-PropZInt.
Enter x (let’s say 246), n (let’s say 439) and
C-Level (.95). Calculate.
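The same interval can be sketched in Python with the example inputs x = 246 and n = 439. The function name `one_prop_z_interval` is illustrative; it computes p-hat ± Z* times the standard error and first checks the large-sample condition using p-hat in place of the unknown proportion.

```python
import math
from statistics import NormalDist

def one_prop_z_interval(x, n, confidence=0.95):
    """Large-sample z interval for a population proportion."""
    p_hat = x / n
    # Condition check with p-hat standing in for the unknown proportion
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin

low, high = one_prop_z_interval(246, 439, 0.95)
print(round(low, 3), round(high, 3))  # 0.514 0.607
```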
Choosing a sample size
 When
planning a study, we may want to
actively choose a sample size that will
allow us to estimate the parameter within
a given margin of error.
P*


When calculating sample size for a specific
margin of error, we often don’t know p-hat or
rho (we are running the study in the first
place to find this out!)
When you don’t have a ‘best guess estimate’
for the proportion of successes in the
population, use P* = .5. That’s our most
conservative guess: P*(1 - P*) is largest at .5
(50/50), so it gives the largest margin of error
and therefore the largest required sample size.

Obviously, if you are given an estimate of rho, or
you have a p-hat from an earlier study, use that:
it’s our best estimate.
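A quick numerical check shows why .5 is the conservative choice. The sample size n = 1000 and Z* = 1.96 below are hypothetical values used only to compare margins across different guesses for the proportion.

```python
import math

def margin_for(p, n=1000, z_star=1.96):
    """Margin of error z* * sqrt(p(1 - p)/n) for a guessed proportion p."""
    return z_star * math.sqrt(p * (1 - p) / n)

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(margin_for(p), 4))
```

The margin peaks at p = .5 and is symmetric around it, so planning with P* = .5 never underestimates the sample size needed.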
Example: P* unknown

A company wants to do a customer service
survey where customers rate the service on a
scale of 1 – 5, with 4 being satisfied and 5
being very satisfied. The President is
interested in the percent of customers who
rate them a 4 or a 5. She wants the estimate
to be within 3% at a 95% confidence level. It’s
too expensive/unreliable to try to question
every customer, so how many people should
they survey?
Example continued

P* = .5 since we don’t know the true population
proportion (rho)

1.96 √[(.5)(.5)/n] ≤ .03
Do some algebra…
[(1.96)(.5)]/.03 ≤ √n, so √n ≥ 32.67
n ≥ 1067.11
So we round up to 1068 participants.
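The algebra above generalizes to n ≥ (Z*/m)² P*(1 - P*), rounded up. A minimal sketch, with an illustrative function name:

```python
import math

def sample_size_for_proportion(margin, z_star=1.96, p_star=0.5):
    """Smallest n with z* * sqrt(p*(1 - p*)/n) <= margin (always round up)."""
    return math.ceil((z_star / margin) ** 2 * p_star * (1 - p_star))

print(sample_size_for_proportion(0.03))  # 1068
```

Rounding up rather than to the nearest integer matters here: 1067 respondents would leave the margin of error slightly above 3%.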




*News reports frequently describe the results of
surveys with sample sizes between 1000 and 1500
and a margin of error about 3%.
Summary
 See
P. 679 for a good summary…