### INFERENCE

```INFERENCE
What can you find out about a
population by looking at a sample?
Getting started
• You need a population to sample
• There should be a reason to sample
The koala learning activity was developed by Anthony
Are the koalas healthy?
Take a sample and make a dotplot
What do you notice?
The median of the population is likely to
be within the range of sample medians.
The median weight of the female koalas is likely to be
between 4.7kg and 5.4kg.
Are the koalas healthy?
Making an inference
The actual population median is 5.1kg.
Usually we only see one sample.
We make an inference that the population median is
the same as the sample median (even though we know
that it is probably not exactly the same).
This is called a point estimate.
Making an interval estimate
At NZC level 7, the idea of the interval is developed
further.
Taking samples of different sizes and collecting the
medians, you can demonstrate that there is less
variation in the medians of large samples than the
medians of small samples.
Collections of medians
[Figure: dot plots of sample medians, titled "Measures from Sample size 15", "Measures from Sample size 60" and "Medians from 200 samples of size 30"; horizontal axis: median, 40 to 110]
Lindsay Smith, University of Auckland
Stats Day 2011
What else might affect the
uncertainty in estimating the
population median?
• The spread of the population
• Comparing the heights of intermediate school students (years 7 and 8) with the heights of junior high school students (years 7 to 10)
Sampling variability: effect of
[Figure: dot plots of height for the Intermediate and Middle School populations, with box plots of height for a sample of Intermediate and a sample of Middle School; horizontal axis: height, 100 to 200]
population
• Best estimate: using the IQR of our sample
• Using the quartiles of our sample as point estimates for the quartiles of the population
Providing an interval estimate (a confidence
interval) for the population median
There are two factors which affect the uncertainty of estimating
the parameter:
1. Sample size
2. Spread of population, estimated with sample IQR

How confident do we want to be that our interval estimate
contains the true population median?
Development of formula for confidence interval
population median = sample median ± measure of spread / √sample size
To ensure we predict the population median 90% of the time:
population median = sample median ± […] / √sample size
population median = sample median ± 1.5 × IQR / √n
Justification for the calculation
Based on simulations:
• The interval includes the true population median for 9 out of 10 samples, so the population median is probably somewhere in the interval.
• This lets us make a claim about two populations when their intervals do not overlap.
• Sampling variation alone produces a shift large enough to lead to a mistaken claim only about once in 40 pairs of samples.
Comparing two populations
• Sampling variation is always present and will cause a shift in the medians.
• We are looking for sufficient evidence: a shift in the intervals for the median that is big enough to claim that there is a difference back in the populations.
“NCEA level 2 is not an endpoint. It is a platform.”
```
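The interval formula from the slides, sample median ± 1.5 × IQR / √n, and the claim that it captures the population median for roughly 9 samples in 10 can be explored with a short simulation. The sketch below is only an illustration under stated assumptions: the population is a made-up, normally distributed set of weights (not the koala data from the activity), the function names `informal_interval` and `coverage` are hypothetical, and the quartiles use Python's "inclusive" method, which may differ slightly from the method used in classroom software.

```python
import math
import random
import statistics


def informal_interval(sample):
    """Return (low, high) for sample median ± 1.5 × IQR / √n."""
    n = len(sample)
    median = statistics.median(sample)
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    margin = 1.5 * (q3 - q1) / math.sqrt(n)
    return median - margin, median + margin


def coverage(population, sample_size, trials, rng):
    """Fraction of random samples whose interval contains the population median."""
    true_median = statistics.median(population)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        low, high = informal_interval(sample)
        if low <= true_median <= high:
            hits += 1
    return hits / trials


rng = random.Random(2011)  # fixed seed so the run is repeatable

# Assumption: a hypothetical population of 5000 weights in kg.
population = [rng.gauss(5.1, 0.8) for _ in range(5000)]

# One sample gives one interval estimate for the population median.
one_sample = rng.sample(population, 30)
print("interval from one sample:", informal_interval(one_sample))

# Repeating the sampling shows how often the interval captures the true median.
print("estimated coverage:", coverage(population, 30, 1000, rng))
```

Changing `sample_size` or the spread of the hypothetical population (the second argument to `rng.gauss`) shows how the two factors named in the slides, sample size and population spread, change the width of the interval.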