### Population and Sampling

```
What is Statistics?
Statistics is the gathering, organizing, analyzing, and presenting of numerical information.
The data gathered by statistical studies are used to guide decisions, explain events, predict future courses of action, or provide the basis for a solution to a problem.
Population vs. Sample
Once you have decided on the topic you wish to study, the first major step involves gathering the data.
From whom you are going to gather the data is
Population: all individuals who belong to a group being studied (the group being studied)
Sample: a selection of individuals taken from a population
People that are actually
Identify the population for each of the following questions.
a) Whom do you plan to vote for in the next Ontario election?
   Answer: All Canadian citizens of voting age who live in Ontario
b) Do women prefer to wear ordinary glasses or contact lenses?
   Answer: Women who require corrective eyewear
Determine if each of the following is a sample or a population.
a) A representative from each hockey team is asked to complete a survey on game times.
   Answer: Sample
b) Canada census survey
   Answer: Population
c) One in every 10 bottles of pop is tested for defects in a factory.
   Answer: Sample
Types of Data and Sampling
Once you have determined the population that you are considering for your study, the next step in completing your study is obtaining a sample that best represents it.
Sample selection is one of the key factors that will determine if your survey is valid and will produce legitimate conclusions.
Types of Data
Raw Data
This is the name given to data that has not yet been analyzed, only collected.
Discrete Data
There is a limit to the categories that data can be placed in.
Ex: The soft drink size at the movie theatre. There are only the 4 categories and it is not possible to go in between them.
Continuous Data
The data can take on any real value in a range, including decimal values with any number of decimal places.
Discrete Data (examples)
Population numbers
Counts of physical objects where fractions don’t make sense (people)
Continuous Data (examples)
Time (you can win a race in 3 seconds, or 3.4 seconds, or 3.148 seconds, etc.)
Length
Mass
4 Types of Data

Nominal Data (Discrete)
This is data that can be sorted into categories, but those categories cannot be ranked or quantified.
Ex: A survey asks what type of food you prefer: Chinese, Italian, American or Indian.

Ordinal Data (Discrete)
Data is organized into rankings.
Ex: Rank your top five favourite movies. Matrix = 1, Batman Begins = 2, etc.
The actual numbers do not matter as long as they preserve the ranking that you want.
Ex: Matrix = 100, Batman Begins = 300

Interval Data (Discrete)
Data is categorized into numerical groupings in which the distance between these groupings is the same.
The initial or zero point is arbitrary.
Ex: The interval 2006-2007 is the same as 2005-2006.
Ex: IQ intervals

Ratio Data (Continuous)
Continuous data measured from a true zero point is ratio data.
The name comes from the fact that ratios between values are meaningful: a time of 20 seconds is twice as long as a time of 10 seconds.
Ex: Your time in the 100 m dash
Sampling
The method used to collect sample data from a population is very important and can mean the difference between a credible conclusion and a biased one.
Simple Random Sampling
Gives all the elements of the population an equal chance of being a part of the sample.
The selection must be as impartial as possible and must not favour one element over another.
Systematic Sample
Selecting a sample from a population is done systematically, through a constant counting process.
Ex: Picking every 100th person from a phone book.
To determine whether you should choose every 5th or every 100th item, find the ratio of the population size to the sample size.
If you wanted a tenth of the population, then select every 10th item.
Ex: A telephone company is planning a marketing survey of its 760 000 customers. For budget reasons, the company wants a sample size of about 250.
a) Determine the interval that should be used for a systematic sample.
interval = population size / sample size
         = 760 000 / 250
         = 3040
Therefore the company should be selecting every 3040th customer for their survey.
Stratified Sample
Takes into account that a population is made up of many demographics that tend to react differently.
If a population of turtles has more females than males, and the sample is purposely weighted with more females than males in proportion to the population, it is a stratified sample.
To determine how many subjects to select from each subgroup, find the fraction of the population that the subgroup makes up and multiply by the number desired in the sample:
number from subgroup = (# subgroup / population) x # sample
Ex: Before booking bands for the school dances, the students’ council at Statsville H.S. wants to survey the music preferences of the student body. The following table shows the enrolment at the high school.

Grade   # Students
9       255
10      232
11      209
12      184
Total   880

a) Design a stratified sample for a survey of 25% of the student body.
25% of the student body is 880 x 0.25 = 220
For grade 9:
(# subgroup / population) x # sample = (255 / 880) x 220 = 63.75
Therefore 64 gr 9’s should be selected.
Complete this step for each grade and you should get that there should be:
• 64 gr 9’s
• 58 gr 10’s
• 52 gr 11’s
• 46 gr 12’s
To check, they should add up to 220: 64 + 58 + 52 + 46 = 220
Cluster Sample
Takes advantage of groups whose characteristics are similar to those of the other groups in the population.
Ex: Randomly selecting whole classes, assuming the classes are randomly composed.
Multi-Stage Sample
Uses compound randomization.
Ex: A study that determines passenger safety in cars randomly picks a car manufacturer (stage 1), then randomly picks a vehicle type such as a van, compact or truck (stage 2), then randomly picks a type of car in that class (stage 3).
Ex: Suppose that your population consisted of all Ontario households. How would you create a Multi-Stage Sample?
You could first randomly select from the different towns/cities in Ontario.
Then randomly select a sample of blocks or subdivisions within the selected cities.
Finally, you could select from individual homes on those blocks.
Voluntary-Response Sample
Depends on the initiative of the sample itself.
Ex: Internet and mail polls
Elements selected for the sample may or may not respond.
This creates a potential bias.
Convenience Sample
Samples local elements that are nearby or elements that are accessible with little or no cost.
Ex: Telephone or internet
Homework
Pg 117 #4,6,8,9,11
Pg 123 # 1-6
```
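The arithmetic behind the systematic and stratified examples in these notes can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`systematic_interval`, `systematic_sample`, `stratified_allocation`, `simple_random_sample`) are hypothetical and do not come from the notes, and rounding each subgroup to the nearest whole subject is an assumption that happens to work for the Statsville H.S. numbers but will not always add up to the desired sample size.

```python
import random


def systematic_interval(population_size, sample_size):
    """Interval k for a systematic sample: select every k-th element.

    Assumption: integer division, so a non-exact ratio is rounded down.
    """
    return population_size // sample_size


def systematic_sample(items, interval, start=0):
    """Take every interval-th item, beginning at index `start`.

    Assumption: `start` defaults to 0; in practice it could be chosen
    at random from the first interval.
    """
    return items[start::interval]


def stratified_allocation(strata, sample_size):
    """(# subgroup / population) x # sample, rounded to whole subjects.

    `strata` maps a subgroup name to its size in the population.
    Rounded counts are not guaranteed to sum to `sample_size`.
    """
    population = sum(strata.values())
    return {
        name: round(count * sample_size / population)
        for name, count in strata.items()
    }


def simple_random_sample(items, sample_size, seed=None):
    """Draw a sample in which every element has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(items, sample_size)


# Telephone company example: 760 000 customers, sample of about 250.
print(systematic_interval(760_000, 250))  # 3040

# Statsville H.S. example: 25% of 880 students is 220.
enrolment = {"9": 255, "10": 232, "11": 209, "12": 184}
print(stratified_allocation(enrolment, 220))
# {'9': 64, '10': 58, '11': 52, '12': 46}
```

Summing the values returned by `stratified_allocation` gives 220 here, which matches the check in the notes. For other enrolment numbers the rounded counts may be off by one or two, and one subgroup would need to be adjusted by hand.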