### 5. Dr Valarmathy - Karpagam Faculty of Medical Sciences and Research

```
SAMPLING TECHNIQUES
Mrs. S. Valarmathi, M.Sc., M.Phil.,
Research Officer,
Department of Epidemiology
The Tamil Nadu Dr. MGR Medical
University
Is it possible to taste the whole sambar and add salt?
NO
Is it possible to work out what 50 million people
think by asking only 1000?
YES
What exactly IS a Population?
The entire group under study as defined by research objectives.
Sometimes called the “universe.”
The totality or aggregate of all individuals with the specified
characteristics is a population.
TYPES OF POPULATION
•Finite
•Infinite
•Hypothetical
What exactly IS a “sample”?
A subset of the population.
What exactly IS “sampling”?
Selecting and studying a small
number of subjects from a
specified population in order
to draw inferences about the
whole population
Sampling Terminology
Who do you want to generalize to? THEORETICAL POPULATION
What population can you get access to? STUDY POPULATION
How can you get access to them? SAMPLING FRAME
Who is in your study? THE SAMPLE
Sampling and representativeness
Theoretical Population → Study Population → Sample
Sampling fraction = n / N (sample size n over population size N)
Sample
• Representativeness expresses the degree to which the
sample data precisely characterize the population.
• The sample should reflect the characteristics of the
population under study.
• The strength of statistical inference also depends on
representativeness.
• Inferences about the population are usually stated at a
95% or 99% confidence level.
Survey Errors
• Random / Sampling Errors
• Systematic / Non-sampling Errors
Why Sampling Errors?
When you take a sample from a population, you only
have a subset of the population: a piece of what you’re
trying to understand.
Sampling error can be reduced simply by increasing the sample size!
Standard Error
[Figure: means of samples I to V plotted against the population mean]
The sampling distribution
• The distribution of a statistic (such as the mean)
across an infinite number of samples of the same
size as the sample in your study is known as the
sampling distribution.
Standard Error
• The standard deviation of
the sampling distribution.
• It tells us something about
how different samples
would be distributed
• A measure of sampling
variability
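A worked illustration of sampling variability, under assumptions not stated
in the slides: for the mean of a simple random sample of size n drawn from a
large population, the standard error is commonly estimated as
    SE = s / sqrt(n)
where s is the sample standard deviation. With a hypothetical s = 10:
    n = 100  ->  SE = 10 / 10 = 1.0
    n = 400  ->  SE = 10 / 20 = 0.5
so quadrupling the sample size halves the standard error.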
Systematic / Non-sampling Errors
• Occurs whether a census or a sample is being
used.
• Results solely from the manner in which the
• Cannot be measured.
Types of Non-sampling Errors
• Coverage error: units excluded from the frame.
• Non-response error: non-responses.
• Measurement error
TYPES OF SAMPLING
• Non-Probability Sampling: Convenience, Judgement, Quota, Snowball
• Probability Sampling: Simple Random, Systematic, Stratified, Cluster
Convenience Sampling
The sample is identified primarily by convenience.
Examples:
“Man on the street”
Medical students in the library
Volunteer samples
Patients coming to the OP (outpatient department)
Problem: No evidence for representativeness.
HAPHAZARD SAMPLE
Judgment Sampling
The sampling procedure in which an
experienced researcher selects the sample
based on some appropriate characteristic of
sample members, to serve a purpose
(Purposive sampling, Deliberate sampling)
Quota Sampling
Attempt to be representative by selecting
sample elements in proportion to their
known incidence in the population
Snowball sampling
Typically used in qualitative research
When members of a population are difficult to
locate, hidden activity groups, non-cooperative
groups
Recruit one respondent, who identifies others,
who identify others,….
Primarily used for exploratory
purposes
Respondent Driven Sampling
• Applicable for hidden, hard-to-reach populations,
e.g. MSM (men who have sex with men), IDU (injecting drug users)
• A systematic form of snowball sampling with a
unique identification procedure.
• Depends on the social network of the target population
• Under certain assumptions may be treated as
a random sample
Steps involved in RDS
• Begin with a small set of identified seeds.
• Seeds recruit peers, who recruit their peers,
etc., continued till required sample size is
achieved.
• Recruits are linked by coupons with unique
identifying numbers.
• Incentives provided for participation and each
successful recruit.
[Figure: recruitment chain growing from a seed through Waves 1 to 5]
• No need of sampling frame / mapping
• Ease of field operations: target members
recruit samples for you.
• Reach less visible segment of population
[Figure: CCPUR IDU network, with members marked HIV -ve or HIV +ve]
Non-Probability Sampling Methods
Convenience sampling relies upon convenience and access
Judgment sampling relies upon belief that participants fit characteristics
Quota sampling emphasizes representation of specific characteristics
Snowball sampling relies upon respondent referrals of others with like characteristics
Probability samples
A sampling method that selects subjects with a known,
non-zero probability.
Removes the possibility of bias in the selection of subjects.
Allows application of statistical theory to results.
Important when one wishes to generalize the
findings of the sample to the larger population from
which the sample is selected.
Simple random sampling
Applicable when population is small, homogeneous &
Required number of units are selected randomly.
Each unit of the frame has an equal, non-zero
probability of selection.
Simple random sampling
Merits
• Easy to implement if a list frame is available or the
population is small
• Approximately satisfies the sampling model on
which conventional statistics is based, so we can
carry out complex analyses
Demerits
• Need complete list of units
• Units may be scattered
SRS METHODS
1. LOTTERY METHOD
2. RANDOM NUMBERS TABLE
3. Computer Generated Random numbers
Simple random sampling
Example: evaluate the prevalence of
hypertension among the 1200 children
aged 14 to 17 years attending a school.
List of children attending the school
Children numbered from 1 to 1200
Sample size = 100 children
Random sampling of 100 numbers between 1
and 1200
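As a worked check on these numbers (not part of the original slide), the
sampling fraction is n / N = 100 / 1200 ≈ 0.083, so each child on the list
has a 1 in 12 chance of being selected.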
Simple random sampling
Table of random numbers
57172 33883 77950 11607 56149 80719 93809 40950
12182 13382 38629 60728 01881 23094 15243 53501
07698 22921 68127 55309 92034 50612 81415 38461
07556 60557 42088 87680 67344 11596 55678 65101
19505 86216 59744 48076 94576 32063 99056 29831
21100 58431 24181 25930 00501 10713 90892 84077
98504 44528 24587 50031 70098 28923 10609 01796
38169 77729 82000 48161 65695 73151 48859 12431
46747 95387 48125 68149 01161 79579 37484 36439
69853 41387 32168 30953 88753 75829 11333 15659
87119 24498 47228 83949 79068 17646 83710 48724
75654 23898 08846 23917 05243 25405 01527 43488
99278 65660 06175 54107 17822 08633 71626 05622
26902 09839 15859 17009 49931 83358 45552 24164
41125 35670 17152 23683 01331 07421 16181 23463
17046 13211 28751 72554 61221 09190 49946 08049
64864 30237 29959 45817 74577 67119 94303 75230
86776 35513 14291 38453 66516 10853 88163 97869
39641 49168 31460 71120 80855 77021 76825 74305
37545 68698 54986 77795 43909 89405 42791 00614
67448 56624 48980 94057 74773 63154 78796 04038
74462 88092 36970 02048 91507 91715 02035 46279
18239 68196 47201 08759 38964 41870 49607 70743
75889 49529 31286 27549 56684 51834 66391 58116
73099 75246 14551 72201 99522 31522 16050 49881
10910 22705 47687 75634 85224 45611 83534 26300
Generating Random Numbers
• This is a better and perhaps more
efficient way of selecting a simple random
sample.
•Computers and even your calculators
can be used to generate random digits.
The randomly produced digits can be
used to pick your samples.
•However, a complete listing of the
members of the population is needed in
this type of random selection.
Excel:
Enter the function =RANDBETWEEN(bottom, top)
in any blank cell, e.g. =RANDBETWEEN(1, 1200)
F9 refreshes the random
numbers
Through calculator:
Press SHIFT · = (RAN#)
Systematic random sampling
The defined target population is ordered and
the sample is selected according to position
using a skip interval
Systematic random sampling
Systematically spreads sample through a list
of population members
In nearly all practical examples, the
procedure results in a sample equivalent to
SRS
INTERVAL SAMPLING
Systematic random sampling
N = 1200 and n = 60
→ sampling interval = 1200/60 = 20
List persons from 1 to 1200
Randomly select a number between 1 and 20 (e.g. 8)
→ 1st person selected = the 8th on the list
→ 2nd person = 8 + 20 = the 28th
etc.
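Carrying the same numbers through to the end (a worked sketch, assuming the
random start of 8 and the interval of 20 used above): the k-th person
selected sits at position 8 + (k - 1) × 20, for k = 1, ..., 60.
    k = 1   ->  8
    k = 2   ->  28
    k = 3   ->  48
    ...
    k = 60  ->  8 + 59 × 20 = 1188
The last position, 1188, does not exceed N = 1200, so the list yields
exactly n = 60 persons.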
Systematic sampling
1. Take care that there is no systematic rhythm to
the flow or list of people.
2. If every Kth person on the list is, say, “rich” or
“senior”, or follows some other consistent pattern,
avoid this method.
Stratified Random Sampling
A method of probability sampling
in which the population is divided
into different subgroups and samples
are selected from each
Stratified Random Sampling
Methods
1. Proportional Allocation Method
2. Equal Allocation Method
Proportional Allocation Method
Epidemiological profile of tuberculosis in children under
12 years of age. Sample size is 120.
Centre 1: 56% → 67
Centre 2: 24% → 29
Centre 3: 20% → 24
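The figures above are consistent with multiplying each centre's share by the
total sample size and rounding to the nearest whole number (the rounding rule
is an assumption; the slide does not state it):
    Centre 1: 0.56 × 120 = 67.2  ->  67
    Centre 2: 0.24 × 120 = 28.8  ->  29
    Centre 3: 0.20 × 120 = 24.0  ->  24
    Total:    67 + 29 + 24 = 120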
Equal Allocation Method
Epidemiological profile of tuberculosis in children under
12 years of age. Sample size is 120.
Centre 1: 40
Centre 2: 40
Centre 3: 40
[Figure: stratified sampling in Chennai. Zones: North, South, Central, East.
Each zone is split into Govt. and Private schools; 5 schools are taken from
each; 10 students from each school (5 boys, 5 girls).]
Cluster Sampling
• The population itself is divided into a number of natural
groups known as clusters (geographic or
organizational).
• The units are heterogeneous within cluster but
homogeneous between clusters.
• Cluster sample is obtained by selecting the clusters
by simple random sampling and all the units in the
sampled clusters are included in the sample
Cluster Sampling
– Sampling frame is not required
– Simple and Easy
– Less resources required
– Imprecise if units within clusters are
homogeneous
Cluster Sampling
• Randomly select clusters and include all
subjects (one-stage), or
• Randomly select clusters and then select
subjects randomly within them (two-stage)
Cluster Sampling
Especially useful for door-to-door personal
surveys (significantly reduces costs)
However, clustering increases sampling
errors (people who live close together tend
to be more similar)
Drawing the clusters
You need :
Map of the region
Distribution of population (by Taluks or area)
Age distribution (population aged 5-12: 3%)
Taluks: Mettur, Sankari, Salem, Omalur, Yercaud, Attur, Gangavalli
(seven names survive for the nine rows below, so the rows are left unlabelled)

Pop.      5-12    Clusters
53000     1600    5
7300      220     1
106000    3200    10
13000     400     1
26500     800     2
6600      200     1
40000     1200    4
6600      200     1
53000     1600    5

Total aged 5-12 = 9420; total clusters = 30
Then compute the sampling interval: K = 9420/30 = 314
Drawing households and children
On the spot
Go to the center of the Taluk and choose a direction
(at random)
Number the houses in this direction
→ e.g. 21 houses
Draw a random number (between 1 and 21) to
identify the first house to visit
From this house, progress until 7 children have been
found (itinerary rules fixed beforehand)
Multistage Sampling
• Selection of subjects is done stage by stage
• Any one of the sampling schemes can be
applied during each stage
• Multistage sampling generally ends with
unequal selection probabilities for sampling units
• Analysis procedure becomes more complex.
Multistage Sampling
– No sampling frame of population required
– Most feasible approach for large populations
– Several sampling lists
– Needs more man power
Multi Stage Sampling
• District Level Household & Facility Survey
– Stage – 1 Selection of District
– Stage – 2 Selection of Villages
– Stage – 3 Selection of Households
• Immunization coverage in a state
– Stage – 1 Selection of District
– Stage – 2 Selection of PHCs
– Stage – 3 Selection of Subcenters
Probability Sampling Methods
Simple random sampling relies upon simple randomization
Systematic sampling relies upon the sampling interval
Stratified sampling emphasizes dividing into groups and subgroups
Cluster sampling relies upon geographical or organizational groups
Factors Affecting Choice of
Sampling Design
Sampling Frame: Existence and Size
Costs
Precision Desired
Sub-Population Comparisons
TO SUMMARIZE
•Population
•Sample
•Standard Error
•Types Of Sampling
•Non Probability sampling
•Probability Sampling
Social actors are not predictable
like objects.
Randomized events are irrelevant
to social life.
Probability sampling is expensive
and inefficient.
Therefore…
Non-probability sampling is the
best approach.
We want to generalize to the
population.
Random events are predictable.
We can compare random events
to our results.
Therefore…
Probability sampling is the best
approach.
Conclusions
Probability samples are the best
Beware of …
•refusals
•absentees
•“do not know”
Ensure
•Validity
•Precision …..within available constraints
Conclusions
If in doubt…
Call an Experienced Person !!!!
Or
Call a statistician !!!!
Professor: Hope you understand the sampling
techniques.
Student: It is impossible to draw conclusions
for the whole population by drawing samples.
Professor explained the whole thing again.
Student: I am not convinced.
Professor: Well, so next time you go for
a blood test, ask them to extract all the blood.