chapter 4 slides

What is a sample?
Epidemiology matters: a new introduction to methodological foundations
Chapter 4
Seven steps
1. Define the population of interest
2. Conceptualize and create measures of exposures and health indicators
3. Take a sample of the population
4. Estimate measures of association between exposures and health indicators of interest
5. Rigorously evaluate whether the association observed suggests a causal association
6. Assess the evidence for causes working together
7. Assess the extent to which the result matters, is externally valid, to other populations
Epidemiology Matters – Chapter 1
1. Why take a sample?
2. How to take a representative sample
3. Quantifying sampling variability
4. How to take a purposive sample
5. Study design
6. Summary
Epidemiology Matters – Chapter 4
Why take a sample?
• Epidemiologists take samples to answer health-related research questions efficiently
• A full census is the epidemiologic ideal
• Reasons not to take a census every time include lack of time, lack of money, and waste of resources
To take a sample
1. Specify population of interest
2. Specify a research question of interest
Specify population of interest
• What are the characteristics of the population in which we would like to understand health?
• Example: Do we want to know what the prevalence of diabetes is within New York City? New York State? The United States? Do we want to know the causes of diabetes?
• The population of interest has to be specified before the sampling strategy is defined
Specifying a question
• The question of interest can help clarify the appropriate way to sample the population of interest
• Questions asked can include estimating population parameters, or estimating causal effects of exposures on outcomes
Example, estimating population parameters
Questions concerned with population parameters include
• What proportion of individuals in the population of interest has breast cancer?
• What is the mean blood pressure in the population?
• How many new cases of HIV are diagnosed in the population over three years?
Population parameters include estimates of
• Proportions
• Means
• Standard deviations
Sample required
• Representative sample
Example, estimating causal effects of exposures on outcomes
Questions for which these measures are needed include
• Does exposure to pollution cause lung cancer?
• Does suffering abuse in childhood cause depression in adulthood?
• Does a specific genetic marker cause Alzheimer’s disease?
Parameter of interest
• Causal effect of an exposure on a health outcome
Sampling concerns
• Not representativeness (as when estimating population parameters)
• Whether individuals exposed to the hypothesized cause of interest are comparable to individuals not exposed
• A purposive sample is sufficient
Representative and purposive
• A representative sample is one whose characteristics are similar to those of the overall population
• A purposive sample selects from the population based on some criterion
• A representative sample may or may not include individuals who are comparable with respect to causal identification
• A purposive sample may or may not be representative of a particular population of interest
How to take a representative sample
• The simplest approach: a simple random sample
• Each member of the population has an equal probability of being selected into the sample
• A successful simple random sample should have the same basic characteristics as the original population
Taking a simple random sample
1. Enumerate all potential members of the population of interest
2. Assign each member an equal probability of selection
3. Ensure that selections of members are independent
Example: Sampling Farrlandia
30 residents in Farrlandia
Option for random selection: every 4th home, with a dice roll for selection within the home
Challenges include (a) clustered exposures and (b) unequal ‘home’ size
[Figure: Farrlandia residents; those selected for the sample are marked]
Example: Sampling Farrlandia
30 residents in Farrlandia
Option for random selection: select every Nth person in the phone book
Challenges include that not everyone is in the phone book
[Figure: Farrlandia residents; those selected for the sample are marked]
The perfect sample?
• There is no perfect sample
• The goal in epidemiology is to understand limitations of sampling methods and account for them
Sampling Farrlandia
We want to collect our sample in such a way
that the sample also has 50% exposed and 30%
dotted.
We can use a simple random sample
• Sample ½ the population (25 of the 50 residents)
• Probability of selection on any single draw: 1/50, or 2%
• Use a random number generator to make the selections
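The simple random sample described above can be sketched in Python. This is a minimal illustration, not the authors' procedure: the group counts come from the Farrlandia population table in these slides, while the function name `simple_random_sample` and the seed value are assumptions made for the example.

```python
import random

# Farrlandia as tabulated in the slides: 50 residents in four groups.
POPULATION = (
    [("black", "solid")] * 15
    + [("black", "dots")] * 10
    + [("gray", "solid")] * 20
    + [("gray", "dots")] * 5
)

def simple_random_sample(population, k, seed=None):
    """Draw k members without replacement; every member is equally likely."""
    rng = random.Random(seed)  # the seed only makes the draw reproducible
    return rng.sample(population, k)

# Sample half the population (hypothetical seed, chosen arbitrarily).
sample = simple_random_sample(POPULATION, k=25, seed=4)
n_black = sum(1 for colour, _ in sample if colour == "black")
n_dots = sum(1 for _, pattern in sample if pattern == "dots")
print(f"black: {n_black / len(sample):.0%}, dots: {n_dots / len(sample):.0%}")
```

Changing the seed gives a different sample, and the printed percentages will usually be close to, but not exactly, the population values of 50% and 30%.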
Original Population
Group          Count   Percent
Black solid      15      30%
Black dots       10      20%
Total black      25      50%
Gray solid       20      40%
Gray dots         5      10%
Total gray       25      50%
               Original Population     Sample
Group          Count   Percent         Count   Percent
Black solid      15      30%             8       32%
Black dots       10      20%             5       20%
Total black      25      50%            13       52%
Gray solid       20      40%            10       40%
Gray dots         5      10%             2        8%
Total gray       25      50%            12       48%
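The percentages in the comparison can be recomputed from the counts. The sketch below uses only the counts shown in the slides; the helper name `percentages` is an assumption for illustration.

```python
# Counts taken from the Farrlandia tables in the slides.
population = {"black solid": 15, "black dots": 10, "gray solid": 20, "gray dots": 5}
sample = {"black solid": 8, "black dots": 5, "gray solid": 10, "gray dots": 2}

def percentages(counts):
    """Convert group counts to whole-number percentages of the total."""
    total = sum(counts.values())
    return {group: round(100 * n / total) for group, n in counts.items()}

pop_pct = percentages(population)
sample_pct = percentages(sample)
for group in population:
    print(f"{group:12s} population {pop_pct[group]:3d}%   sample {sample_pct[group]:3d}%")
```

The sample percentages sit within a few points of the population percentages, which is what a successful simple random sample is expected to show.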
Quantifying sampling variability
• A sample will rarely yield exactly the same estimates as a complete population census
• The ‘truth’, i.e., the value of the parameter in the original population, is called the true population parameter
Variations in possible samples
38,760 different
possible samples of 5
Quantifying uncertainty, Central Limit Theorem (CLT)
1. Average proportion across all possible samples = true population proportion
Example:
• 50% of the true population has diabetes
• Sample 1 has 100% diabetes
• Sample 2 has 0% diabetes
• The average across all possible samples is 50% diabetes
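The first point can be checked by brute force on a tiny example. The sketch below assumes a hypothetical population of six people, three of whom have diabetes, and enumerates every possible sample of two; the population and sample size are illustrative and do not come from the slides.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population: 1 = has diabetes, 0 = does not (true proportion 50%).
population = [1, 1, 1, 0, 0, 0]
sample_size = 2

# Every distinct pair of people is one possible sample.
samples = list(combinations(population, sample_size))
proportions = [Fraction(sum(s), sample_size) for s in samples]

true_proportion = Fraction(sum(population), len(population))
average_of_samples = sum(proportions) / len(proportions)

print(len(samples), "possible samples")
print("lowest:", min(proportions), "highest:", max(proportions))
print("average:", average_of_samples, "truth:", true_proportion)
```

Individual samples range from 0% to 100% diabetes, yet the average over all of them equals the true proportion.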
2. Variability of sample proportions around their average (standard error)
$SE = \sqrt{\frac{p(1-p)}{n}}$
p = sample proportion
n = sample size
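A short Python sketch of this calculation follows. It assumes the usual standard error of a proportion, the square root of p(1 − p)/n, with p and n as defined above; the function name `standard_error` is an illustrative choice, and the inputs are the 52% "total black" result from the Farrlandia sample of 25.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion p from a sample of size n."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1 - p) / n)

# Farrlandia sample: 13 of 25 sampled residents are black (52%).
se = standard_error(0.52, 25)
print(f"SE = {se:.3f}")
```

With a sample this small, the standard error is roughly ten percentage points, so an estimate of 52% is entirely compatible with a true value of 50%.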
3. With large samples, the sample estimates are approximately normally distributed
Rules of thumb for "large":
• > 30 people
• No group < 5 people
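These rules of thumb can be written as a small check. The sketch below is one possible reading, not a definitive rule: it assumes "group" means the expected number of people with and without the outcome, and the function name and default thresholds are illustrative.

```python
def normal_approximation_ok(n, p, min_total=30, min_group=5):
    """Rule-of-thumb check: more than min_total people, no group below min_group."""
    with_outcome = n * p            # expected number with the outcome (assumed meaning of "group")
    without_outcome = n * (1 - p)   # expected number without the outcome
    return n > min_total and with_outcome >= min_group and without_outcome >= min_group

print(normal_approximation_ok(25, 0.52))   # a sample of 25 is not more than 30
print(normal_approximation_ok(100, 0.02))  # only about 2 people expected with the outcome
print(normal_approximation_ok(100, 0.50))  # both conditions met
```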
Therefore the principal drivers of uncertainty are
1. Prevalence in the sample
2. Sample size
The larger the sample size, the smaller the
amount of uncertainty in the sample estimate
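Both drivers can be seen in a small simulation. The sketch below is a simplified illustration: it draws each person independently with probability `true_p` (in effect sampling from a very large population), and the function name, number of repeats, and seed are assumptions made for the example.

```python
import random
import statistics

def spread_of_estimates(true_p, n, repeats=2000, seed=0):
    """Standard deviation of sample proportions over repeated samples of size n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _rep in range(repeats):
        cases = sum(1 for _draw in range(n) if rng.random() < true_p)
        estimates.append(cases / n)
    return statistics.pstdev(estimates)

# Driver 2: sample size. The spread shrinks as n grows.
for n in (25, 100, 400):
    print("p=0.5, n =", n, "spread =", round(spread_of_estimates(0.5, n), 3))

# Driver 1: prevalence. At the same n, a prevalence far from 50% gives less spread.
print("p=0.1, n = 100 spread =", round(spread_of_estimates(0.1, 100), 3))
```

Quadrupling the sample size roughly halves the spread, which matches the square root of n in the standard error.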
Purposive sample
• Eligibility criteria for the study are the central design element; entry is based on exposure status, or sometimes on health outcome status
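Entry based on exposure status might look like the sketch below. The population, its field names, and the function `purposive_sample_by_exposure` are hypothetical; the point is only that eligibility, not representativeness, determines who enters the sample.

```python
import random

def purposive_sample_by_exposure(population, n_per_group, seed=None):
    """Select n_per_group exposed and n_per_group unexposed people at random."""
    rng = random.Random(seed)
    exposed = [person for person in population if person["exposed"]]
    unexposed = [person for person in population if not person["exposed"]]
    return rng.sample(exposed, n_per_group) + rng.sample(unexposed, n_per_group)

# Hypothetical population: 100 people, 20 of them exposed.
population = [{"id": i, "exposed": i < 20} for i in range(100)]
sample = purposive_sample_by_exposure(population, n_per_group=10, seed=1)
print(sum(p["exposed"] for p in sample), "exposed of", len(sample))
```

Half of this sample is exposed even though only 20% of the population is, so the sample is deliberately not representative with respect to exposure.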
Study design
• Study design considerations are similar for representative or purposive sample
• Study design reflects decisions made at one time point or over time
• Timing of disease process can inform the study design
Study design options
1. Sample at one moment in time, irrespective of disease status; measure disease and potential cause simultaneously
2. Sample over time, starting with disease-free individuals only; measure disease over time
3. Sample at one moment in time, based on disease status
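The three options differ mainly in who is eligible for selection. The sketch below illustrates this on a hypothetical population; the record fields, probabilities, and sample sizes are all assumptions made for the example, not values from the slides.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population records: disease status now and at a later follow-up.
population = []
for person_id in range(200):
    exposed = rng.random() < 0.5
    diseased_now = rng.random() < 0.2
    diseased_later = diseased_now or rng.random() < (0.3 if exposed else 0.1)
    population.append({"id": person_id, "exposed": exposed,
                       "diseased_now": diseased_now, "diseased_later": diseased_later})

# Option 1: one moment in time, irrespective of disease status.
cross_sectional = rng.sample(population, 40)

# Option 2: start with disease-free individuals only, then observe disease over time.
disease_free = [p for p in population if not p["diseased_now"]]
cohort = rng.sample(disease_free, 40)
new_cases = sum(p["diseased_later"] for p in cohort)

# Option 3: one moment in time, selecting on disease status.
cases = [p for p in population if p["diseased_now"]]
controls = [p for p in population if not p["diseased_now"]]
n_cases = min(20, len(cases))
case_control = rng.sample(cases, n_cases) + rng.sample(controls, n_cases)

print(len(cross_sectional), len(cohort), new_cases, len(case_control))
```

Options 2 and 3 are purposive in the sense used earlier: entry depends on disease status, at baseline for option 2 and at the time of sampling for option 3.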
Farrlandia population
Option 1, Cross-sectional
Option 2, Cohort
Option 3, Case-control
Summary
1. Samples are efficient and can be representative or purposive
2. Representative sample: e.g., simple random sample
3. Sampling variability is quantified by the standard error
4. Purposive sample: selection on exposure or disease status
5. Study designs can be cross-sectional, cohort, or case-control
epidemiologymatters.org