Lecture materials_Son Hee Jung_20130325_1

Report
Cross Sectional Studies
Son Hee Jung
2013/03/25
Type of Epidemiological Studies
Type of study        Alternative name    Unit
Experimental
  RCT                clinical trial      individuals
Observational
  Ecological         correlational       population
  Cross sectional    prevalence          individuals
  Case-control       case-reference      individuals
  Cohort             follow up           individuals
Study Designs & Corresponding Questions
• Cross-sectional: How common is this disease or condition?
• Ecologic: What explains differences between groups?
• Case-control: What factors are associated with having a disease?
• Prospective: How many people will get the disease? What factors predict development?
Contents
• Definition
• Basic approach
• Advantage & disadvantage
• Sampling
• Measures of disease
– Prevalence
• Bias
Cross-sectional study: definition
Cross Sectional Study
Study population → study conducted at one point in time → information collected on exposure to factors and on disease
Cross-sectional study: characteristics
Basic approach
• Include a sample of all persons in a population at
a given time without regard to exposure or
disease status
• Typically exposure and diseases assessed at that
one time
• Exposure subpopulations can be compared with
respect to disease prevalence
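The comparison of exposure subpopulations can be sketched in a few lines. This is only an illustration: the function names and the survey counts are hypothetical and do not come from the lecture.

```python
def prevalence(cases: int, total: int) -> float:
    """Proportion of a group that has the disease at the time of the survey."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cases / total


def prevalence_ratio(exposed_cases: int, exposed_total: int,
                     unexposed_cases: int, unexposed_total: int) -> float:
    """Prevalence in the exposed group divided by prevalence in the unexposed group."""
    return (prevalence(exposed_cases, exposed_total)
            / prevalence(unexposed_cases, unexposed_total))


# Hypothetical survey counts (assumed for illustration only):
# 40 of 200 exposed and 30 of 300 unexposed people have the disease.
pr = prevalence_ratio(40, 200, 30, 300)
print(f"prevalence ratio = {pr:.2f}")
```

A ratio above 1 only says the disease is more common among the exposed at the time of the survey; it says nothing about which came first.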
Basic approach
• For some questions, temporal ordering between
exposure and disease is clear and cross sectional
studies can test hypotheses
– Example: genotype, blood type
• When temporal ordering is not clear, they can be used
to examine relations between exposure and
outcomes descriptively, and to generate
hypotheses
• Can combine a cross sectional study with follow
up to create a cohort study
Basic approach
• Issues with addressing etiology
– Temporal ordering between exposure and
outcome cannot be assured
– Length biased sampling
• Cases with long duration will be over
represented
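Length-biased sampling can be made concrete with a small calculation. The sketch below assumes a steady state with constant incidence, in which the chance that a case is present on the survey date is proportional to how long the case lasts. The function name and the numbers are hypothetical.

```python
def prevalent_case_mix(incident_mix):
    """Convert the mix of new (incident) cases into the mix of prevalent cases.

    incident_mix: list of (duration_years, fraction_of_incident_cases).
    Assumes a steady state, so each case is captured with probability
    proportional to its duration.
    """
    total = sum(duration * fraction for duration, fraction in incident_mix)
    return [(duration, duration * fraction / total)
            for duration, fraction in incident_mix]


# Hypothetical disease: half of new cases last 1 year, half last 9 years.
incident_mix = [(1.0, 0.5), (9.0, 0.5)]
prevalent_mix = prevalent_case_mix(incident_mix)

for duration, share in prevalent_mix:
    print(f"duration {duration:g} y: {share:.0%} of prevalent cases")
```

Although long-duration cases are only half of all new cases in this example, they make up most of the cases a cross-sectional survey would find.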
Cross-Sectional Studies: Advantages
• Inexpensive for common diseases
• Should be able to get a better response rate than
other study designs
• Relatively short study duration
• Can be addressed to specific populations of
interest
Cross-Sectional Studies: Disadvantages
• Unsuitable for rare or short duration diseases
• High refusal rate may make accurate prevalence
estimates impossible
• More expensive and time consuming than case-control studies
• No data on temporal relationship between risk
factors and disease development
Why sample?
Sampling from the source population
Non-probability sampling
• Common convenience sampling methods
– Street surveys
• Use convenient place such as mall, hospital
– Mail-out questionnaires
• Most dangerous
• People who feel very strongly about the issue are more likely to respond -> bias
– Volunteer call
• Selection bias
Non-probability sampling: convenience sampling
• Select a sample through an easy, simple or
inexpensive method
• Problems
– High risk of creating a bias
– May provide misleading information
– Can be accepted, but be careful in assessing such samples and the results they produce
Basic probability sampling
• Simple random sampling
– Each sample of the chosen size has the same
probability of being selected
Basic probability sampling
• Systematic sampling
– Obtain a list of an available population,
ordered according to an unrelated factor
– Pick a number n as step size
– Pick every n-th subject of the list
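Simple random and systematic sampling can be sketched with Python's standard `random` module. The function names are hypothetical, and the random starting position in the systematic sample is an assumption added here; the slide only says to pick every n-th subject.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k subjects so that every sample of size k is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def systematic_sample(population, step, seed=None):
    """Pick every step-th subject from an ordered list.

    The random start within the first `step` positions is an assumption
    of this sketch, not something stated on the slide.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]


# Hypothetical list of 100 subject IDs, ordered by an unrelated factor.
subjects = list(range(100))
srs = simple_random_sample(subjects, 10, seed=1)
sys_sample = systematic_sample(subjects, 10, seed=1)
print(sorted(srs))
print(sys_sample)
```

Both calls return 10 subjects here, but the systematic sample is evenly spaced through the list while the simple random sample is not.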
Stratified random sampling
Cluster random sampling
Multistage sampling
The National Health and Nutrition
Examination Survey (NHANES)
NHANES Interviews & Examinations
NHANES Sample Design
Analyses of NHANES Data
Weighting in NHANES
NHANES base probability of selection
Oversampling
Sample Weights
Why weight?
Probability weight – simple example
Example of weighting
• Imagine 100 males & 100 females in the
sample
• But only 80 males & 75 females respond
• Each male respondent gets a weight of
– 100/80 = 1/(80/100) = 1.25
• Each female respondent gets a weight of
– 100/75 = 1/(75/100) = 1.33
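The weighting example amounts to taking the inverse of each group's response rate. A minimal sketch, with a hypothetical function name, using the numbers from the slide:

```python
def response_weight(sampled: int, responded: int) -> float:
    """Weight for each respondent: the inverse of the group's response rate."""
    if responded <= 0:
        raise ValueError("responded must be positive")
    return sampled / responded


weights = {
    "male": response_weight(100, 80),    # 1 / (80/100)
    "female": response_weight(100, 75),  # 1 / (75/100)
}

for group, weight in weights.items():
    print(f"{group}: weight = {weight:.2f}")

# Multiplying respondents by their weight restores the original group sizes.
restored_males = 80 * weights["male"]
restored_females = 75 * weights["female"]
```

With these weights the 80 male and 75 female respondents each stand in for 100 sampled people.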
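Example: sampling method of the Korea National Health and Nutrition Examination Survey
Multistage sampling
• A method devised to overcome the practical difficulties of simple random sampling
– Used in nationwide opinion polls
– “series” of simple random samples in stages
• Korea National Health and Nutrition Examination Survey
Nation → (random sampling) → Si/Do (province or metropolitan city) → (random sampling) → Si/Gun/Gu (city, county, or district) → (random sampling) → Eup/Myeon/Dong (town, township, or neighborhood)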
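A multistage design is a series of simple random samples, one per stage. The sketch below follows the Si/Do → Si/Gun/Gu → Eup/Myeon/Dong hierarchy of the diagram. The sampling frame, the function name, and the number of units drawn at each stage are all hypothetical and are not the survey's actual design.

```python
import random


def multistage_sample(frame, n_sido, n_sigungu, n_dong, seed=None):
    """Draw a simple random sample at each of three nested stages.

    frame: nested dict {sido: {sigungu: [dong, ...]}} (hypothetical frame).
    Returns a list of (sido, sigungu, dong) tuples.
    """
    rng = random.Random(seed)
    selected = []
    for sido in rng.sample(sorted(frame), n_sido):
        districts = frame[sido]
        for sigungu in rng.sample(sorted(districts), n_sigungu):
            for dong in rng.sample(districts[sigungu], n_dong):
                selected.append((sido, sigungu, dong))
    return selected


# Hypothetical frame: 3 Si/Do, each with 4 Si/Gun/Gu, each with 5 Dong.
frame = {
    f"sido{i}": {
        f"sigungu{i}_{j}": [f"dong{i}_{j}_{k}" for k in range(5)]
        for j in range(4)
    }
    for i in range(3)
}

sample = multistage_sample(frame, n_sido=2, n_sigungu=2, n_dong=3, seed=0)
print(len(sample), "final-stage units selected")
```

Only the units chosen at one stage need to be listed for the next stage, which is what makes the design practical at national scale.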
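Estimating prevalence: applying weights
• Purpose: weights are used so that the Korea National Health and Nutrition Examination Survey sample represents the Korean population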
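A weighted prevalence estimate divides the summed weights of respondents with the disease by the summed weights of all respondents. The following sketch uses hypothetical records and weights; it does not show the survey's actual weighting procedure.

```python
def weighted_prevalence(records):
    """records: list of (has_disease, weight) pairs, one per respondent."""
    total_weight = sum(weight for _, weight in records)
    case_weight = sum(weight for has_disease, weight in records if has_disease)
    return case_weight / total_weight


# Hypothetical sample: an oversampled group (weight 50 each, 2 of 4 diseased)
# and an undersampled group (weight 200 each, 1 of 4 diseased).
records = (
    [(True, 50.0)] * 2 + [(False, 50.0)] * 2
    + [(True, 200.0)] * 1 + [(False, 200.0)] * 3
)

unweighted = sum(1 for has_disease, _ in records if has_disease) / len(records)
weighted = weighted_prevalence(records)
print(f"unweighted = {unweighted:.3f}, weighted = {weighted:.3f}")
```

The unweighted figure overstates prevalence in this example because the oversampled group has more disease; the weights correct for that.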
Direct age adjustment-before
             Population A                               Population B
Age group    Population   No. of deaths   Death rate    Population   No. of deaths   Death rate
                                          per 100,000                                per 100,000
All ages     900,000      862             96            900,000      1,130           126
30-49        500,000      60              12            300,000      30              10
50-69        300,000      396             132           400,000      400             100
70+          100,000      406             406           200,000      700             350
Direct age adjustment-after
Crude rates (all ages)
             Population   No. of deaths   Death rate per 100,000
A            900,000      862             96
B            900,000      1,130           126

Age group    Standard      “A" age-specific   Expected No. of   “B" age-specific   Expected No. of
             population    mortality rate     deaths using      mortality rate     deaths using
                           per 100,000        “A" rates         per 100,000        “B" rates
30-49        800,000       12                 96                10                 80
50-69        700,000       132                924               100                700
70+          300,000       406                1,218             350                1,050
Total        1,800,000                        2,238                                1,830

Age-adjusted rates per 100,000
A: 2,238/1,800,000 × 100,000 = 124.3
B: 1,830/1,800,000 × 100,000 = 101.7
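The direct adjustment can be reproduced from the slide's numbers: apply each population's age-specific rates to the standard population, sum the expected deaths, and divide by the standard total. The function and variable names in this sketch are hypothetical.

```python
# Age groups: 30-49, 50-69, 70+ (figures taken from the tables in the slides)
pop_a, deaths_a = [500_000, 300_000, 100_000], [60, 396, 406]
pop_b, deaths_b = [300_000, 400_000, 200_000], [30, 400, 700]
standard = [800_000, 700_000, 300_000]


def crude_rate(deaths, pops):
    """Overall deaths per 100,000, ignoring age structure."""
    return sum(deaths) / sum(pops) * 100_000


def direct_adjusted_rate(deaths, pops, standard_pops):
    """Deaths per 100,000 expected if the standard population had these age-specific rates."""
    expected = sum(d / p * s for d, p, s in zip(deaths, pops, standard_pops))
    return expected / sum(standard_pops) * 100_000


crude_a, crude_b = crude_rate(deaths_a, pop_a), crude_rate(deaths_b, pop_b)
adj_a = direct_adjusted_rate(deaths_a, pop_a, standard)
adj_b = direct_adjusted_rate(deaths_b, pop_b, standard)

print(f"crude:    A = {crude_a:.1f}, B = {crude_b:.1f}")
print(f"adjusted: A = {adj_a:.1f}, B = {adj_b:.1f}")
```

The crude rates make B look worse, while the adjusted rates make A look worse, because B has a larger share of people in the oldest age group.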
Indirect age adjustment (Standardized
Mortality Ratio)
• Used when
– the number of deaths in each age-specific stratum is not
available
– studying mortality in an occupationally exposed
population
• Defined as
SMR = (Observed number of deaths per year / Expected number of deaths per year) × 100
• SMR of 100
– The observed number of deaths is the same as the expected number of
deaths
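An SMR calculation needs the expected deaths, obtained by applying a standard population's age-specific rates to the study population's age structure. The sketch below uses hypothetical rates, population sizes, and an observed count; only the SMR formula itself comes from the slide.

```python
def expected_deaths(standard_rates_per_100k, study_population):
    """Deaths expected if the study population had the standard age-specific rates."""
    return sum(rate / 100_000 * n
               for rate, n in zip(standard_rates_per_100k, study_population))


def smr(observed: float, expected: float) -> float:
    """Standardized mortality ratio, expressed so that 100 means observed = expected."""
    return observed / expected * 100


# Hypothetical occupational cohort in three age groups (assumed numbers).
standard_rates = [10, 100, 350]       # deaths per 100,000 per year
workers = [20_000, 10_000, 2_000]     # cohort size in each age group
observed = 25                         # total deaths observed in one year

expected = expected_deaths(standard_rates, workers)
print(f"expected = {expected:.1f}, SMR = {smr(observed, expected):.1f}")
```

Only the cohort's total death count is needed, which is why this method works when age-specific death counts are unavailable.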
Sampling, Inference, and generalization
If you tell the truth you don't have to remember anything. by Mark Twain 1894
Why do we measure disease prevalence?
Measuring burden: prevalence
Prevalence
Person-time at risk: exposed and unexposed
Censored individuals
Censoring
Measuring of prevalence
Point and period prevalence: example
Point prevalence at several time points
Period prevalence
Lifetime prevalence
Life time prevalence 4/5
Prevalence of diabetes
Utility of prevalence
Sloppy use of risk
Sloppy use of rate
Classic example of rate that is not a rate
Case fatality (rate?)
Proportional mortality (rate?)
Total deaths, United States, 2004
Deaths, U.S. 2004, ages 20-24 years
What's a possible inferential problem with
proportional mortality?
Measuring risk: cumulative incidence
Cumulative incidence is a proportion
Calculating the cumulative incidence
Odds
Odds and probabilities
• The higher the incidence, the larger
the discrepancy between odds and probability.
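The gap between odds and probability can be checked numerically using the standard definition odds = p / (1 - p). The probabilities below are arbitrary illustrative values.

```python
def odds(p: float) -> float:
    """Convert a probability (0 <= p < 1) to odds."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


# Arbitrary probabilities chosen for illustration.
for p in (0.01, 0.1, 0.4):
    print(f"p = {p:.2f}  odds = {odds(p):.3f}  gap = {odds(p) - p:.3f}")
```

For a rare outcome the odds are almost equal to the probability, but the two diverge quickly as the outcome becomes common.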
Prevalence, Incidence, disease duration
Disease prevalence depends on
Incidence rates can be calculated for each
transition in health status
Relationship among prevalence, incidence
rate, disease duration at steady state
Mean duration of disease
What does steady state mean in the context
of estimating P from I and D?
Example varying prevalence, incidence rates
and duration of disease
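At steady state, the relation given later in these slides, Prev = incidence × duration × (1 - prev), can be solved for prevalence. The sketch below does that and compares it with the rare-disease approximation Prev ≈ incidence × duration. The function name and the input values are hypothetical.

```python
def steady_state_prevalence(incidence_rate: float, mean_duration: float) -> float:
    """Solve P = I * D * (1 - P) for P, assuming a steady state.

    incidence_rate and mean_duration must use the same time unit
    (for example cases per person-year and years).
    """
    x = incidence_rate * mean_duration
    return x / (1 + x)


# Hypothetical values: 0.01 cases per person-year, lasting 10 years on average.
incidence, duration = 0.01, 10.0
p = steady_state_prevalence(incidence, duration)
print(f"exact steady-state prevalence = {p:.4f}")
print(f"rare-disease approximation    = {incidence * duration:.4f}")
```

The approximation is close only while prevalence is low; the same incidence with a much longer duration would widen the gap.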
Cross-sectional Bias
• Incidence-Prevalence bias
– Type of selection bias
– If exposed cases have a different duration than non-exposed
prevalent cases, the prevalence ratio will be biased
– E.g., cases with severe emphysema are more likely to smoke and have
higher fatality than cases with less severe emphysema, so the
prevalence of emphysema in smokers will be underestimated
compared with incidence
– Solution: use incident cases
– Duration ratio bias
– Point prevalence complement ratio bias
• Temporal bias
– Information bias
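Incidence-Prevalence bias
• Relationship between PR and IRR
– Prev = incidence × duration × (1 - prev)
– Dividing the exposed group (1) by the unexposed group (0):
PR = IRR × (D1/D0) × [(1 - Prev1)/(1 - Prev0)]
* D1/D0 different from 1: duration ratio bias
* (1 - Prev1)/(1 - Prev0) different from 1: point prevalence complement
ratio bias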
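Duration ratio bias
• Type of selection bias
• For a rare disease, no bias arises if disease duration is the same
regardless of exposure status
• Arises when disease duration differs by exposure status
• For chronic diseases, disease duration is tied to survival time,
so the bias that arises in this case is called
survival bias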
Point prevalence complement ratio bias
• If disease duration is the same, the PR tends to underestimate
the IRR
• Prevalence in the exposed group: 0.04, in the unexposed group: 0.01
PR : 4
Point prevalence complement ratio=0.96/0.99=0.97
• Prevalence in the exposed group: 0.4, in the unexposed group: 0.1
PR : 4
Point prevalence complement ratio=0.6/0.9=0.67
• The larger the PR and the prevalence → the larger the bias
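The two numeric examples can be reproduced in code. Assuming equal disease duration in both groups, Prev = incidence × duration × (1 - prev) applied to each group gives PR = IRR × (1 - P1)/(1 - P0), so the IRR implied by a given PR can be recovered. The function names in this sketch are hypothetical.

```python
def complement_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Point prevalence complement ratio: (1 - P1) / (1 - P0)."""
    return (1 - p_exposed) / (1 - p_unexposed)


def implied_irr(p_exposed: float, p_unexposed: float) -> float:
    """IRR implied by the prevalences, assuming equal disease duration in both groups."""
    pr = p_exposed / p_unexposed
    return pr / complement_ratio(p_exposed, p_unexposed)


for p1, p0 in ((0.04, 0.01), (0.4, 0.1)):
    print(f"P1 = {p1}, P0 = {p0}: PR = {p1 / p0:.1f}, "
          f"complement ratio = {complement_ratio(p1, p0):.2f}, "
          f"implied IRR = {implied_irr(p1, p0):.2f}")
```

Both examples have a PR of 4, but the implied IRR drifts further from 4 in the high-prevalence case, which is the bias the slide describes.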
Selection bias -- Berkson’s bias
• Admission-rate bias
• Cases and/or controls are selected from hospitals
• Results from differential rates of hospital admission for cases and
controls
• If hospital-based cases and controls have different exposures than
population-based ones, the OR will be biased.
• E.g., if hospital controls are less likely to be exposed, the OR will be
overestimated.
• E.g., case-control study of pancreatic cancer and coffee drinking: controls
were selected from GI patients. However, GI patients are less likely
to drink coffee than the general population, so the OR was artificially increased.
• Solution: use population-based controls, or controls with a disease not
related to the exposure
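Temporal bias
• The temporal sequence is ambiguous
– A critical weakness when testing risk factors for disease
– Example: a study of nutritional deficiency and depression
– Exposures that do not change over time are not subject to
this limitation, e.g., genetic factors
• Studies in which the temporal sequence is reversed are not recommended
– Example: hypothesis) effect of dietary factors on age at menarche
subjects) middle-aged women surveyed on age at menarche and recent
dietary habits
• Minimize the weakness by analyzing only incident cases among all
prevalent cases → another bias?
• Minimize the weakness by using historical information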
• Length-time bias: screening is most likely to pick up less aggressive
cancers, because they have a longer interval of being
visible on scans while remaining asymptomatic
• Lead-time bias: you find out something earlier but don’t actually
change the outcome, so survival measured from diagnosis
appears longer even though the patient does not live any
longer
Simpson’s paradox
[Figure: aggregated vs. disaggregated data]
• Aggregated and disaggregated data tell two different stories
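Treatment                    No. of patients   Success   Failure   Success rate (%)
Total (n=700)
  Open surgery               350               273       77        78
  Percutaneous surgery       350               289       61        83
Stone size < 2cm (n=357)
  Open surgery               87                81        6         93
  Percutaneous surgery       270               234       36        87
Stone size ≥ 2cm (n=343)
  Open surgery               263               192       71        73
  Percutaneous surgery       80                55        25        69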
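The reversal in the treatment table can be verified by computing success rates per stone-size stratum and then for the pooled data. The counts are taken from the table; the function and variable names are hypothetical.

```python
# (successes, patients) by stone size and treatment, from the table
data = {
    "stone < 2 cm": {"open": (81, 87), "percutaneous": (234, 270)},
    "stone >= 2 cm": {"open": (192, 263), "percutaneous": (55, 80)},
}


def success_rate(successes: int, patients: int) -> float:
    """Success rate in percent."""
    return successes / patients * 100


def aggregate(strata, treatment):
    """Pool successes and patients for one treatment across all strata."""
    successes = sum(stratum[treatment][0] for stratum in strata.values())
    patients = sum(stratum[treatment][1] for stratum in strata.values())
    return successes, patients


for name, stratum in data.items():
    rates = {t: success_rate(*counts) for t, counts in stratum.items()}
    print(name, {t: round(r) for t, r in rates.items()})

pooled = {t: success_rate(*aggregate(data, t)) for t in ("open", "percutaneous")}
print("all patients", {t: round(r) for t, r in pooled.items()})
```

Open surgery has the higher success rate within each stone-size group, yet the lower rate overall, because it was used mostly on the larger stones, which are harder to treat.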
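Summary of cross-sectional studies
• Sample survey at a specific point in time or over a short period: a “snapshot”
• Advantages
– Convenient and cost-effective
– Can study multiple exposures and diseases
– Can generate hypotheses
– Represents the general population
• Disadvantages
– Temporal sequence is ambiguous
– Only survivors are studied, so bias is possible
– Diseases of short duration are underestimated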
Any questions?
If you tell the truth you don't have to remember anything.
by Mark Twain 1894
