### A risk factor is a variable

```
Introduction to Risk Factors & Measures of Effect
Meg McCarron, CDC

Introduction to Risk Analysis
What is a risk analysis?
• The analysis of an association between a variable (e.g. underlying condition) and an outcome (e.g. death)
• Why do risk analysis?
  • The probability of an outcome often depends on the interplay between a variety of factors
  • Follow up on suggested associations observed in descriptive analysis (e.g. the elderly appear to die more frequently than healthy young adults; a risk analysis might tell you whether or not that is a true observation)
  • Determine the severity of risk
  • Identify significant risk factors
• Using this type of analysis we can measure risk ratio (RR), odds ratio (OR)
What is a risk factor?
• A risk factor is a factor that is associated with an increased chance of getting a disease.
• In epidemiological terms: a risk factor is a variable (determinant) associated with an increased risk of disease or infection (outcome).
  • Example: obesity (determinant/exposure) is associated with increased risk of heart attack (outcome)
• When we measure risk factors we assess
  • Strength
  • Direction
  • Shape
Risk factors in SARI surveillance
• Information about a number of potential risk factors and outcomes is often recorded
  • e.g. Outcomes: death, influenza status
  • Risk factors: age, co-morbid conditions
• Surveillance data can be analyzed to increase the understanding of the association of risk factors with severe outcomes
• Surveillance data describing exposures allow analysis of associations without expensive in-depth studies
Is a risk factor the cause of a disease?
• Risk factors are correlational and not necessarily causal
  • Correlation does not imply causation
  • The statistical methods used do not consider the direction of effects
• For an effect to be causal the exposure must have occurred before the outcome
• e.g. young age does not cause measles (Morbillivirus causes measles), but young people are at greater risk because they are less likely to have developed immunity due to previous exposure or vaccination
The Correlation-Causation Problem
Somalia has many pirates,
but low carbon emissions
How are risk factors/disease determinants identified?
• Individual-level data
• Two key variables
  • Outcome: e.g. influenza
  • Exposure: e.g. vaccination
• Should consider multiple risk factors
• Epidemiological study designs used to identify risk factors
  • Case-control
  • Cohort
  • Surveillance data may approximate a cohort study
• Biological plausibility
  • e.g. age and influenza infection
• Exposure (risk factor) must occur prior to outcome (disease)
Types of variables
• Continuous
  • E.g. age
• Categorical variables
  • Binary
    • E.g. gender, vaccination status
  • Ordinal
    • E.g. age group, socioeconomic status (SES)
  • Nominal/Categorical
    • E.g. geographic region
• Count
  • E.g. number of ILI symptoms
How are risk factors/disease
determinants identified?
Clinical and epidemiological comparison of hospitalized SARI patients with and without laboratory-confirmed influenza, week 40/20xx to (current week)/20xx, Country X (NOTE: Numbers in table are not real and for example only)

Column A: Percent of influenza-negative SARI hospitalizations with selected demographic and epidemiological characteristics
Column B: Percent of SARI hospitalizations confirmed as influenza with selected demographic and epidemiological characteristics

Characteristics | A (influenza-negative) | B (influenza-confirmed)
Sex | Information available for N = 100 | Information available for N = 50
  Male | 54/100 (54%) | 27/50 (54%)
  Female | 46/100 (46%) | 23/50 (46%)
  Sex unknown | 0 | 0
Chronic Medical Illnesses | Information available for N = 98 | Information available for N = 48
  Number of cases with at least one of the chronic medical illnesses listed below* | 30/98 (31%) | 28/48 (58%)
  Chronic respiratory disease | 15/98 (15%) | 20/48 (42%)
  Asthma | 15/98 (15%) | 10/48 (21%)
  Diabetes | 11/98 (11%) | 11/48 (23%)
  Chronic cardiac disease | 5/98 (5%) | 5/48 (10%)
  Chronic renal disease | 3/98 (3%) | 3/48 (6%)
  Chronic liver disease | 4/98 (4%) | 4/48 (8%)
  Chronic neurological impairment | 7/98 (7%) | 6/48 (13%)
  Immune-compromised | 0/98 (0%) | 1/48 (2%)
  Number of cases without any of the above chronic medical illnesses | 68/98 (69%) | 20/48 (42%)
  Unknown if risk factors present | N=2 | N=2
Pregnancy status | Information available for N = 50 women | Information available for N = 23 women
  Pregnancy in any trimester | 11/50 (22%) | 8/23 (35%)
  Not-pregnant | 39/50 (78%) | 15/23 (65%)
  Pregnancy status unknown | N=0 | N=0
Obesity (or other conditions as determined by national priorities) | Information available for N = 90 | Information available for N = 35
  Obese (BMI>30 or judged obese clinically) | 25/90 (28%) | 15/35 (42%)
  Not obese (BMI<30 or not clinically judged obese) | 65/90 (72%) | 20/35 (58%)
  Obesity status unknown | 10 | 15
Age-groups (years) | Information available for N = 100 | Information available for N = 48
  0-1 | 40/100 (40%) | 10/48 (21%)
  2-4 | 25/100 (25%) | 8/48 (17%)
  5-14 | 10/100 (10%) | 10/48 (21%)
  15-29 | 5/100 (5%) | 11/48 (23%)
  30-64 | 5/100 (5%) | 8/48 (16%)
  65+ | 15/100 (15%) | 1/48 (2%)
  Age unknown | N=0 | N=2
Vaccination Status | Information available for N = 98 | Information available for N = 40
  Received monovalent or trivalent vaccine during the current influenza season | 40/98 (41%) | 2/40 (5%)
  Did not receive monovalent or trivalent vaccine during the current influenza season | 58/98 (59%) | 38/40 (95%)
  Vaccination status unknown | N=2 | N=10
Oseltamivir/zanamivir (Tamiflu/Relenza) Use | Information available for N = 100 | Information available for N = 44
  Received oseltamivir/zanamivir within 48 hours of symptom onset | 10/100 (10%) | 8/44 (18%)
  Did not receive oseltamivir/zanamivir within 48 hours of symptom onset | 90/100 (90%) | 36/44 (82%)
  Oseltamivir use unknown | N=0 | N=6
Median days from symptom onset to hospital admission | 4.0 days | 4.5 days
How are risk factors/disease
determinants identified? (… continue
…)
Clinical and epidemiological description of hospitalized SARI patients with laboratory-confirmed influenza, by outcome status, year x to year y, Country X (NOTE: Numbers in table are not real and for example only)

Hospitalised SARI cases with laboratory-confirmed influenza
Column A: Percent of hospitalized (non-ICU/non-severe) cases with selected demographic and epidemiological characteristics
Column B: Percent of severe (severe outcome/or died) cases with selected demographic and epidemiological characteristics

Characteristics | A (non-ICU/non-severe) | B (severe outcome or died)
Sex | Information available for N = 100 | Information available for N = 30
  Male | 54/100 (54%) | 15/30 (50%)
  Female | 46/100 (46%) | 15/30 (50%)
  Sex unknown | 0 | 0
Chronic Medical Illnesses | Information available for N = 98 | Information available for N = 28
  Number of cases with at least one of the chronic medical illnesses listed below* | 30/98 (31%) | 19/28 (58%)
  Chronic respiratory disease | 25/98 (25%) | 20/28 (71%)
  Asthma | 15/98 (15%) | 4/28 (14%)
  Diabetes | 11/98 (11%) | 54/28 (23%)
  Chronic cardiac disease | 5/98 (5%) | 5/28 (18%)
  Chronic renal disease | 3/98 (3%) | 3/28 (11%)
  Chronic liver disease | 0/98 (0%) | 4/28 (14%)
  Chronic neurological impairment | 3/98 (3%) | 7/28 (25%)
  Immune-compromised | 0/98 (0%) | 1/28 (4%)
  Number of cases without any of the above chronic medical illnesses | 68/98 (69%) | 9/28 (42%)
  Unknown if risk factors present | N=2 | N=2
Pregnancy status | Information available for N = 50 women | Information available for N = 15 women
  Pregnancy in any trimester | 11/50 (22%) | 10/15 (67%)
  Not-pregnant | 39/50 (78%) | 5/15 (33%)
  Pregnancy status unknown | N=0 | N=0
Obesity (or other conditions as determined by national priorities) | Information available for N = 90 | Information available for N = 28
  Obese (BMI>30 or judged obese clinically) | 23/90 (26%) | 19/28 (68%)
  Not obese (BMI<30 or not clinically judged obese) | 66/90 (73%) | 9/28 (32%)
  Obesity status unknown | 10 | 2
Age-groups (years) | Information available for N = 100 | Information available for N = 30
  0-1 | 35/100 (35%) | 5/30 (17%)
  2-4 | 30/100 (30%) | 2/30 (6%)
  5-14 | 10/100 (10%) | 5/30 (17%)
  15-29 | 4/100 (4%) | 3/30 (10%)
  30-64 | 6/100 (6%) | 10/30 (33%)
  65+ | 15/100 (15%) | 5/30 (17%)
  Age unknown | N=0 | N=0
Vaccination Status | Information available for N = 98 | Information available for N = 30
  Received monovalent or trivalent vaccine during the current influenza season | 20/98 (20%) | 2/30 (7%)
  Did not receive monovalent or trivalent vaccine during the current influenza season | 78/98 (80%) | 28/30 (93%)
  Vaccination status unknown | N=2 | N=0
Oseltamivir/zanamivir (Tamiflu/Relenza) Use | Information available for N = 100 | Information available for N = 27
  Received oseltamivir/zanamivir within 48 hours of symptom onset | 15/100 (15%) | 2/27 (7%)
  Did not receive oseltamivir/zanamivir within 48 hours of symptom onset | 85/100 (85%) | 25/27 (93%)
  Oseltamivir use unknown | N=0 | N=3
Median days from symptom onset to hospital admission | 3.5 days | 7.5 days
Cohort study
• Collect data on their exposures (risks)
• Monitor their outcomes
• Compare risk of disease among exposed versus unexposed
[Figure: participants 1 to 6 plotted against time (0 to 4), with "D" marks on some participant timelines]
Example: cohort study
• e.g. Risk of death among SARI admissions
  • Outcome: death
  • Risk factors: age, underlying conditions, influenza-positive
  • Source population: all patients admitted with SARI, followed until death or discharge
Case control study
• Cases: people with disease
  • Deliberately over-selected
• Controls: people without disease
  • Represent exposure distribution of the source population
• Find out their exposure status
• Compare odds of exposure among diseased and non-diseased
[Figure: participants 1 to 6 plotted against time, with "E" and "D" marks on participant timelines]
Example: case-control study
• Risk of influenza among vaccinated patients
  • Cases: people with influenza
  • Controls: people without influenza
  • Outcome: influenza status
  • Risk factors: vaccination status, age, underlying comorbidity
Statistical significance: is the association due to chance alone?
• A statistical test is used to assess if an association may be due to chance alone (random error)
• In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level (e.g. α = 0.05).
Common statistical tests
• Categorical data:
  • Chi-square (χ²) test
  • Fisher’s exact test
  • McNemar’s test
• Continuous data:
  • T-test
  • Wilcoxon rank-sum test
  • ANOVA
• These tests can tell if there’s a difference between groups but do not convey the size or direction of effects
Common measures of association / effect
• Measure the size of an association (effect)
• Compare some measure of disease in exposed versus unexposed
• Absolute difference
  • Y1 - Y2
  • Risk difference
• Relative difference (ratio)
  • Y1/Y2
  • Odds ratio
  • Risk ratio
  • Incidence rate ratio
  • Hazard ratio (survival data)
• Attributable risk
Odds ratios
• Most common measure of association used in epidemiology
• Binary outcome
• Odds ratio (OR): compares the odds of exposure among cases (people with disease) with the odds of exposure among controls (people without disease)
• Odds: ratio of the probability (p) of an event occurring versus it not occurring
  • Odds = p/(1-p)

Calculation of the RR & OR
            Cases   Controls
Exposed     a       b
Unexposed   c       d

OR = (a/c) / (b/d)
OR = 1: no association
OR < 1: negative association (reduces risk)
OR > 1: positive association (increases risk)
Example of OR Calculations
Calculation of the RR & OR: OR = (a/c) / (b/d) = (a*d) / (b*c)

Outcome (influenza patients that died)
          Died      Alive
Flu+      200 (a)   150 (b)
Flu-      50 (c)    100 (d)
OR = (200/50)/(150/100) = 2.7

Outcome (influenza patients that died)
          Died      Alive
Female    200 (a)   180 (b)
Male      98 (c)    100 (d)
OR = (200*100)/(180*98) = 1.1
Confidence intervals
• OR is a point estimate
• Confidence interval (CI) is a measure of the uncertainty of the estimate
• CI is based on the standard error (SE)
  • Smaller SE = narrower confidence interval
• If CI includes 1, then not statistically significant
  • A wide CI is also a problem
• Usually use 95%CI

            Cases   Controls
Exposed     a       b
Unexposed   c       d

SE of ln(OR) = √(1/a + 1/b + 1/c + 1/d)
95%CI = e^(ln(OR) ± 1.96 * SE)

• OR=1.1
• 95%CI=1.01,1.4
Confidence intervals
e.g. 2007 Victorian surveillance data, adults, influenza B

               Flu+      Flu-
Vaccinated     44 (a)    95 (b)
Unvaccinated   205 (c)   260 (d)

OR     = (44/205) / (95/260)
       = 0.59
ln(OR) = ln(0.59)
       = -0.53
SE     = √(1/44 + 1/95 + 1/205 + 1/260)
       = 0.20
95%CI (carrying more decimals: OR = 0.5874, ln(OR) = -0.532, SE = 0.205)
  UL   = e^(-0.532 + 1.96*0.205) = e^(-0.130) = 0.88
  LL   = e^(-0.532 - 1.96*0.205) = e^(-0.934) = 0.39
Interpreting Results
• Size of the CI is an indicator of uncertainty
  • Wide CI = more uncertainty
  • Narrow CI = less uncertainty
• If CI includes 1, then not statistically significant
  • The observed effect could just be due to chance
• P-values are often used to convey statistical significance
  • The p-value for an OR is calculated from a chi-squared test
  • The p-value reference for a 95%CI is 5% or 0.05
P-values
• The p-values help us to determine whether the difference between the two groups might be due to random variation
• CI and p-values
  • 95%CI=1.0, 2.3 indicates that the two-sided p-value for no association is approximately 0.05
  • 95%CI=0.9, 2.4 suggests p>0.05
  • 95%CI=0.9, 2.4 indicates that the data are compatible with a two-fold higher risk (i.e. the interval includes 2)
• The p-value is a measure of the compatibility of the data and the null hypothesis
Implementation of a statistical test
• State the relevant null hypothesis (H0)
  • No effect (any observed effect is due to chance)
• State the alternative hypothesis (HA)
  • An effect exists
• Decide which test is appropriate (see earlier list)
• Compute the test statistic and the associated p-value (probability value)
• Compare the computed p-value to a reference p-value (usually 0.05) to decide whether to reject the null hypothesis
  • If the p-value of the test is lower than the reference value, H0 is rejected
  • The effect is not likely to be due to chance
Example: Implementation of a statistical test
• Influenza prevalence in hospitalized patients:
  • Non-pregnant women: 100/1000 = 10%
  • Pregnant women: 30/200 = 15%
• Question:
  • Is the influenza prevalence in hospitalized pregnant women different from that in non-pregnant women?
• Hypothesis
  • H0: p1 = p2 ; p1 - p2 = 0
  • HA: p1 ≠ p2 ; p1 - p2 ≠ 0
  • Reject H0 if p (test) is < α = 0.05
• Test results:
  • Z (test statistic): 2.08
  • p value: 0.038
  • 0.038 < 0.05 → Reject H0
Example: factors associated with influenza-positive diagnosis among ILI patients

Factor                 OR     p-value   95% CI lower limit   95% CI upper limit
Vaccinated             0.54   0.02      0.32                 0.89
Underlying condition   1.20   0.47      0.72                 2.00
Epi week               1.04   0.01      1.01                 1.08
Age group
  <20                  ref
  20-64                0.76   0.17      0.51                 1.13
  65+                  1.09   0.85      0.45                 2.62

Vaccinated: OR=0.54 (95%CI=0.32,0.89); crude OR=0.59 (95%CI=0.39,0.88)
Summary
• A risk factor is a variable associated with an increased (or decreased) risk of an outcome
• We can assess the influence of risk factors using individual-level data from case-control and cohort studies
• The size of the effect can be measured by effect measures
  • Most common effect measure is the odds ratio
• The uncertainty of the effect can be measured by the confidence interval
• Whether an effect could be due to random error is tested using a statistical test and indicated by the p-value
• Multivariable methods can tell us how much influence one risk factor has compared with others
```
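
The odds ratio, its confidence interval, and the hypothesis-test example from the slides can be reproduced with a few lines of Python. This is a minimal sketch rather than the presenter's own code: the function names `odds_ratio_ci` and `two_proportion_z_test` are hypothetical, the interval is the log-scale (Wald) interval built from the SE formula on the slides, and the pooled two-proportion z-test is an assumption, since the slides do not say which test was used for the pregnancy example.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and log-scale (Wald) confidence interval for a 2x2 table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    z = 1.96 gives a 95% interval.
    """
    odds_ratio = (a / c) / (b / d)
    # Standard error of ln(OR), as on the confidence-interval slide
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return odds_ratio, lower, upper


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled proportion (assumed test)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # 2007 Victorian surveillance example: vaccinated vs unvaccinated, Flu+ vs Flu-
    or_, lo, hi = odds_ratio_ci(44, 95, 205, 260)
    print(f"OR = {or_:.2f}, 95% CI = {lo:.2f} to {hi:.2f}")

    # Pregnancy example: 30/200 pregnant vs 100/1000 non-pregnant women with influenza
    z, p = two_proportion_z_test(30, 200, 100, 1000)
    print(f"z = {z:.2f}, p = {p:.3f}")
```

With the slide data, the first call gives an OR of about 0.59 with a 95% CI of roughly 0.39 to 0.88, which excludes 1. Under the pooled-test assumption, the second call gives z of about 2.08 and a two-sided p-value just under 0.04, so H0 would be rejected at α = 0.05.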