### Application of the Empirical Bayes Method

```
The Empirical Bayes Method for Safety Estimation
Doug Harwood
MRIGlobal
Kansas City, MO
Key Reference
Hauer, E., D.W. Harwood, F.M. Council, M.S. Griffith, “The Empirical Bayes method for
estimating safety: A tutorial.” Transportation Research Record 1784, pp. 126-131.
National Academies Press, Washington, D.C., 2002.
http://www.ctre.iastate.edu/educweb/CE552/docs/Bayes_tutor_hauer.pdf
The Problem

You are a safety engineer for a highway
agency. The agency plans next year to
implement a countermeasure that will
reduce crashes by 35% over the next
three years. To estimate the benefits
of this countermeasure, what safety
measure will you multiply by 0.35?
What Do We Need To Know?
- You need to know – or, rather, estimate – what would be expected to happen in the future if no action is taken
- Then, you can apply crash modification factors (CMFs) for the known effects of planned actions to estimate their effects quantitatively

Common Approach: Use Last 3 Years of Crash Data

Year              2008  2009  2010
Observed crashes    30    19    21
More Data Gives a Different Result

Year              2001  2002  2003  2004  2005  2006  2007  2008  2009  2010
Observed crashes    22    23    16    16     9    14    17    30    19    21
RTM Example with Average Observed Crashes
[Figure: crashes (0 to 7) versus year (1993 to 2005), with the 3-year average (Xa), the long-term average (m), and the random error labeled]
“True Safety Impact of a Measure”
[Figure: crashes (0 to 7) versus year (1993 to 2007), with the 3-year ‘before’ average (Xa), the long-term average (m), the 3-year ‘after’ average, the observed safety effect, and the true safety effect labeled]
Regression to the mean problem …
- High crash locations are chosen for one reason (high number of crashes!) – might be truly high or might be just random variation
- Even with no treatment, we would expect, on average, for this high crash frequency to decrease
- This needs to be accounted for, but is often not, e.g., reporting crash reductions after treatment by comparing before and after frequencies over short periods

The “imprecision” problem …
- Assume 100 crashes per year and 3 years of data: we can reliably estimate the number of crashes per year, with a (Poisson) standard deviation of about … or 5.7% of the mean
- However, if there are relatively few crashes per time period (say, 1 crash per 10 years), the estimate varies greatly … or 180% of the mean!
Things change…
BEWARE of assuming that everything will remain the same ….
- Future conditions will not be identical to past conditions
- Most especially, traffic volumes will likely change
- Past trends can help forecast future volume changes

Focus on Crash Frequency vs. AADT Relationships: Use of Crash Rates May Be Misleading
[Figure: crash frequency (0 to 30) versus AADT (0 to 30,000), with Before and After series and points labeled F1, F2, F3, R1, C1, C2, E1, E2]
The Empirical Bayes Approach

- Empirical Bayes: an approach to estimating what crashes will occur in the future if no countermeasure is implemented (or what would have happened if no countermeasure had been implemented)
- Simply assuming that what occurred in a recent short-term “before period” will happen again in the future is naïve and potentially very inaccurate
- Yet, this assumption has been the norm for many years
The Empirical Bayes Approach
- The observed crash history for the site being analyzed is one useful and important piece of information
- What other information do we have available?

The Empirical Bayes Approach
- We know the short-term crash history for the site
- The long-term average crash history for that site would be even better, BUT…
  - Long-term crash records may not be available
  - If the average crash frequency is low, even the long-term average crash frequency may be imprecise
  - Geometrics, traffic control, lane use, and other site conditions change over time
- We can get the crash history for other similar sites, referred to as a REFERENCE GROUP
Empirical Bayes
- Increases precision
- Reduces RTM bias
- Uses information from the site, plus …
- Information from other, similar sites

Safety Performance Functions
SPF = Mathematical relationship between crash frequency per unit of time (and road length) and traffic volumes (AADT)
[Figure: crash frequency (0 to 30) versus AADT (0 to 30,000)]
How Are SPFs Derived?
- SPFs are developed using negative binomial regression analysis
- SPFs are based on several years of crash data
- SPFs are specific to a given reference group of sites and severity level
  - Different road types = different SPFs
  - Different severity levels = different SPFs
The overdispersion parameter
- The negative binomial is a generalized Poisson where the variance is larger than the mean (overdispersed)
- The “standard deviation-type” parameter of the negative binomial is the overdispersion parameter φ
- variance = η[1 + η/(φL)]
- Where …
  - μ = average crashes/km-yr (or /yr for intersections)
  - η = μYL (or μY for intersections) = number of crashes/time
  - φ = estimated by the regression (units must be complementary with L; for intersections, L is taken as one)
SPF Example
Regression model for total crashes at rural 4-leg intersections with minor-road STOP control
Np = exp(-8.69 + 0.65 ln(ADT1) + 0.47 ln(ADT2))
where:
Np = Predicted number of intersection-related crashes per year within 250 ft of intersection
ADT1 = Major-road traffic flow (veh/day)
ADT2 = Minor-road traffic flow (veh/day)
Calculating the Long-Term Average Expected Crash Frequency
The estimate of expected crash frequency:
Ne = w (Np) + (1 - w) (No)
where:
Ne = Expected accident frequency
Np = Predicted accident frequency
No = Observed accident frequency
Weight (w; 0<w<1) is calculated from the overdispersion parameter
Weight (w) Used in EB Computations
w = 1 / (1 + k Np)
w = weight
k = overdispersion parameter for the SPF
Np = predicted accident frequency for site
Graphical Representation of the EB Method
Predicting Future Safety Levels from Past Safety Performance
Ne(future) = Ne(past) x (Np(future) / Np(past))
Ne = expected accident frequency
Np = predicted accident frequency
Predicting Future Safety Levels from Past Safety Performance

The Np(future)/Np(past) ratio can reflect changes in:
- Traffic volume
- Countermeasures (based on CMFs)
CMFs—How to Use Them

CMFs are expressed as a decimal factor:
- CMF of 0.80 indicates a 20% crash reduction
- CMF of 1.20 indicates a 20% crash increase
CMFs—How to Use Them

Expected crash frequencies and CMFs can be multiplied together:
Ne(with) = Ne(without) x CMF
Crashes Reduced = Ne(without) - Ne(with)
CMFs—Single Factor

CMF for shoulder rumble strips
- Rural freeways (CMFTOT = 0.79)
Ne(with) = Ne(without) x 0.79
CMF Functions
[Figure: CMFs for Lane Width (two-lane rural roads) (Harwood et al., 2000)]
CMFs for Combined Countermeasures

CMFs can be multiplied together if their effects are independent:
Ne(with) = Ne(without) x CMF1 x CMF2
Are countermeasure effects independent?
EB applications
- HSM
- IHSDM
- Safety Analyst

EB applications
HSM Part C
- Estimate long-term expected crash frequency for a location under current conditions
- Estimate long-term expected crash frequency for a location under future conditions
- Estimate long-term expected crash frequency for a location under future conditions with one or more countermeasures in place
HSM Part B
- Evaluate countermeasure effectiveness using before and after data
EB applications
Site-Specific EB Method
- Based on equations in this presentation
Project-Level EB Method
- If project is made up of components with different SPFs, then there is no single value of k, the overdispersion parameter
EB Before-After Effectiveness Evaluation
- See Chapter 9 in HSM Part B
Questions?
```
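
The site-specific EB steps in the slides (SPF prediction, weight, expected frequency, projection to the future, and CMF application) can be chained together as in the minimal Python sketch below. It is an illustration, not the presenter's implementation. The SPF coefficients and the 35% reduction (CMF = 0.65) come from the slides; the traffic volumes, the observed crash count, and the value of `k` are hypothetical. The sketch also assumes that `Np` and `No` are both totals over the same 3-year before period, so the per-year SPF prediction is multiplied by the number of years.

```python
import math


def predicted_crashes_per_year(adt_major, adt_minor):
    """SPF from the slides: rural 4-leg intersection, minor-road STOP control."""
    return math.exp(-8.69 + 0.65 * math.log(adt_major) + 0.47 * math.log(adt_minor))


def eb_weight(k, n_pred):
    """w = 1 / (1 + k * Np)."""
    return 1.0 / (1.0 + k * n_pred)


def eb_expected(n_pred, n_obs, k):
    """Ne = w * Np + (1 - w) * No."""
    w = eb_weight(k, n_pred)
    return w * n_pred + (1.0 - w) * n_obs


def project_to_future(ne_past, np_future, np_past):
    """Ne(future) = Ne(past) x (Np(future) / Np(past))."""
    return ne_past * (np_future / np_past)


def apply_cmfs(ne_without, *cmfs):
    """Multiply by each CMF; valid only if the effects are independent."""
    ne_with = ne_without
    for cmf in cmfs:
        ne_with *= cmf
    return ne_with


# Hypothetical inputs (not from the slides).
YEARS = 3                 # length of the before and future periods
K = 0.3                   # assumed overdispersion parameter for the SPF
OBSERVED_TOTAL = 9        # assumed observed crashes over the before period
ADT_PAST = (8000, 1000)   # assumed major/minor road volumes, before period
ADT_FUTURE = (9000, 1100) # assumed major/minor road volumes, future period

# Predicted totals over each 3-year period (per-year SPF value times years).
np_past = YEARS * predicted_crashes_per_year(*ADT_PAST)
np_future = YEARS * predicted_crashes_per_year(*ADT_FUTURE)

# EB estimate for the before period, then projection to the future period.
ne_past = eb_expected(np_past, OBSERVED_TOTAL, K)
ne_future = project_to_future(ne_past, np_future, np_past)

# Countermeasure from "The Problem" slide: 35% reduction, so CMF = 0.65.
ne_with = apply_cmfs(ne_future, 0.65)
crashes_reduced = ne_future - ne_with

print(f"Np(past)    = {np_past:.2f}")
print(f"w           = {eb_weight(K, np_past):.3f}")
print(f"Ne(past)    = {ne_past:.2f}")
print(f"Ne(future)  = {ne_future:.2f}")
print(f"Ne(with)    = {ne_with:.2f}")
print(f"Reduced     = {crashes_reduced:.2f}")
```

Under these assumptions, the quantity multiplied by 0.35 is `ne_future`, the EB estimate of what would happen without the countermeasure, rather than the raw 3-year observed count.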