(Better) Bootstrap Confidence Intervals

TAU Bootstrap Seminar 2011
Dr. Saharon Rosset
Shachar Kaufman

Based on Efron and Tibshirani's
"An Introduction to the Bootstrap", Chapter 14
Agenda
• What’s wrong with the simpler intervals?
• The (nonparametric) BCa method
• The (nonparametric) ABC method
– Not really
The running example: the plug-in variance estimate

θ̂ ≔ v̂ar(X) = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)²

Under the assumption that
X₁, …, Xₙ ~ N(μ, Σ) i.i.d.:
 Have an exact analytical interval
 Can do a parametric bootstrap
Under the assumption that
X₁, …, Xₙ ~ F i.i.d. (F unknown):
 Can do a nonparametric bootstrap
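As a concrete illustration of the nonparametric option, here is a minimal sketch (not from the slides; the data and names are illustrative) of bootstrapping the plug-in variance estimate and reading off a simple percentile interval — the kind of interval the rest of the talk improves on:

```python
import numpy as np

# Toy sample standing in for X_1, ..., X_n (illustrative data)
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=50)
theta_hat = x.var()  # plug-in variance estimate

# Nonparametric bootstrap: resample with replacement, recompute the statistic
B = 2000
boot = np.array([rng.choice(x, size=x.size, replace=True).var()
                 for _ in range(B)])

# Simple percentile interval at 90% nominal coverage
lo, up = np.quantile(boot, [0.05, 0.95])
print(f"theta_hat={theta_hat:.3f}, CI=({lo:.3f}, {up:.3f})")
```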
Why are the simpler intervals bad?
• The standard (normal) confidence interval
assumes symmetry around θ̂
• Bootstrap-t is often erratic in practice
– "Cannot be recommended for general nonparametric
problems"
• Percentile suffers from low coverage
– Assumes the nonparametric distribution of θ̂∗ is representative
of that of θ̂ (e.g. has mean θ like θ̂ does)
• The standard & percentile methods assume
homogeneous behavior of θ̂, whatever θ is
– (e.g. the standard deviation of θ̂ does not change with θ)
A more flexible inference model
• Account for higher-order statistics
– Mean
– Standard deviation
– Skewness
[Figure: histogram of the bootstrap instances θ̂∗]
A more flexible inference model
• If θ̂ ~ N(θ, σ²) doesn't work for the data, maybe we could
find a transform φ ≔ m(θ) (with φ̂ ≔ m(θ̂)) and constants z₀ and a for
which we can accept that
φ̂ ~ N(φ − z₀σ_φ, σ_φ²),  σ_φ ≔ 1 + aφ
– m(⋅) allows a flexible parameter-description scale
– z₀ allows bias: ℙ(φ̂ < φ) = Φ(z₀)
– a allows "σ²" to change with φ
• As we know, "more flexible" is not necessarily "better"
• Under broad conditions, in this case it is (TBD)
Where does this new model lead?
φ̂ ~ N(φ − z₀σ_φ, σ_φ²),  σ_φ ≔ 1 + aφ

Assume known a and z₀ = 0, and initially that φ = φ_{α,0} ≔ 0
(measuring φ relative to φ̂), hence σ_{α,0} ≔ 1.
Calculate a standard α-confidence endpoint from this:
z^(α) ≔ Φ⁻¹(α),  φ_{α,1} ≔ z^(α)σ_{α,0} = z^(α)
Now reexamine the actual standard deviation, this time assuming that
φ = φ_{α,1}.
According to the model, it will be
σ_{α,1} ≔ 1 + aφ_{α,1} = 1 + az^(α)
Where does this new model lead?
φ̂ ~ N(φ − z₀σ_φ, σ_φ²),  σ_φ ≔ 1 + aφ

OK, but this leads to an updated endpoint
φ_{α,2} ≔ z^(α)σ_{α,1} = z^(α)(1 + az^(α))
and in turn to an updated standard deviation
σ_{α,2} = 1 + aφ_{α,2} = 1 + az^(α)(1 + az^(α)) = 1 + az^(α) + (az^(α))²
If we continue iteratively to infinity this way (a geometric series in
az^(α)), we end up with the confidence interval endpoint
φ_{α,∞} = z^(α) / (1 − az^(α))
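The fixed-point iteration above is easy to check numerically. A small sketch with illustrative values of α and a (not from the slides); the iteration's limit should agree with the closed form z^(α)/(1 − az^(α)):

```python
from statistics import NormalDist

alpha, a = 0.05, 0.1                 # illustrative values
z = NormalDist().inv_cdf(alpha)      # z^(alpha)

# Iterate phi_{alpha,k+1} = z^(alpha) * sigma_{alpha,k}
#                         = z^(alpha) * (1 + a * phi_{alpha,k})
phi = 0.0                            # phi_{alpha,0} := 0
for _ in range(100):
    phi = z * (1 + a * phi)

closed_form = z / (1 - a * z)        # phi_{alpha,infinity}
print(phi, closed_form)
```

The iteration converges because |az^(α)| < 1 for these values.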
Where does this new model lead?
• Do this exercise considering z₀ ≠ 0 and get
φ_lo,∞ = z₀ + (z₀ + z^(α)) / (1 − a(z₀ + z^(α)))
• Similarly for the upper endpoint,
φ_up,∞ = z₀ + (z₀ + z^(1−α)) / (1 − a(z₀ + z^(1−α)))
Enter BCa
• "Bias-corrected and accelerated"
• Like the percentile confidence interval
– Both ends are percentiles (θ̂∗^(α₁), θ̂∗^(α₂)) of the B
bootstrap instances of θ̂∗
– Just not the simple α₁ ≔ α, α₂ ≔ 1 − α
BCa
α₁ ≔ Φ( z₀ + (z₀ + z^(α)) / (1 − a(z₀ + z^(α))) )
α₂ ≔ Φ( z₀ + (z₀ + z^(1−α)) / (1 − a(z₀ + z^(1−α))) )
• z₀ and a are parameters we will estimate
– When both are zero, we get the good old percentile CI
• Notice we never had to explicitly find m, where φ ≔ m(θ)
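A sketch of these adjusted percentile levels as code (the z₀ and a values below are illustrative); with z₀ = a = 0 it reduces to the plain percentile levels, as the slide notes:

```python
from statistics import NormalDist

N = NormalDist()

def bca_levels(alpha, z0, a):
    """BCa percentile levels (alpha_1, alpha_2) from the formulas above."""
    def adjust(z):
        w = z0 + z
        return N.cdf(z0 + w / (1 - a * w))
    return adjust(N.inv_cdf(alpha)), adjust(N.inv_cdf(1 - alpha))

print(bca_levels(0.05, 0.0, 0.0))    # recovers (0.05, 0.95)
print(bca_levels(0.05, 0.1, 0.05))   # bias/skew-adjusted levels
```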
BCa
• z₀ tackles bias: ℙ(φ̂ < φ) = Φ(z₀)
ẑ₀ ≔ Φ⁻¹( #{θ̂∗(b) < θ̂} / B )
(since m is monotone)
• a accounts for a standard deviation of θ̂ which
varies with θ (linearly, on the "normal scale" φ)
BCa
• One suggested estimator for a is via the jackknife:
â ≔ Σᵢ₌₁ⁿ (θ̂₍·₎ − θ̂₍ᵢ₎)³ / ( 6 [ Σᵢ₌₁ⁿ (θ̂₍·₎ − θ̂₍ᵢ₎)² ]^{3/2} )
where
θ̂₍ᵢ₎ ≔ θ̂ computed without sample i
and
θ̂₍·₎ ≔ (1/n) Σᵢ₌₁ⁿ θ̂₍ᵢ₎
• You won't find the rationale behind this formula in the
book (though it is clearly related to one of the standard
ways to define skewness)
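Putting the last few slides together, a compact sketch of the whole nonparametric BCa recipe for the variance example (data and names are illustrative; a real implementation should also guard against degenerate cases, e.g. all bootstrap replicates falling on one side of θ̂):

```python
import numpy as np
from statistics import NormalDist

N = NormalDist()
rng = np.random.default_rng(1)
x = rng.normal(size=40)                  # toy data

def stat(s):                             # theta-hat: plug-in variance
    return s.var()

theta_hat, n, B, alpha = stat(x), x.size, 2000, 0.05
boot = np.array([stat(rng.choice(x, n, replace=True)) for _ in range(B)])

# z0-hat := Phi^{-1}( #{theta*_b < theta-hat} / B )
z0 = N.inv_cdf((boot < theta_hat).mean())

# a-hat via the jackknife skewness formula
jack = np.array([stat(np.delete(x, i)) for i in range(n)])
d = jack.mean() - jack                   # theta_(.) - theta_(i)
a = (d ** 3).sum() / (6 * ((d ** 2).sum()) ** 1.5)

def level(z):                            # adjusted percentile level
    w = z0 + z
    return N.cdf(z0 + w / (1 - a * w))

lo = np.quantile(boot, level(N.inv_cdf(alpha)))
up = np.quantile(boot, level(N.inv_cdf(1 - alpha)))
print(lo, up)
```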
• Transformation respecting
– If the interval for θ is (θ̂_lo, θ̂_up), then the interval
for a monotone m(θ) is (m(θ̂_lo), m(θ̂_up))
– So no need to worry about finding transforms of θ
where confidence intervals perform well
• Which is necessary in practice with the bootstrap-t CI
• And with the standard CI (e.g. Fisher's correlation-coefficient
transform)
• The percentile CI is also transformation respecting
• Accuracy
– We want θ̂_lo s.t. ℙ(θ < θ̂_lo) = α
– But a practical θ̂_lo is an approximation, where
ℙ(θ < θ̂_lo) ≅ α
– BCa (and bootstrap-t) endpoints are "second-order
accurate":
ℙ(θ < θ̂_lo) = α + O(1/n)
– This is in contrast to the standard and percentile
methods, which only converge at rate 1/√n ("first-order
accurate")  errors one order of magnitude greater
But BCa is expensive
• The use of direct bootstrapping to calculate
delicate statistics such as ẑ₀ and â requires a
large B to work satisfactorily
• Fortunately, BCa can be analytically
approximated (with a Taylor expansion, for
differentiable T(⋅)) so that no Monte Carlo
simulation is required
• This is the ABC method, which retains the good
theoretical properties of BCa
The ABC method
• Only an introduction (Chapter 22)
• Discusses the "how", not the "why"
• For additional details see DiCiccio and Efron
(1992) or (1996)
The ABC method
• Given the estimator in resampling form θ̂ = T(P⁰)
– Recall P∗, the "resampling vector", is an n-dimensional
random variable whose component P∗ᵢ is the proportion of the
bootstrap sample equal to xᵢ, with θ̂∗ = T(P∗)
– Recall P⁰ ≔ (1/n, 1/n, …, 1/n)
• Second-order Taylor analysis of the estimate
– as a function of the bootstrap resampling
methodology:
Ṫᵢ ≔ lim_{ε→0} [ T(P⁰ + ε(δᵢ − P⁰)) − T(P⁰) ] / ε
T̈ᵢᵢ ≔ lim_{ε→0} [ T(P⁰ + ε(δᵢ − P⁰)) − 2T(P⁰) + T(P⁰ − ε(δᵢ − P⁰)) ] / ε²
(δᵢ is the i-th coordinate vector)
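When T has no convenient closed form, Ṫᵢ and T̈ᵢᵢ can be approximated by finite differences along the directions δᵢ − P⁰. A sketch with the variance statistic in resampling form (illustrative data; the step size eps is a tuning assumption):

```python
import numpy as np

x = np.array([1.0, 3.0, 2.0, 5.0, 4.0])   # illustrative data
n = x.size

def T(P):                                  # plug-in variance in resampling form
    mu = P @ x
    return P @ (x - mu) ** 2

P0 = np.full(n, 1.0 / n)
eps = 1e-4

def direction(i):                          # delta_i - P0
    e = np.zeros(n)
    e[i] = 1.0
    return e - P0

# Central differences along each direction approximate the limits above
Tdot = np.array([(T(P0 + eps * direction(i)) - T(P0 - eps * direction(i)))
                 / (2 * eps) for i in range(n)])
Tddot = np.array([(T(P0 + eps * direction(i)) - 2 * T(P0)
                   + T(P0 - eps * direction(i))) / eps ** 2
                  for i in range(n)])

print(Tdot)    # for the variance, T-dot_i = (x_i - mean)^2 - var
print(Tddot)
```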
The ABC method
• Can approximate all the BCa parameter estimates (i.e.
estimate the parameters in a different way):
– σ̂ = (1/n) [ Σᵢ₌₁ⁿ Ṫᵢ² ]^{1/2}
– â = (1/6) Σᵢ₌₁ⁿ Ṫᵢ³ / [ Σᵢ₌₁ⁿ Ṫᵢ² ]^{3/2}
– ẑ₀ = â − ĉ, where
• ĉ ≔ b̂/σ̂ − ĉ_q
• b̂ ≔ (1/(2n²)) Σᵢ₌₁ⁿ T̈ᵢᵢ
• ĉ_q ≔ something akin to a Hessian component, but along a specific
direction not perpendicular to any natural axis (the "least favorable
family" direction)
The ABC method
• And the ABC interval endpoint:
θ̂[1 − α] ≔ T(P⁰ + λδ̂)
• Where
– δ̂ ≔ Ṫ / (n²σ̂)
– λ ≔ w / (1 − âw)², with w ≔ ẑ₀ + z^(1−α)
• Simple and to the point, ain't it?
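To make the recipe concrete, a sketch assembling an ABC-style endpoint from these pieces for the variance statistic, using finite-difference derivatives. Note the ĉ_q term is deliberately dropped (set to zero) to keep the sketch short — that is an assumption, so this is not the full ABC correction:

```python
import numpy as np
from statistics import NormalDist

N = NormalDist()
x = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 7.0, 6.0, 2.0])  # illustrative data
n = x.size

def T(P):                          # variance in resampling form
    mu = P @ x
    return P @ (x - mu) ** 2

P0 = np.full(n, 1.0 / n)
eps = 1e-4

def direction(i):                  # delta_i - P0
    e = np.zeros(n)
    e[i] = 1.0
    return e - P0

Tdot = np.array([(T(P0 + eps * direction(i)) - T(P0 - eps * direction(i)))
                 / (2 * eps) for i in range(n)])
Tddot = np.array([(T(P0 + eps * direction(i)) - 2 * T(P0)
                   + T(P0 - eps * direction(i))) / eps ** 2
                  for i in range(n)])

sigma = np.sqrt((Tdot ** 2).sum()) / n
a = (Tdot ** 3).sum() / (6 * ((Tdot ** 2).sum()) ** 1.5)
b = Tddot.sum() / (2 * n ** 2)
z0 = a - b / sigma                 # c_q omitted here (simplifying assumption)

delta = Tdot / (n ** 2 * sigma)    # least-favorable-family direction

def endpoint(level):
    w = z0 + N.inv_cdf(level)
    lam = w / (1 - a * w) ** 2
    return T(P0 + lam * delta)

print(endpoint(0.05), endpoint(0.95))
```

No Monte Carlo simulation is needed: each endpoint is a single re-evaluation of T at a tilted resampling vector.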