Cool with a Gaussian:
An O*(n³) volume algorithm
Ben Cousins and Santosh Vempala
The Volume Problem
Given a measurable, compact set K in n-dimensional space, find a
number A such that:
(1 − ε)·vol(K) ≤ A ≤ (1 + ε)·vol(K)
K is given by
• a point x₀ ∈ K, s.t. x₀ + B_n ⊆ K
• a membership oracle: answers YES/NO to “x ∈ K?”
Volume: first attempt

Divide and conquer.
Difficulty: the number of parts grows exponentially in n.
More generally: Integration
Input: an integrable function f: Rⁿ → R₊ specified by an
oracle, a point x, an error parameter ε.
Output: a number A such that:
(1 − ε)·∫f ≤ A ≤ (1 + ε)·∫f
Volume is the special case when f is a 0-1 function.
High-dimensional problems
• Integration (volume)
• Optimization
• Learning
• Rounding
• Sampling
All hopelessly intractable in general, even to approximate.
High-dimensional problems
Input:
• a set of points S in n-dimensional space Rⁿ, or a distribution on Rⁿ
• a function f that maps points to real values (could be the
indicator of a set)
What is the complexity of computational problems as the
dimension grows?
Dimension = number of variables.
Typically, the size of the input is a function of the dimension.
Structure
Q. What structure makes high-dimensional problems
computationally tractable? (i.e., solvable with polynomial
complexity)

Convexity and its extensions appear to be the
frontier of polynomial-time solvability.
Volume: second attempt: Sandwiching
Thm (John). Any convex body K has an ellipsoid E s.t.
E ⊆ K ⊆ n·E.
E = the maximum-volume ellipsoid contained in K.
Thm (KLS95). For a convex body K in isotropic position,
√((n+1)/n) · B_n ⊆ K ⊆ √(n(n+1)) · B_n.
Also a factor-n sandwiching, but with a different ellipsoid.
Isotropic position and sandwiching

For any convex body K (in fact, any set/distribution with
bounded second moments), we can apply an affine
transformation so that for a random point x from K:
E(x) = 0,  E(x·xᵀ) = I_n.
Thus K “looks like a ball” up to second moments.
How close is it really to a ball?
K lies between two balls with radii within a factor of n.

Volume via Sandwiching

The John ellipsoid can be approximated using the Ellipsoid
algorithm, s.t.
E ⊆ K ⊆ n^{1.5}·E.
The inertial ellipsoid can be approximated to within any
constant factor (we’ll see how).
Using either one,
E ⊆ K ⊆ O(n^{1.5})·E, so vol(E) ≤ vol(K) ≤ n^{O(n)}·vol(E).
Polytime algorithm, n^{O(n)} approximation.
Can we do better?
Complexity of Volume Estimation
Thm [E86, BF87]. For any deterministic algorithm that uses at
most nᵃ membership calls to the oracle for a convex body K
and computes two numbers A and B such that A ≤ vol(K) ≤ B,
there is some convex body for which the ratio B/A is at least
(c·n / (a·log n))^{n/2}
where c is an absolute constant.
Thm [DF88]. Computing the volume of an explicit polytope
{Ax ≤ b} is #P-hard, even for a totally unimodular matrix A and
rational b.
Complexity of Volume Estimation
Thm [BF]. For deterministic algorithms:
(plot: approximation factor vs. # oracle calls)
Thm [Dadush-V.13]. Matching upper bound of (1 + ε)ⁿ
in time (1/ε)^{O(n)} · poly(n).
Randomized Volume/Integration
[DFK89]. Polynomial-time randomized algorithm that
estimates volume to within relative error 1 + ε with
probability at least 1 − δ in time poly(n, 1/ε, log(1/δ)).
[Applegate-K91]. Polytime randomized algorithm to
estimate the integral of any (Lipschitz) logconcave function.
Volume Computation: an ongoing adventure
Algorithm                  Power   New aspects
Dyer-Frieze-Kannan 89      n²³     everything
Lovász-Simonovits 90       n¹⁶     localization
Applegate-K 90             n¹⁰     logconcave integration
L 90                       n¹⁰     ball walk
DF 91                      n⁸      error analysis
LS 93                      n⁷      multiple improvements
KLS 97                     n⁵      speedy walk, isotropy
LV 03,04                   n⁴      annealing, wt. isoper.
LV 06                      n⁴      integration, local analysis
Cousins-V. 13,14           n³      Gaussian cooling
Does it work?

[Lovász-Deák 2012] implemented the [LV] O*(n⁴) algorithm:
• worked for cubes up to dimension 9,
• but too slow after that.
[CV13] Matlab implementation of a new algorithm
(experiments: rotated cubes).
Volume: third attempt: Sampling

• Pick random samples from a ball/cube containing K.
• Compute the fraction c of the sample in K.
• Output c·vol(outer ball).
Need too many samples! (The fraction c can be exponentially
small in n, e.g., for K = B_n inside the cube [−1, 1]ⁿ.)


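To see concretely why this fails, here is a minimal sketch (a hypothetical membership oracle `in_K` stands in for the real one): the expected number of hits is num_samples · vol(K)/vol(R·B_n), which vanishes rapidly as n grows.

```python
import math
import numpy as np

def sample_ball(n, radius, rng):
    """Uniform point in radius*B_n: random direction times radius*u^(1/n)."""
    x = rng.standard_normal(n)
    return x * (radius * rng.random() ** (1.0 / n) / np.linalg.norm(x))

def naive_volume(in_K, n, R, num_samples, seed=0):
    """Estimate vol(K) as (hit fraction) * vol(R*B_n), given B_n <= K <= R*B_n."""
    rng = np.random.default_rng(seed)
    hits = sum(in_K(sample_ball(n, R, rng)) for _ in range(num_samples))
    ball_vol = math.pi ** (n / 2) / math.gamma(n / 2 + 1) * R ** n
    return hits / num_samples * ball_vol

# Even for K = B_n and R = 2, the hit probability is 2^(-n):
# at n = 20 it is ~1e-6, so 10^5 samples typically see zero hits.
print(naive_volume(lambda x: np.linalg.norm(x) <= 1.0, 20, 2.0, 10 ** 5))
```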
Volume via Sampling [DFK89]
B ⊆ K ⊆ R·B.
Let K_i = K ∩ (2^{i/n}·B), i = 0, 1, …, m = n·log₂ R.
vol(K) = vol(B) · (vol(K₁)/vol(K₀)) · (vol(K₂)/vol(K₁)) ⋯ (vol(K_m)/vol(K_{m−1})).
Estimate each ratio with random samples.
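A sketch of the telescoping estimator, assuming a black-box routine `sampler(radius, m)` that returns m nearly uniform points from K ∩ (radius·B); producing such samples is the hard part, addressed in the sampling slides below.

```python
import math
import numpy as np

def dfk_volume(sampler, n, R, samples_per_ratio):
    """vol(K) = vol(B) * prod_i vol(K_{i+1})/vol(K_i), K_i = K ∩ (2^(i/n) B)."""
    m = math.ceil(n * math.log2(R))
    estimate = math.pi ** (n / 2) / math.gamma(n / 2 + 1)  # vol(K_0) = vol(B)
    for i in range(m):
        pts = sampler(2 ** ((i + 1) / n), samples_per_ratio)  # uniform in K_{i+1}
        # The fraction of K_{i+1}-samples that land in K_i estimates
        # vol(K_i)/vol(K_{i+1}), which is at least 1/2 (next slide's claim).
        frac = np.mean([np.linalg.norm(x) <= 2 ** (i / n) for x in pts])
        estimate /= frac
    return estimate
```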
Volume via Sampling
K_i = K ∩ (2^{i/n}·B), i = 0, 1, …, m = n·log₂ R.
vol(K) = vol(B) · (vol(K₁)/vol(K₀)) ⋯ (vol(K_m)/vol(K_{m−1})).
Claim. vol(K_{i+1}) ≤ 2·vol(K_i).
Total #samples = m · O(m/ε²) = O*(n²/ε²).
But, how to sample?
Sampling

Generate
• a uniform random point from a compact set S,
• or a point with density proportional to a function f.
Numerous applications in diverse areas: statistics,
networking, biology, computer vision, privacy, operations
research, etc.
Sampling
Input: a function f: Rⁿ → R₊ specified by an oracle,
a point x, an error parameter ε.
Output: a point y from a distribution within distance ε
of the distribution with density proportional to f.
Logconcave functions
f: Rⁿ → R₊ is logconcave if for any x, y ∈ Rⁿ and λ ∈ [0, 1],
f(λx + (1 − λ)y) ≥ f(x)^λ · f(y)^{1−λ}
Examples:
• indicator functions of convex sets
• the Gaussian density function
• the exponential function
Level sets of f, L_t = {x : f(x) ≥ t}, are convex.
Products, minima, and convolutions preserve logconcavity.
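The defining inequality is easy to spot-check numerically; a quick test for the (unnormalized) Gaussian density, which never finds a violating triple:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.exp(-x @ x / 2)  # standard Gaussian density, up to scale

for _ in range(1000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    lam = rng.random()
    assert f(lam * x + (1 - lam) * y) >= f(x) ** lam * f(y) ** (1 - lam) - 1e-12
print("no counterexample found")
```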
Algorithmic Applications
Given a blackbox for sampling logconcave densities, we get
efficient algorithms for:
• Rounding
• Convex optimization
• Volume computation/integration
• some learning problems
Rounding via Sampling
1. Sample m random points from K;
2. Compute the sample mean z and sample covariance matrix
A = E((y − z)(y − z)ᵀ);
3. Output B = A^{−1/2}.
B·(K − z) is nearly isotropic.
Thm. C(ε)·n random points suffice to get ∥A − I_n∥ ≤ ε
[Adamczak et al.; improving on Bourgain, Rudelson].
I.e., for any unit vector v,
1 − ε ≤ E((vᵀx)²) ≤ 1 + ε.
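In numpy the rounding step is a few lines; a sketch assuming `points` is an m×n array of (approximately) uniform samples from K:

```python
import numpy as np

def rounding_transform(points):
    """Return (B, z): z is the sample mean, B = A^(-1/2) for the sample
    covariance A, so that B @ (x - z) maps K to near-isotropic position."""
    z = points.mean(axis=0)
    Y = points - z
    A = Y.T @ Y / len(points)           # sample covariance matrix
    w, V = np.linalg.eigh(A)            # A is symmetric positive definite
    B = V @ np.diag(w ** -0.5) @ V.T    # inverse square root of A
    return B, z
```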
How to Sample?
Ball walk:
At x,
- pick a random y from x + δ·B_n
- if y is in K, go to y (else stay at x)
Hit-and-Run:
At x,
- pick a random chord L through x
- go to a random point y on L
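Minimal sketches of one step of each walk, assuming a membership oracle `in_K` and (for hit-and-run) that K lies inside R·B_n, so the chord endpoints can be located by binary search:

```python
import numpy as np

def ball_walk_step(x, in_K, delta, rng):
    """Propose y uniform in x + delta*B_n; move there if y is in K, else stay."""
    d = rng.standard_normal(x.size)
    y = x + delta * rng.random() ** (1 / x.size) * d / np.linalg.norm(d)
    return y if in_K(y) else x

def hit_and_run_step(x, in_K, R, rng, tol=1e-9):
    """Pick a uniform random direction, find the chord of K through x by
    binary search, and return a uniform random point on that chord."""
    d = rng.standard_normal(x.size)
    d /= np.linalg.norm(d)

    def extent(sign):            # how far K extends from x along sign*d
        lo, hi = 0.0, 2 * R
        while hi - lo > tol:
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if in_K(x + sign * mid * d) else (lo, mid)
        return lo

    return x + rng.uniform(-extent(-1), extent(+1)) * d
```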
Complexity of Sampling
Thm. [KLS97] For a convex body, the ball walk with an M-warm
start reaches an (independent) nearly random point in
poly(n, R, M) steps.
M = sup_A Q₀(A)/Q(A),
where Q₀ is the starting distribution and Q the stationary one.
Thm. [LV03] The same holds for arbitrary logconcave density
functions. From a warm start, the complexity is O*(n²R²).
The isotropic transformation makes R = O(√n).
KLS volume algorithm: n × n × n³ = n⁵
(#phases × #samples per phase × time per sample).
Markov chains




• State space K
• A set of measurable subsets that form a σ-algebra, i.e.,
closed under finite unions and intersections
• A next-step distribution P_u(·) associated with each point
u in the state space
• A starting point
w₀, w₁, …, w_i, … s.t.
Pr(w_i ∈ A | w₀, w₁, …, w_{i−1}) = Pr(w_i ∈ A | w_{i−1})
Convergence
Stationary distribution Q; the ergodic “flow” is:
Φ(A) = ∫_A P_u(K∖A) dQ(u)
For any subset A, we have Φ(A) = Φ(K∖A).
Conductance:
φ(A) = ∫_A P_u(K∖A) dQ(u) / min{Q(A), Q(K∖A)}
φ = inf_A φ(A)
The rate of convergence is bounded by 1/φ² [LS93, JS86].
Conductance
For an arbitrary measurable subset S:
how large is the conditional escape probability from S?
Local conductance can be arbitrarily small for the ball walk:
ℓ(x) = vol((x + δ·B_n) ∩ K) / vol(δ·B_n)
Conductance
Need:
• Nearby points have overlapping one-step distributions
• Large subsets have large boundaries [isoperimetry]:
vol(S₃) ≥ (c·d(S₁, S₂)/D) · min{vol(S₁), vol(S₂)}
where S₃ = K ∖ (S₁ ∪ S₂) and D² = E(∥x − x̄∥²).
Isoperimetry and the KLS conjecture
vol(S₃) ≥ ψ · d(S₁, S₂) · min{vol(S₁), vol(S₂)}
A = E((x − x̄)(x − x̄)ᵀ): covariance matrix of K
D² = E(∥x − x̄∥²) = tr(A)
λ₁ = λ₁(A): the largest eigenvalue of A
Thm. [KLS95]. vol(S₃) ≥ (c/√(tr A)) · d(S₁, S₂) · min{vol(S₁), vol(S₂)}
Conj. [KLS95]. vol(S₃) ≥ (c/√λ₁) · d(S₁, S₂) · min{vol(S₁), vol(S₂)}
KLS hyperplane conjecture
λ₁ = λ₁(A): the largest eigenvalue of the covariance matrix A
Conj. [KLS95]. vol(S₃) ≥ (c/√λ₁) · d(S₁, S₂) · min{vol(S₁), vol(S₂)}
• Could improve sampling complexity by a factor of n
• Implies well-known conjectures in convex geometry: the slicing
conjecture and the thin-shell conjecture
• But wide open!
KLS, Slicing, Thin-shell
Current bounds:
thin shell:  n^{1/3}   [Guedon-Milman]
slicing:     n^{1/4}   [Bourgain, Klartag]
KLS:         ~ n^{1/3} [Bobkov; Eldan-Klartag]
All are conjectured to be O(1).
The conjectures are equivalent! [Ball, Eldan-Klartag]
Is rapid mixing possible?
The ball walk can have bad starts, but
hit-and-run escapes from corners.
Minimum-distance isoperimetry is too coarse.
Average-distance isoperimetry: how to average distance?
Take h(x) with h(x) ≤ d(u, v) whenever u ∈ S₁, v ∈ S₂, x ∈ ℓ(u, v).
Thm. [LV04]
vol(S₃) ≥ c·E(h(x)) · vol(S₁) · vol(S₂)
(volumes normalized so that vol(K) = 1).
Hit-and-run mixes rapidly

Thm [LV04]. Hit-and-run mixes in polynomial time from any
starting point inside a convex body.
Conductance = Ω(1/(n·R)).
Along with the isotropic transformation, this gives an O*(n³)
sampling algorithm.
Simulated Annealing [LV03, Kalai-V.04]
To estimate ∫f, consider a sequence f₀, f₁, f₂, …, f_m = f
with ∫f₀ easy, e.g., a constant function over a ball.
Then, ∫f = ∫f₀ · (∫f₁/∫f₀) · (∫f₂/∫f₁) ⋯ (∫f_m/∫f_{m−1}).
Each ratio can be estimated by sampling:
1. Sample X with density proportional to f_i.
2. Compute Y = f_{i+1}(X)/f_i(X).
Then, E(Y) = ∫ (f_{i+1}(x)/f_i(x)) · (f_i(x)/∫f_i) dx = ∫f_{i+1} / ∫f_i.
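One phase of this scheme in code; a sketch assuming a black-box `sample_from(f)` that returns a point with density proportional to f:

```python
import numpy as np

def estimate_ratio(f_cur, f_next, sample_from, m):
    """Empirical mean of Y = f_next(X)/f_cur(X), X ~ f_cur/∫f_cur;
    an unbiased estimator of ∫f_next / ∫f_cur."""
    samples = (sample_from(f_cur) for _ in range(m))
    return np.mean([f_next(x) / f_cur(x) for x in samples])

def anneal_integral(fs, int_f0, sample_from, m):
    """∫f_m = ∫f_0 * product of the consecutive ratios."""
    est = int_f0
    for f_cur, f_next in zip(fs, fs[1:]):
        est *= estimate_ratio(f_cur, f_next, sample_from, m)
    return est
```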
Annealing [LV06]
Define:
f_i(x) = e^{−a_i·∥x∥}, with
a₀ = 2n,  a_{i+1} = a_i / (1 + 1/√n);
m ~ √n·log(2n/ε) phases.
∫f = ∫f₀ · (∫f₁/∫f₀) · (∫f₂/∫f₁) ⋯ (∫f_m/∫f_{m−1})
The final estimate could be n^{Ω(n)}, so individual ratios could
be exponentially large. How can we estimate them with a few
samples?!
Annealing [LV03, 06]
f_i(x) = e^{−a_i·∥x∥};  a₀ = 2n,  a_{i+1} = a_i / (1 + 1/√n).
Lemma. For Y = f_{i+1}(X)/f_i(X) with X drawn with density
proportional to f_i,
E(Y²) / E(Y)² < 4.
Although the expectation of Y can be large (exponential, even),
we need only a few samples to estimate it!
LoVe algorithm: √n × √n × n³ = n⁴
Variance of ratio estimator
Let f(a, x) = e^{−a·∥x∥} for x ∈ K, Z(a) = ∫_K f(a, x) dx,
and Y = f(a_{i+1}, X)/f(a_i, X), with X drawn with density
proportional to f(a_i, ·).
Then, E(Y) = Z(a_{i+1}) / Z(a_i), and
E(Y²)/E(Y)²
= [∫ f(a_{i+1}, x)²/f(a_i, x) dx · ∫ f(a_i, x) dx] / [∫ f(a_{i+1}, x) dx]²
= Z(2a_{i+1} − a_i) · Z(a_i) / Z(a_{i+1})²
= Z(a(1 − α)) · Z(a(1 + α)) / Z(a)²
writing a = a_{i+1} and a_i = a(1 + α).
(would be at most 1, if Z were logconcave…)
Variance of ratio estimator
Lemma. For any logconcave f and a > 0, the function
g(a) = aⁿ · ∫ f(a, x) dx is also logconcave.
So g(a) = aⁿ·Z(a) is logconcave, and
g(a)² ≥ g(a(1 − α)) · g(a(1 + α)).
Therefore:
Z(a(1−α)) · Z(a(1+α)) / Z(a)²
≤ a^{2n} / [(a(1−α))ⁿ · (a(1+α))ⁿ]
= 1 / (1 − α²)ⁿ
≤ 4, for α ≤ 1/√n.
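The last bound is easy to check numerically: with α = 1/√n, (1 − α²)^{−n} = (1 − 1/n)^{−n}, which decreases from 4 (at n = 2) toward e ≈ 2.718:

```python
for n in (2, 3, 10, 100, 10_000):
    alpha = n ** -0.5
    print(n, (1 - alpha ** 2) ** -n)   # 4.0, 3.375, 2.868, 2.732, 2.718
```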
Volume Computation: an ongoing adventure
Algorithm                  Power   New aspects
Dyer-Frieze-Kannan 89      n²³     everything
Lovász-Simonovits 90       n¹⁶     localization
Applegate-K 90             n¹⁰     logconcave integration
L 90                       n¹⁰     ball walk
DF 91                      n⁸      error analysis
LS 93                      n⁷      multiple improvements
KLS 97                     n⁵      speedy walk, isotropy
LV 03,04                   n⁴      annealing, wt. isoper.
LV 06                      n⁴      integration, local analysis
Cousins-V. 13,14           n³      Gaussian cooling
Gaussian sampling/volume
• Sample from a Gaussian restricted to K
• Compute the Gaussian measure of K
Anneal with a Gaussian:
Define f_i(x) = e^{−∥x∥²/(2σ_i²)}.
Start with σ₀ small (~ 1/√n), increase it in phases.
Compute the ratios of integrals of consecutive phases:
∫f_{i+1} / ∫f_i.
Gaussian sampling
The KLS conjecture holds for a Gaussian restricted to any convex
body (via the Brascamp-Lieb inequality).
Thm. γ(S₃) ≥ (c/σ) · d(S₁, S₂) · min{γ(S₁), γ(S₂)}
for γ = N(0, σ²·I_n) restricted to a convex body.
Not enough on its own, but can be used to show:
Thm. [Cousins-V. 13] For σ² = O(1), the ball walk applied to the
Gaussian N(0, σ²·I_n) restricted to any convex body containing
the unit ball mixes in O*(n²) time from a warm start.
Speedy walk: a thought experiment

• Take the sequence of points visited by the ball walk:
w₀, w₁, w₂, w₃, …, w_i, w_{i+1}, …
• The subsequence of “proper” attempts that stay inside K
• This subsequence is a Markov chain and is rapidly mixing
from any point
• For a warm start, the total number of steps is only a
constant factor higher
Gaussian volume

Theorem [Cousins-V.13] The Gaussian volume of a
convex body K containing the unit ball can be estimated
in time O*(n³).
• No need to adjust for isotropy!
• Each step samples a 1-d Gaussian restricted to an interval.
Can we use this to compute the volume?
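The 1-d step is elementary; a sketch using inverse-CDF sampling (scipy's `ndtr`/`ndtri` are the standard normal CDF and its inverse), as one would use along a line through the current point:

```python
import numpy as np
from scipy.special import ndtr, ndtri   # standard normal CDF and its inverse

def gaussian_on_interval(mean, sigma, lo, hi, rng):
    """Sample N(mean, sigma^2) conditioned on [lo, hi]: map a uniform
    variate into the interval's CDF mass and invert."""
    a, b = ndtr((lo - mean) / sigma), ndtr((hi - mean) / sigma)
    return mean + sigma * ndtri(a + rng.random() * (b - a))

rng = np.random.default_rng(0)
print(gaussian_on_interval(0.0, 1.0, 0.5, 2.0, rng))  # always lands in [0.5, 2]
```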
Gaussian Cooling
f_i(x) = e^{−∥x∥²/(2σ_i²)}, with σ₀² ~ 1/n (concentrated),
ending when f_m is nearly constant over K, so that ∫f_m ≈ vol(K).
Estimate ∫f_{i+1} / ∫f_i using samples drawn according to f_i.
For σ_i² ≤ 1, set σ_i² = σ_{i−1}²·(1 + 1/√n).
For σ_i² > 1, set σ_i² = σ_{i−1}²·(1 + σ_{i−1}/√n).

Gaussian Cooling
f_i(x) = e^{−∥x∥²/(2σ_i²)}
For σ_i² ≤ 1, we set σ_i² = σ_{i−1}²·(1 + 1/√n).
Sampling time: n²; #phases and #samples per phase: √n each.
So, total time = n² × √n × √n = n³.
Gaussian Cooling
f_i(x) = e^{−∥x∥²/(2σ_i²)}
For σ² > 1, we set σ_i² = σ_{i−1}²·(1 + σ_{i−1}/√n).
Sampling time: n²·σ² (too much??)
#phases to double σ is √n/σ.
#samples per phase is also √n/σ.
So, the total time to double σ is
(√n/σ) × (√n/σ) × n²σ² = n³ !!!
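A quick tally of the schedule confirms the accounting; this sketch counts the phases under the two regimes (σ² growing by (1 + 1/√n) up to 1, then by (1 + σ/√n)):

```python
import math

def cooling_phases(n, s2_start=None, s2_end=None):
    """Count Gaussian-cooling phases from sigma^2 ~ 1/n to sigma^2 ~ n."""
    s2 = s2_start if s2_start is not None else 1.0 / n
    end = s2_end if s2_end is not None else float(n)
    phases = 0
    while s2 < end:
        rate = 1.0 / math.sqrt(n) if s2 <= 1 else math.sqrt(s2 / n)
        s2 *= 1 + rate
        phases += 1
    return phases

for n in (100, 1000, 10000):
    # dominated by the s2 <= 1 regime: ~ sqrt(n)*log(n) phases overall
    print(n, cooling_phases(n))
```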
Variance of ratio estimator
Why can we set σ_i² as high as σ_{i−1}²·(1 + σ_{i−1}/√n)?
f(σ², x) = e^{−∥x∥²/(2σ²)} for x ∈ K;  Z(σ²) = ∫_K f(σ², x) dx.
Lemma. For Y = f(σ_{i+1}², X)/f(σ_i², X), with X drawn with
density proportional to f(σ_i², ·), σ_i² = σ², σ_{i+1}² = σ²(1 + α):
E(Y²)/E(Y)² = Z(σ²(1+α)/(1−α)) · Z(σ²) / Z(σ²(1+α))² = O(1)
for α = σ/√n.
Variance of ratio estimator
E(Y²)/E(Y)² = Z(σ²(1+α)) · Z(σ²(1−α)) / Z(σ²)² (after reparametrizing).
First use localization to reduce to a 1-d inequality,
for a restricted family of logconcave functions:
for exponential needles e^{ct} on an interval [u, v],
g(σ²) = ∫_u^v e^{ct} · t^{n−1} · e^{−t²/(2σ²)} dt,
and the goal becomes
g(σ²(1+α)) · g(σ²(1−α)) / g(σ²)² = O(1).
Variance of ratio estimator
g(σ²(1+α)) · g(σ²(1−α)) / g(σ²)² = O(1),
where g(σ²) = ∫_u^v e^{ct} · t^{n−1} · e^{−t²/(2σ²)} dt,
⇔ a second-moment bound for t² under the 1-d density
proportional to e^{ct} · t^{n−1} · e^{−t²/(2σ²)} on [u, v].
Warm start
With σ_i² = σ_{i−1}²·(1 + σ_{i−1}/√n), a random point from one
distribution gives a warm start for the next.
f(σ², x) = e^{−∥x∥²/(2σ²)} for x ∈ K;  Z(σ²) = ∫_K f(σ², x) dx.
With π_i ∝ f(σ_i², ·), σ_i² = σ², σ_{i+1}² = σ²(1 + α):
E_{π_i}(dπ_i/dπ_{i+1}) = ∫ (f_i(x)/Z(σ_i²))² · (Z(σ_{i+1}²)/f_{i+1}(x)) dx
= Z(σ²(1+α)) · Z(σ²(1+α)/(1+2α)) / Z(σ²)² = O(1).
Same lemma!
Gaussian Cooling [CV14]
Accelerated annealing: σ_i² = σ_{i−1}²·(1 + σ_{i−1}/√n);
(1 + σ/√n) is the best possible rate.
Thm. The volume of any well-rounded convex body K can
be estimated using O*(n³) membership queries.
CV algorithm: (√n/σ) × (√n/σ) × n²σ² × log n = O*(n³)
Practical volume/integration
• Start with a concentrated Gaussian.
• Run the algorithm till the Gaussian is nearly flat.
• In each phase, flatten the Gaussian as much as possible while
keeping the variance of the ratio of integrals bounded.
• The variance can be estimated with a small constant number of samples.
• If the covariance is skewed (as seen by the SVD of O(n) points),
scale down the high-variance subspace.
“Adaptive” annealing (also used in [Stefankovic-Vigoda-V.] for
discrete problems); see the sketch below.
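A sketch of the adaptive phase choice (hypothetical helper; `samples` are points drawn in the current phase): push σ² up a geometric grid while the empirical relative variance of the ratio estimator Y stays below a constant.

```python
import numpy as np

def next_sigma_sq(s2, samples, s2_max, max_rel_var=1.0, growth=1.25):
    """Largest next sigma^2 (on a geometric grid, capped at s2_max) keeping
    Var(Y)/E(Y)^2 small, where Y = f_next(X)/f_cur(X) for the Gaussian
    weights f(s2, x) = exp(-|x|^2/(2*s2)) and X from the current phase."""
    sq_norms = np.array([x @ x for x in samples])
    while s2 * growth <= s2_max:
        cand = s2 * growth
        y = np.exp(sq_norms / 2 * (1 / s2 - 1 / cand))  # ratios f_cand/f_cur
        if np.var(y) > max_rel_var * np.mean(y) ** 2:
            break
        s2 = cand
    return s2
```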
Open questions

How true is the KLS conjecture?
Open questions
• When to stop a random walk?
(how to decide if the current point is “random”?)
• Faster isotropy/rounding?
• How to get information before reaching stationarity?
To make K isotropic:
run for N steps; transform using the covariance; repeat.
Open questions

How efficiently can we learn a polytope given only random
points?
With O(mn) points, one cannot “see” the structure, but there is
enough information to estimate the polytope! Algorithms?
For convex bodies:
• [KOS][GR] need 2^{Ω(√n)} points
• [Eldan] exponentially many points are needed even to
estimate the volume!
Open questions

Can we estimate the volume of an explicit polytope
{x : Ax ≤ b} in deterministic polynomial time?
