### Information complexity and exact communication bounds

```
Information complexity and exact communication bounds
Mark Braverman, Princeton University
April 26, 2013
Based on joint work with Ankit Garg, Denis Pankratov, and Omri Weinstein
Overview: information complexity
• Information complexity is to communication complexity
  as Shannon's entropy is to transmission cost.
Background – information theory
• Shannon (1948) introduced information theory as a tool for studying
  the communication channel:

    Alice ──── communication channel ──── Bob
Shannon's entropy
• Assume a lossless binary channel.
• A message X is distributed according to some prior μ.
• The inherent number of bits it takes to transmit X is given by its entropy

    H(X) = Σₓ Pr[X=x] · log₂(1/Pr[X=x]).
Shannon's noiseless coding
• The cost of communicating many copies of X scales as H(X).
• Shannon's source coding theorem:
  – Let C(Xⁿ) be the cost of transmitting n independent copies of X.
    Then the amortized transmission cost is

      lim_{n→∞} C(Xⁿ)/n = H(X).
Shannon's entropy – cont'd
• Therefore, understanding the cost of transmitting a sequence of X's is
  equivalent to understanding the Shannon entropy of X.
• What about more complicated scenarios? E.g. the receiver already holds Y:

    X ──── communication channel ──── Y

• Amortized transmission cost = conditional entropy

    H(X|Y) := H(XY) − H(Y).
A simple example (easy and complete!)
• Alice has n uniform X₁, X₂, …, Xₙ ∈ {1,2,3,4,5}.
• The cost of transmitting them to Bob is ≈ log₂5 · n ≈ 2.32·n.
• Suppose for each i Bob is given a uniformly random Yᵢ ∈ {1,2,3,4,5}
  such that Yᵢ ≠ Xᵢ. Then the cost of transmitting the Xᵢ's to Bob is
  ≈ log₂4 · n = 2·n.
Meanwhile, in a galaxy far far away…
Communication complexity [Yao]
• Focus on the two-party randomized setting.
• Alice holds X, Bob holds Y; using shared randomness R, A & B implement a
  functionality F(X,Y) — e.g. F(X,Y) = "X=Y?".
Communication complexity
Goal: implement a functionality F(X,Y).
A protocol π(X,Y) computing F(X,Y), with shared randomness R:

    Alice (X)                          Bob (Y)
        ────── m1(X,R) ──────▶
        ◀───── m2(Y,m1,R) ────
        ────── m3(X,m1,m2,R) ─▶
        …

Communication cost = # of bits exchanged.
Communication complexity
• Numerous applications and potential applications (streaming, data
  structures, circuit lower bounds…).
• Considerably more difficult to obtain lower bounds than for transmission
  (still much easier than for other models of computation).
• Many lower-bound techniques exist.
• Exact bounds??
Communication complexity
• (Distributional) communication complexity with input distribution μ and
  error ε: D(F, μ, ε). Error ≤ ε w.r.t. μ.
• (Randomized/worst-case) communication complexity: R(F, ε).
  Error ≤ ε on all inputs.
• Yao's minimax:

    R(F, ε) = max_μ D(F, μ, ε).
Set disjointness and intersection
Alice and Bob are each given a set X ⊆ {1,…,n}, Y ⊆ {1,…,n} (the sets can be
viewed as vectors in {0,1}ⁿ).
• Intersection: Intₙ(X,Y) = X ∩ Y.
• Disjointness: Disjₙ(X,Y) = 1 if X ∩ Y = ∅, and 0 otherwise.
• Intₙ is just n 1-bit ANDs in parallel.
• ¬Disjₙ is an OR of n 1-bit ANDs.
• Need to understand the amortized communication complexity of the 1-bit AND.
Information complexity
• The smallest amount of information Alice and Bob need to exchange to
  solve F.
• How is information measured?
• Communication cost of a protocol?
  – Number of bits exchanged.
• Information cost of a protocol?
  – Amount of information revealed.
Basic definition 1: the information cost of a protocol
• Prior distribution: (X, Y) ∼ μ.
• Alice holds X, Bob holds Y; running the protocol π produces the
  transcript Π.

    IC(π, μ) = I(Π; X|Y) + I(Π; Y|X)

  (what Bob learns about X, plus what Alice learns about Y).
Mutual information
• The mutual information of two random variables is the amount of
  information knowing one reveals about the other:

    I(X; Y) = H(X) − H(X|Y)

• If X, Y are independent, I(X; Y) = 0.
• I(X; X) = H(X).
  (Venn-diagram view: I(X;Y) is the overlap of H(X) and H(Y).)
Example
• F is "X=Y?".
• μ is a distribution where w.p. ½ X = Y, and w.p. ½ (X, Y) are random
  (and independent).
• Protocol: Alice sends MD5(X) [128 bits]; Bob replies with "X=Y?" [1 bit].

    IC(π, μ) = I(Π; X|Y) + I(Π; Y|X) ≈ 1 + 64.5 = 65.5 bits
Information complexity
• Communication complexity:

    D(F, μ, ε) := min over protocols π computing F with error ≤ ε of the
    communication cost of π.

• Analogously:

    IC(F, μ, ε) := inf over protocols π computing F with error ≤ ε of
    IC(π, μ).
Prior-free information complexity
• Using minimax one can get rid of the prior:

    R(F, ε) = max_μ D(F, μ, ε).

• For information:

    IC(F, ε) := inf over protocols π with error ≤ ε of max_μ IC(π, μ).
Connection to privacy
• There is a strong connection between information complexity and
  (information-theoretic) privacy.
• Alice and Bob want to perform a computation without revealing unnecessary
  information to each other (or to an eavesdropper).
• Negative results follow through information-complexity arguments.
Information equals amortized communication
• Recall [Shannon]: lim_{n→∞} C(Xⁿ)/n = H(X).
• [BR'11]: lim_{n→∞} D(Fⁿ, μⁿ, ε)/n = IC(F, μ, ε), for ε > 0.
• For ε = 0: lim_{n→∞} D(Fⁿ, μⁿ, 0⁺)/n = IC(F, μ, 0).
• [lim_{n→∞} D(Fⁿ, μⁿ, 0)/n is an interesting open question.]
Without priors
• [BR'11] For ε = 0:

    lim_{n→∞} D(Fⁿ, μⁿ, 0⁺)/n = IC(F, μ, 0).

• [B'12]:

    lim_{n→∞} R(Fⁿ, 0⁺)/n = IC(F, 0).
Intersection
• Therefore

    R(Intₙ, 0⁺) = n · IC(AND, 0) ± o(n).

• Need to find the information complexity of the two-bit AND!
The two-bit AND
• [BGPW'12]: IC(AND, 0) ≈ 1.4922 bits.
• Find the value of IC(AND, μ, 0) for all priors μ.
• Find the information-theoretically optimal protocol for computing the AND
  of two bits.
The optimal protocol for AND
• X ∈ {0,1}, Y ∈ {0,1}.
• Alice picks a value A ∈ [0,1]: if X=1, A=1; if X=0, A=U[0,1].
• Bob picks a value B ∈ [0,1]: if Y=1, B=1; if Y=0, B=U[0,1].
• A counter rises continuously from 0 to 1; the first party whose value the
  counter reaches announces it, ending the protocol.
Analysis
• An additional small step is needed if the prior is not symmetric
  (Pr[XY = 10] ≠ Pr[XY = 01]).
• The protocol is clearly always correct.
• How do we prove the optimality of a protocol?
• Consider IC(AND, μ, 0) as a function of the prior μ.
The analytical view
• A message is just a mapping from the current prior to a distribution of
  posteriors (new priors). Example: Alice sends her bit.

  Prior μ:                   Y=0    Y=1
                    X=0      0.4    0.2
                    X=1      0.3    0.1

  Message "0" (prob 0.6):    Y=0    Y=1
                    X=0      2/3    1/3
                    X=1       0      0

  Message "1" (prob 0.4):    Y=0    Y=1
                    X=0       0      0
                    X=1      0.75   0.25
The analytical view
• Example: Alice sends her bit w.p. ½ and a uniformly random bit w.p. ½.

  Prior μ:                   Y=0     Y=1
                    X=0      0.4     0.2
                    X=1      0.3     0.1

  Message "0" (prob 0.55):   Y=0     Y=1
                    X=0      0.545   0.273
                    X=1      0.136   0.045

  Message "1" (prob 0.45):   Y=0     Y=1
                    X=0      2/9     1/9
                    X=1      1/2     1/6
Analytical view – cont'd
• Denote Ψ(μ) := IC(AND, μ, 0).
• Each potential (one-bit) message M by either party imposes a constraint
  of the form:

    Ψ(μ) ≤ IC(M, μ) + Pr[M=0] · Ψ(μ₀) + Pr[M=1] · Ψ(μ₁),

  where IC(M, μ) is the information cost of the single message M, and μ₀, μ₁
  are the posteriors after M=0 and M=1.
• In fact, Ψ(μ) is the point-wise largest function satisfying all such
  constraints (cf. the construction of harmonic functions).
IC of AND
• We show that for the protocol π described above, Ψ(μ) := IC(π, μ)
  satisfies all the constraints, and therefore represents the information
  complexity of AND at all priors.
• Theorem: π is the information-theoretically optimal protocol* for
  computing the AND of two bits.
*Not a real protocol
• The "protocol" is not a real protocol (this is why IC has an inf in its
  definition).
• The protocol above can be made into a real protocol by discretizing the
  counter (e.g. into r equal intervals).
• We show that the r-round IC satisfies:

    IC_r(AND, 0) = IC(AND, 0) + Θ(1/r²).
Previous numerical evidence
• [Ma, Ishwar'09] – numerical calculation results.
Applications: communication complexity of intersection
• Corollary:

    R(Intₙ, 0⁺) ≈ 1.4922 · n ± o(n).

• Moreover, with r rounds of communication:

    R_r(Intₙ, 0⁺) ≈ 1.4922 · n + Θ(n/r²).
Applications 2: set disjointness
• Recall: Disjₙ(X,Y) = 1_{X∩Y=∅}.
• Extremely well studied. [Kalyanasundaram and Schnitger'87, Razborov'92,
  Bar-Yossef et al.'02]: R(Disjₙ, ε) = Θ(n).
• What does a hard distribution for Disjₙ look like?
A hard distribution?
• Take each coordinate (Xᵢ, Yᵢ) i.i.d. uniform:

  μ:          Y=0   Y=1
      X=0     1/4   1/4
      X=1     1/4   1/4

• Very easy! (A (1,1) coordinate appears with constant probability at each
  position, so the sets intersect almost surely.)
A hard distribution
• Instead, take each coordinate (Xᵢ, Yᵢ) i.i.d. from:

  μ:          Y=0   Y=1
      X=0     1/3   1/3
      X=1     1/3   0⁺

• At most one (1,1) location!
Communication complexity of Disjointness
• Continuing the line of reasoning of Bar-Yossef et al.
• We now know exactly the communication complexity of Disj under any of the
  "hard" prior distributions. By maximizing, we get:

    R(Disjₙ, 0⁺) = C_DISJ · n ± o(n), where
    C_DISJ := max_{μ: μ(1,1)=0} IC(AND, μ, 0) ≈ 0.4827…

• With a bit of work this bound is tight.
Small-set Disjointness
• A variant of set disjointness where we are given X, Y ⊂ {1,…,n} of size
  k ≪ n.
• A lower bound of Ω(k) is obvious (from the Ω(n) disjointness bound applied
  to a universe of size Θ(k)).
• A very elegant matching O(k) upper bound is also known.
Using information complexity
• This setting corresponds to the per-coordinate prior distribution:

  μ:          Y=0       Y=1
      X=0     1−2k/n    k/n
      X=1     k/n       0⁺

• Gives information complexity (2/ln 2) · (k/n) per coordinate.
• Communication complexity: (2/ln 2) · k ± o(k).
Overview: information complexity
• Information complexity is to communication complexity
  as Shannon's entropy is to transmission cost.
• Today: focused on exact bounds using IC.
Selected open problems 1
• The interactive compression problem.
• For Shannon's entropy we have

    C(Xⁿ)/n → H(X).

• E.g. by Huffman coding we also know that

    H(X) ≤ C(X) < H(X) + 1.

• In the interactive setting,

    D(Fⁿ, μⁿ, 0⁺)/n → IC(F, μ, 0).

• But is it true that D(F, μ, 0⁺) ≲ IC(F, μ, 0)??
Interactive compression?
• D(F, μ, 0⁺) ≲ IC(F, μ, 0) is equivalent to

    D(F, μ, 0⁺) ≲ D(Fⁿ, μⁿ, 0⁺)/n,

  the "direct sum" problem for communication complexity.
• Currently the best general compression scheme [BBCR'10]: a protocol of
  information cost I and communication cost C can be compressed to
  Õ(√(I · C)) bits of communication.
Interactive compression? – cont'd
• A counterexample would need to separate IC from CC, which would require
  new lower-bound techniques [Kerenidis, Laplante, Lerays, Roland, Xiao'12].
Selected open problems 2
• Given a truth table for F, a prior μ, and an ε ≥ 0, can we compute
  IC(F, μ, ε)?
• An uncountable number of constraints; need to understand the structure
  better.
• Specific F's with inputs in {1,2,3} × {1,2,3}.
• Going beyond two players.
External information cost
• (X, Y) ∼ μ. An external observer Charlie sees the transcript Π of the
  protocol π between Alice (X) and Bob (Y).

    IC^ext(π, μ) = I(Π; XY) ≥ I(Π; X|Y) + I(Π; Y|X) = IC(π, μ)

  I(Π; XY) is what Charlie learns about (X, Y).
External information complexity
• IC^ext(F, μ, 0) := inf over protocols π computing F of IC^ext(π, μ).
• Conjecture: zero-error communication scales like external information:

    lim_{n→∞} D(Fⁿ, μⁿ, 0)/n = IC^ext(F, μ, 0)?

• Example: for AND this value is log₂3 ≈ 1.585 > 1.492.
Thank You!
```
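The entropy formula from the "Shannon's entropy" slide can be checked numerically. A minimal sketch; the helper name `entropy` is mine, not from the talk:

```python
import math

def entropy(dist):
    """Shannon entropy H(X) = sum_x Pr[X=x] * log2(1/Pr[X=x]), in bits.

    `dist` maps outcomes to probabilities; zero-probability outcomes
    contribute nothing to the sum.
    """
    return sum(p * math.log2(1.0 / p) for p in dist.values() if p > 0)

# A fair coin carries exactly 1 bit; a biased coin carries less.
print(entropy({"heads": 0.5, "tails": 0.5}))  # 1.0
print(entropy({"heads": 0.9, "tails": 0.1}))  # ~0.469
```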
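The {1,2,3,4,5} example can be verified from the definitions: H(X) = log₂5 ≈ 2.32 bits per symbol, but once Bob holds a correlated Yᵢ ≠ Xᵢ, the conditional entropy H(X|Y) = H(XY) − H(Y) drops to log₂4 = 2. A sketch with helper names of my choosing:

```python
import math

def H(dist):
    """Shannon entropy of a distribution given as {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# Joint distribution of one coordinate (X, Y): X uniform on {1..5},
# Y uniform on the four values != X, so Pr[X=x, Y=y] = (1/5)*(1/4).
joint = {(x, y): 1 / 20 for x in range(1, 6) for y in range(1, 6) if x != y}

H_XY = H(joint)                          # H(XY) = log2(20)
H_Y = H({y: 1 / 5 for y in range(1, 6)}) # Y is uniform on 5 values
print(H_XY - H_Y)    # H(X|Y) = H(XY) - H(Y) = log2(4) = 2 bits
print(math.log2(5))  # without Y: ~2.32 bits per symbol
```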
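A protocol in the Yao model is a sequence of message functions m1(X,R), m2(Y,m1,R), …. As a toy illustration (my own sketch, not a protocol from the talk), here is a one-round randomized protocol for F(X,Y) = "X=Y?" using shared randomness:

```python
import random

def equality_protocol(x, y, rounds=40, seed=2013):
    """m1(X,R): parities of X over `rounds` random subsets of positions
    (1 bit each); Bob compares each with the matching parity of Y."""
    assert len(x) == len(y)
    shared_r = random.Random(seed)  # stands in for the shared randomness R
    for _ in range(rounds):
        subset = [i for i in range(len(x)) if shared_r.random() < 0.5]
        if sum(x[i] for i in subset) % 2 != sum(y[i] for i in subset) % 2:
            return 0                # a parity differs: definitely X != Y
    return 1                        # X = Y, except with prob <= 2^-rounds

print(equality_protocol([1, 0, 1, 1], [1, 0, 1, 1]))  # 1
print(equality_protocol([1, 0, 1, 1], [1, 0, 0, 1]))  # 0 (w.h.p.)
```

The communication cost here is one bit per round, far below sending X itself when X is long.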
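The two facts on the "Mutual information" slide (I(X;Y) = 0 for independent variables, and I(X;X) = H(X)) can be checked directly via the identity I(X;Y) = H(X) + H(Y) − H(XY). The helper names are mine:

```python
import math

def H(dist):
    """Shannon entropy of a distribution given as {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) - H(X|Y) = H(X) + H(Y) - H(XY)."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p   # marginal of X
        py[y] = py.get(y, 0) + p   # marginal of Y
    return H(px) + H(py) - H(joint)

# Independent fair bits: I(X;Y) = 0.
indep = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
print(mutual_information(indep))  # 0.0
# Fully correlated bits: I(X;X) = H(X) = 1.
same = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(same))   # 1.0
```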
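The 65.5-bit figure in the MD5 example decomposes roughly as follows. This is my own back-of-the-envelope split, treating the 128 hash bits as fresh random bits about X whenever X ≠ Y:

```python
# I(Pi; Y | X): Alice only learns the 1-bit answer "X=Y?"       -> ~1 bit
# I(Pi; X | Y): w.p. 1/2, (X,Y) are independent and the 128-bit
#   hash is ~128 fresh bits about X; w.p. 1/2, X = Y and Bob
#   learns only ~1 bit of confirmation                          -> ~64.5 bits
alice_learns = 1.0
bob_learns = 0.5 * 128 + 0.5 * 1
print(alice_learns + bob_learns)  # 65.5 (vs. 129 bits actually transmitted)
```

The gap between 65.5 bits revealed and 129 bits sent is exactly the kind of slack that interactive compression (the open problem at the end of the deck) asks about.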
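The optimal AND protocol can be simulated directly. The sampling rule (A=1 if X=1, else A uniform in [0,1], and symmetrically for B) is as stated on the slide; the rising-counter reveal is my reading of the slide's picture:

```python
import random

def and_protocol(x, y, rng):
    """One run of the continuous AND protocol.

    Each party forms a private value in [0,1]: 1 if their input bit is 1,
    uniform otherwise.  A counter rises from 0; the first party whose value
    the counter reaches speaks up.  The counter reaches 1 iff X = Y = 1.
    """
    a = 1.0 if x == 1 else rng.random()  # rng.random() is uniform on [0,1)
    b = 1.0 if y == 1 else rng.random()
    return 1 if min(a, b) == 1.0 else 0  # counter hit 1: both bits were 1

rng = random.Random(0)
for x in (0, 1):
    for y in (0, 1):
        assert all(and_protocol(x, y, rng) == (x & y) for _ in range(10_000))
print("always correct")  # matches the slide: the protocol is always correct
```

Correctness is exact because `random()` never returns 1.0, so `min(a, b) == 1.0` holds precisely when both inputs are 1; what the discretization into r intervals changes is the information cost, not the answer.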
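The posterior tables on the "analytical view" slides are just Bayes updates of the prior. This sketch (helper names mine) reproduces the numbers for the noisy message, where Alice sends her true bit w.p. ½ and a uniform bit w.p. ½:

```python
from fractions import Fraction as F

# Prior mu on (X, Y) from the slide: rows X=0/X=1, columns Y=0/Y=1.
mu = {(0, 0): F(4, 10), (0, 1): F(2, 10), (1, 0): F(3, 10), (1, 1): F(1, 10)}

def posterior(mu, m):
    """Bayes update after message m: Pr[m | X=x] = 3/4 if m == x else 1/4."""
    like = {xy: p * (F(3, 4) if xy[0] == m else F(1, 4))
            for xy, p in mu.items()}
    total = sum(like.values())           # Pr[message = m]
    return total, {xy: p / total for xy, p in like.items()}

p0, mu0 = posterior(mu, 0)
p1, mu1 = posterior(mu, 1)
print(p0, p1)       # 11/20 9/20 -- the 0.55 / 0.45 split on the slide
print(mu0[(0, 0)])  # 6/11 ~ 0.545
print(mu1[(1, 0)])  # 1/2
```

Running the same update with Pr[m | X=x] = 1 if m == x recovers the first slide's tables (the 0.6 / 0.4 split with posteriors 2/3, 1/3, 0.75, 0.25).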
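The one-shot bound H(X) ≤ C(X) < H(X) + 1 from the open-problems slide can be observed with a minimal Huffman construction (the function name is mine):

```python
import heapq
import math

def huffman_expected_length(dist):
    """Build a Huffman code for `dist` and return its expected codeword length.

    The heap holds (probability, tiebreak, {symbol: depth}); merging two
    nodes pushes every symbol under them one level deeper.
    """
    heap = [(p, i, {sym: 0}) for i, (sym, p) in enumerate(dist.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1
    _, _, lengths = heap[0]
    return sum(dist[s] * lengths[s] for s in dist)

dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
H = sum(p * math.log2(1 / p) for p in dist.values())
L = huffman_expected_length(dist)
print(H, L)          # dyadic probabilities: L = H = 1.75 exactly
assert H <= L < H + 1
```

The interactive-compression question asks whether an analogous one-shot statement, D(F, μ, 0⁺) ≲ IC(F, μ, 0), holds for protocols.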