Distributed Nuclear Norm Minimization for Matrix Completion
Morteza Mardani, Gonzalo Mateos and Georgios Giannakis
ECE Department, University of Minnesota
Acknowledgments: MURI grant (AFOSR FA9550-10-1-0567)
Cesme, Turkey
June 19, 2012
Learning from “Big Data”
"Data are widely available, what is scarce is the ability to extract wisdom from them."
Big data: ubiquitous, fast, productive, smart, messy, revealing
K. Cukier, "Harnessing the data deluge," Nov. 2011.
Context
- Preference modeling
- Imputation of network data
- Smart metering
- Network cartography
Goal: Given few incomplete rows per agent, impute missing entries in a distributed fashion by leveraging the low rank of the data matrix.
Low-rank matrix completion
- Consider matrix X ∈ R^{L×T} and the set Ω of indices of observed entries
- Sampling operator P_Ω: [P_Ω(X)]_{ij} = X_{ij} if (i,j) ∈ Ω, and 0 otherwise
- Given incomplete (noisy) data Y = P_Ω(X + V), where X has low rank
- Goal: denoise observed entries, impute missing ones
- Nuclear-norm minimization [Fazel'02], [Candes-Recht'09] (a solver sketch follows below)

    Noisy:       min_X  (1/2)‖P_Ω(Y − X)‖_F² + λ‖X‖_*
    Noise-free:  min_X  ‖X‖_*   s.t.  P_Ω(X) = P_Ω(Y)
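For concreteness, here is a minimal NumPy sketch of the noisy estimator solved by proximal gradient, whose prox step is singular value thresholding. The function names, unit step size, and iteration count are illustrative assumptions, not the authors' solver.

```python
import numpy as np

def svt(Z, tau):
    """Singular value thresholding: the prox of tau * nuclear norm at Z."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete_noisy(Y, mask, lam, iters=300):
    """Proximal gradient for min_X 0.5*||P_Omega(Y - X)||_F^2 + lam*||X||_*.
    The data-fit gradient mask*(X - Y) is 1-Lipschitz, so unit step is safe."""
    X = np.zeros_like(Y)
    for _ in range(iters):
        X = svt(X - mask * (X - Y), lam)
    return X

# usage: X_hat = complete_noisy(Y, mask, lam=1.0)   # mask: boolean array for Omega
```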
Problem statement
- Network: undirected, connected graph G = (N, E) over N agents
- Data matrix partitioned into row blocks, X = [X_1', ..., X_N']', with block X_n held by node n
- Goal: Given per-node data Y_n = P_{Ω_n}(X_n + V_n) and single-hop exchanges, find the solution of (P1) (a per-node layout sketch follows this slide)

(P1)   min_X  (1/2) Σ_n ‖P_{Ω_n}(Y_n − X_n)‖_F² + λ‖X‖_*

- Challenges
  - Nuclear norm is not separable across nodes
  - Global optimization variable X
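To make the separable data fit in (P1) concrete, a small NumPy sketch that splits the rows of Y across N agents and evaluates the objective; the sizes, rank, noise level, and sampling rate are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, T, r = 4, 20, 30, 2                       # agents, rows, columns, rank
X_true = rng.standard_normal((L, r)) @ rng.standard_normal((r, T))
Y = X_true + 0.01 * rng.standard_normal((L, T))
mask = rng.random((L, T)) < 0.3                 # Omega: ~30% of entries observed
rows = np.array_split(np.arange(L), N)          # node n holds row block X_n

def p1_objective(X, lam):
    """(P1): data fit separable over nodes, plus the coupling nuclear norm."""
    fit = sum(0.5 * np.sum((mask[b] * (Y[b] - X[b]))**2) for b in rows)
    return fit + lam * np.linalg.norm(X, 'nuc')
```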
Separable regularization
- Key result [Recht et al'11]: for L ∈ R^{L×ρ}, Q ∈ R^{T×ρ} with ρ ≥ rank[X] (numerical check below),

      ‖X‖_* = min_{L,Q}  (1/2)(‖L‖_F² + ‖Q‖_F²)   s.t.  X = LQ'

- New formulation equivalent to (P1)

(P2)   min_{L,Q}  (1/2) Σ_n ‖P_{Ω_n}(Y_n − L_n Q')‖_F² + (λ/2)(‖L‖_F² + ‖Q‖_F²)

- Nonconvex; reduces complexity: LQ' has (L + T)ρ variables instead of LT

Proposition 1. If {L̄, Q̄} is a stationary point of (P2) and ‖P_Ω(Y − L̄Q̄')‖₂ ≤ λ, then X̂ := L̄Q̄' is a global optimum of (P1).
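The key result can be sanity-checked numerically: the balanced factors built from the SVD attain the bound. A small NumPy sketch (sizes arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 10))   # rank 3

U, s, Vt = np.linalg.svd(X, full_matrices=False)
L_f = U * np.sqrt(s)             # balanced factors: L = U S^{1/2}, Q = V S^{1/2}
Q_f = Vt.T * np.sqrt(s)

nuc = s.sum()                                       # ||X||_*
bound = 0.5 * (np.sum(L_f**2) + np.sum(Q_f**2))     # (1/2)(||L||_F^2 + ||Q||_F^2)
print(np.allclose(X, L_f @ Q_f.T), np.isclose(nuc, bound))   # True True
```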
Distributed estimator
- Local copies Q_n per node, enforcing consensus with neighboring nodes:

(P3)   min_{{L_n},{Q_n}}  Σ_n [ (1/2)‖P_{Ω_n}(Y_n − L_n Q_n')‖_F² + (λ/2)‖L_n‖_F² + (λ/2N)‖Q_n‖_F² ]
       s.t.  Q_n = Q_m,  m ∈ N_n

- Network connectivity ⇒ (P2) ⇔ (P3)
- Alternating-direction method of multipliers (ADMM) solver
  - Method [Glowinski-Marrocco'75], [Gabay-Mercier'76]
  - Learning over networks [Schizas et al'07]
- Primal variables per agent n: {L_n, Q_n}
- Message passing: each node exchanges its local copy Q_n with its single-hop neighbors
Distributed iterations
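A minimal NumPy sketch in the spirit of (P3): each node solves exact ridge least squares for the rows of L_n (unconstrained QPs, no SVD), takes a gradient step on its local copy Q_n, and then averages Q_n with single-hop neighbors. The node data layout, step size, and the plain averaging (used here as a simplified stand-in for the exact ADMM consensus recursions) are illustrative assumptions.

```python
import numpy as np

def local_round(node, lam, N, step=0.05):
    """Per-node step: exact ridge LS per row of L_n, then a gradient step on Q_n."""
    Y, M, Q = node['Y'], node['M'], node['Q']   # Y: data, M: 0/1 mask, Q: T x rho
    rho = Q.shape[1]
    for i in range(Y.shape[0]):                 # each row of L_n: small QP, no SVD
        w = M[i].astype(float)
        A = (Q * w[:, None]).T @ Q + lam * np.eye(rho)
        node['L'][i] = np.linalg.solve(A, Q.T @ (w * Y[i]))
    G = (M * (node['L'] @ Q.T - Y)).T @ node['L'] + (lam / N) * Q
    node['Q'] = Q - step * G

def consensus_round(nodes, nbrs):
    """One-hop message passing: average each Q_n with the neighbors' copies."""
    avg = {n: np.mean([nodes[m]['Q'] for m in [n] + nbrs[n]], axis=0)
           for n in nodes}
    for n in nodes:
        nodes[n]['Q'] = avg[n]

# usage: alternate local_round(...) at every node with consensus_round(nodes, nbrs),
# where nbrs maps each node id to the list of its single-hop neighbors.
```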
Attractive features
- Highly parallelizable with simple recursions
- Unconstrained QPs per agent
- No SVD per iteration
- Low overhead for message exchanges: Q_n is T × ρ, and ρ is small
- Communication cost independent of network size

Recap:
  (P1)  Centralized  Convex
  (P2)  Sep. regul.  Nonconvex
  (P3)  Consensus    Nonconvex
  Stationary (P3) → Stationary (P2) → Global (P1)
Optimality
Proposition 2. If the ADMM iterates converge to {L̄_n, Q̄_n} and ‖P_Ω(Y − L̄Q̄')‖₂ ≤ λ, then:
  i)  Q̄_n = Q̄ for all n (consensus is achieved), and
  ii) X̂ := [L̄_1', ..., L̄_N']' Q̄' is the global optimum of (P1).

- ADMM can converge even for nonconvex problems [Boyd et al'11]
- Simple distributed algorithm for optimal matrix imputation (a certificate check is sketched below)
- Centralized performance guarantees, e.g., [Candes-Recht'09], carry over
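A hedged sketch of verifying such a certificate at a limit point, assuming the condition mirrors Proposition 1's spectral-norm bound on the masked residual:

```python
import numpy as np

def certifies_p1_optimum(Y, mask, L_bar, Q_bar, lam):
    """True if the masked residual's spectral norm is within lam, i.e. the
    limit point L_bar @ Q_bar.T can be certified as a global optimum of (P1).
    (Condition assumed to mirror Proposition 1.)"""
    R = mask * (Y - L_bar @ Q_bar.T)
    return np.linalg.norm(R, 2) <= lam   # ord=2 gives the largest singular value
```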
Synthetic data
- Random network topology
- N = 20, L = 66, T = 66 (setup sketched after this slide)
[Figure: simulation results on synthetic data]
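The synthetic setup can be mimicked with the slide's dimensions (N = 20, L = T = 66); the rank, noise level, sampling rate, and the relative-error metric ‖X̂ − X‖_F / ‖X‖_F below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N, L, T, r = 20, 66, 66, 3                      # slide's N, L, T; rank assumed
X_true = rng.standard_normal((L, r)) @ rng.standard_normal((r, T))
Y = X_true + 0.05 * rng.standard_normal((L, T))
mask = rng.random((L, T)) < 0.4                 # sampling rate assumed

def relative_error(X_hat):
    return np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true)

# e.g. relative_error(complete_noisy(Y, mask, lam=1.0)) with the earlier sketch
```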
Real data
- Network distance prediction [Liao et al'12]
- Abilene network data (Aug. 18-22, 2011)
- End-to-end latency matrix
- N = 9, L = T = N
- 80% missing data
Relative error: 10%
Data: http://internet2.edu/observatory/archive/data-collections.html