### 社交图挖掘

```Chapter 10
Mining Social-Network Graphs
 主讲人：李玉国
There is much information to be gained by analyzing the large-scale data
that is derived from social networks. In this chapter, we shall study techniques
for discovering communities , discovering similarities among nodes , counting
triangles , measuring the neighborhood size of nodes , and show the effect of
those properties above.
 Part 1: Introduction of Socail Network Section 10.1
 Part 2: Communities of social graphs Section 10.2,10.3,10.4,10.5
 Part 3: Similarities among nodes
Section 10.6
 Part 4: Triangles number of a social graph Section 10.7
 Part 5: Neighborhood size of nodes in a social graph Section 10.8
10.1 Social Networks as Graphs

Collaboration network等。
part2 Discovering communities
10.2 Clustering of social network graphs

1.层次聚类

while（为达到结束条件）

end
Section 10.2
2.点分配聚类

while（没有剩余的点）

end

communities {A, B, C}
communities {D, E, F, G}

Section 10.2

communities。

Section 10.2

BFS

1.每个点的bt初始为1
2.非leaf=1+所有以它

3. 边 e=e 的 下 端 点 的

+Pk）
10.3 Direct discovery of communities(Frequent Itemsets)

Ks,t(其寻找过程是一个频繁项集挖掘过程，一个节点数是t的项集，至少在s个购物篮

a = {1, 4}，b = {2, 3}, c = {1} and d = {3}
If s = 2 and t = 1,
K2，1= { 1 a ，1
or{ 3
b ，3
c}
d}
10.4 Partitioning of Graphs（matrix theory）

S
T
S = {H} T = {A, B, C, D, E, F, G}
Cut (S, T ) = 1
Vol (S) = 1
Vol (T ) = 11
1+1/11=1.09
S = {A,B,C,H} T = {D, E, F, G}
Cut (S, T ) = 2
Vol (S) = 6
Vol (T ) = 7
2/6+2/17=0.62
Section 10.4

Laplacian matrix： L = D − A（D是图的邻接矩阵，A是图的等级矩阵）
-
=
L
D
A
Section 10.4

10.5 Finding overlapping communities

An example
Consider the tiny social graph . Suppose there are two communities C and D,
with associated probabilities pC and pD. Also, suppose that we have determined
(or are using as a temporary hypothesis) that C = {w, x, y} and D = {w, y, z}.
Pd=0.5
10.6 Random Walks with Restart

Section10.6
Transition matrix for the graph above：
Iterate
Distribution of the walker that we get
10.7 Counting Triangles

1.The first algorithm for finding triangles

条边，我们称这个点为hitter。可以看出，本图中最多只有2  个hitter。

（1）记录每一个点的degree，O(m)；
（2）记录每一条边，以边的两个端点为key，O(m)；边的两个端点是有序的）
（3）记录与一个点相来连接的所有点。

Section10.7

for（each e）{//m条边
if(e.1不是hitter){
for（  ）{//e.1的所有邻接点，最多有 个
if（邻接点和e.2之间有边）{
count++；
}
}
}
}

Section10.7
2.通过MapReduce实现

z
y
x
10.8 Neighborhood properties of Graphs

Neighborhood：给定一个点v和长度d，从v出发经过小于等于d的长度到达的所有点

（u，v）属于传递闭包。

Section 10.8
1. 求传递闭包
1.1 Transitive closure via mapreduce
log2d rounds
1.2 Transitive closure by graph reduction

2014.09.11
```