Slides - Computer Science

Report
 Definition
 Data Structures
 Applications
 Problems
 Graph Pattern Matching
 Partitioning
 Distribution
 AKKA
Definition:
 A "graph" is a collection of
• "vertices" or "nodes", and
• "edges" that connect pairs of vertices.
 A graph may be undirected, meaning that there is no distinction between the
two vertices associated with each edge, or its edges may be directed from one
vertex to another.
[Figure: two example graphs on vertices 1, 2, 3, one undirected and one directed.]
The link structure of a website could be represented by a directed graph. The
vertices are the web pages available at the website and a directed edge from page A
to page B exists if and only if A contains a link to B.
Mathematical PageRanks for a simple network, expressed as percentages. (Google
uses a logarithmic scale.) Page C has a higher PageRank than Page E, even though
there are fewer links to C; the one link to C comes from an important page and
hence is of high value.
Graph theory is also used to study molecules in chemistry and physics. In
condensed matter physics, the three-dimensional structure of complicated
simulated atomic structures can be studied with graph-theoretic methods.
Other application areas include image processing, crime detection, and
ontologies.
The data structure used depends on both the graph structure and the algorithm used
for manipulating the graph. Theoretically one can distinguish between list and
matrix structures but in concrete applications the best structure is often a combination
of both.
 List structures are often preferred for sparse graphs as they have smaller memory
requirements.
 Matrix structures on the other hand provide faster access for some applications but
can consume huge amounts of memory.
List

Incidence list
The edges are represented by an array of pairs of vertices (ordered pairs
if the graph is directed), possibly together with weights and other data.
Vertices connected by an edge are said to be adjacent.
((a,b),(c,d),…)

Adjacency list
Much like the incidence list, each vertex has a list of which vertices it is
adjacent to. This causes redundancy in an undirected graph: for example,
if vertices A and B are adjacent, A's adjacency list contains B, while B's list
contains A. Adjacency queries are faster, at the cost of extra storage space.
a: adjacent to b, c
b: adjacent to a, c
c: adjacent to a, b
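The triangle above can be sketched directly as a map from each vertex to its neighbour set; a minimal Scala sketch (vertex names as in the table):

```scala
// Adjacency-list representation of the undirected triangle a-b-c above.
// The symmetry (b in a's set and a in b's set) is the redundancy noted above.
val adj: Map[Char, Set[Char]] = Map(
  'a' -> Set('b', 'c'),
  'b' -> Set('a', 'c'),
  'c' -> Set('a', 'b')
)

// An adjacency query is a set lookup:
def adjacent(u: Char, v: Char): Boolean = adj.getOrElse(u, Set.empty).contains(v)
```

The set-valued map makes adjacency queries fast, at the cost of the extra storage mentioned above.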
Matrix

Incidence matrix
The graph is represented by a matrix of size |V | (number of vertices) by |E| (number
of edges) where the entry [vertex, edge] contains the edge's endpoint data (simplest
case: 1 - incident, 0 - not incident).
[Example: an incidence matrix with rows for vertices 1-4 and columns for edges e1-e4.]

Adjacency matrix
This is an n by n matrix A, where n is the number of vertices in the graph. If there is
an edge from a vertex x to a vertex y, then the element A[x][y] is 1 (or, in general,
the number of xy edges); otherwise it is 0. In computing, this matrix makes it easy to
find subgraphs, and to reverse a directed graph.
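A small sketch of an adjacency matrix, including the transpose trick for reversing a directed graph mentioned above (the three example edges are invented for illustration):

```scala
// Adjacency matrix of a directed graph on vertices 0..2
// with hypothetical edges 0->1, 1->2 and 0->2.
val a: Array[Array[Int]] = Array(
  Array(0, 1, 1),
  Array(0, 0, 1),
  Array(0, 0, 0)
)

// Reversing a directed graph is just transposing the matrix:
def transpose(m: Array[Array[Int]]): Array[Array[Int]] =
  m.indices.map(i => m.indices.map(j => m(j)(i)).toArray).toArray

val reversed = transpose(a) // now 1->0, 2->1 and 2->0
```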
Matrix

Distance matrix
A symmetric n by n matrix D, where n is the number of vertices in the graph.
The element D[x][y] is the length of a shortest path between x and y; if there
is no such path, the entry is infinity. It can be derived from powers of A.
      a    b    c    d    e    f
a     0  184  222  177  216  231
b   184    0   45  123  128  200
c   222   45    0  129  121  203
d   177  123  129    0   46   83
e   216  128  121   46    0   83
f   231  200  203   83   83    0
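A distance matrix like the one above can be computed from a weighted adjacency matrix with the Floyd-Warshall algorithm; a sketch on a hypothetical 4-vertex graph (the weights are invented, not taken from the table above):

```scala
val INF = Int.MaxValue / 2 // "infinity" that survives one addition without overflow

// Hypothetical weighted adjacency matrix: 0 on the diagonal, INF where no edge exists.
val w = Array(
  Array(0,   3,   INF, 7),
  Array(3,   0,   2,   INF),
  Array(INF, 2,   0,   1),
  Array(7,   INF, 1,   0)
)

val n = w.length
val d = w.map(_.clone) // distance matrix, refined in place

// Floyd-Warshall: allow each vertex k in turn as an intermediate stop.
for (k <- 0 until n; i <- 0 until n; j <- 0 until n)
  if (d(i)(k) + d(k)(j) < d(i)(j)) d(i)(j) = d(i)(k) + d(k)(j)

// d(0)(3) becomes 6 (path 0-1-2-3), shorter than the direct edge of weight 7.
```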
Matrix

Laplacian matrix or "Kirchhoff matrix" or "Admittance matrix"
This is defined as D − A, where D is the diagonal degree matrix. It explicitly
contains both adjacency information and degree information. (However,
there are other, similar matrices that are also called "Laplacian matrices" of
a graph.)
L[i][j] = deg(v_i)  if i = j
L[i][j] = −1        if i ≠ j and v_i is adjacent to v_j
L[i][j] = 0         otherwise
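A sketch of building L = D − A for a small hypothetical graph, the path 0-1-2; every row of the result sums to zero, since each −1 entry is balanced by the degree on the diagonal:

```scala
// Adjacency matrix of the undirected path graph 0-1-2 (a hypothetical example).
val aM = Array(
  Array(0, 1, 0),
  Array(1, 0, 1),
  Array(0, 1, 0)
)

// Row sums of A give the degrees: 1, 2, 1.
val deg = aM.map(_.sum)

// Laplacian L = D - A, with D the diagonal degree matrix.
val lap = Array.tabulate(3, 3)((i, j) => (if (i == j) deg(i) else 0) - aM(i)(j))
```

Checking that every row of `lap` sums to zero is a quick sanity test of the construction.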
1) Enumeration
2) Subgraphs, induced subgraphs, and minors
3) Graph coloring
4) Route problems
5) Network flow
6) Visibility graph problems
7) Covering problems
8) Graph classes
Enumeration describes a class of combinatorial enumeration problems in
which one must count undirected or directed graphs of certain types,
typically as a function of the number of vertices of the graph.
Application:
 Enumeration of molecules has been studied for over a century and
continues to be an active area of research.
 The typical approach to enumerating chemical structures has been based on
constructive assembly.
[Figure: all free trees on 2, 3, and 4 labeled vertices: 1 tree with 2 vertices,
3 trees with 3 vertices, and 16 trees with 4 vertices.]
2.1. Subgraphs:
A common problem, called the subgraph isomorphism problem, is finding a fixed
graph as a subgraph in a given graph. The subgraph isomorphism problem is a
computational task in which two graphs G and Q are given as input, and one must
determine whether G contains a subgraph that is isomorphic to Q. Subgraph
isomorphism is a generalization of both the maximum clique problem and the problem
of testing whether a graph contains a Hamiltonian cycle, and is therefore NP-complete.
 Clique problem: Finding the largest complete subgraph is called the clique problem.
The term "clique" and the problem of algorithmically listing cliques both come from
the social sciences, where complete subgraphs are used to model social cliques,
groups of people who all know each other.
In computer science, the clique problem refers to any
of the problems related to finding particular complete
subgraphs ("cliques") in a graph, i.e.,
sets of elements where each pair of elements is connected.
2.2 Induced subgraphs:
Some important graph properties are hereditary with respect to induced
subgraphs, which means that a graph has a property if and only if all its induced
subgraphs also have it. Finding maximal induced subgraphs of a certain kind
is also often NP-complete.
 Finding the largest edgeless induced subgraph, or independent set, called
the independent set problem
 An independent set or stable set is a set of vertices in a graph, no two of
which are adjacent. That is, it is a set I of vertices such that for every two
vertices in I, there is no edge connecting the two. The size of an
independent set is the number of vertices it contains.
The graph of the cube has
six different maximal independent sets,
shown as the red vertices.
2.3. Minors:
The minor containment problem is to find a fixed graph as a minor of a
given graph. A minor or subcontraction of a graph is any graph obtained by taking a
subgraph and contracting some (or no) edges. Many graph properties are hereditary for
minors, which means that a graph has a property if and only if all its minors have it too.

A graph is planar if it contains as a minor neither the
complete bipartite graph K3,3 (the three-cottage problem) nor the
complete graph K5. A planar graph can be drawn in such a way that
no edges cross each other. Such a drawing is called a plane
graph or a planar embedding of the graph.
 Three-cottage problem: water, gas, and electricity, the (three) utilities problem: Suppose
there are three cottages on a plane and each needs to be connected to the gas, water, and
electric companies. Using a third dimension or sending any of the connections through
another company or cottage is disallowed. Is there a way to make all nine connections
without any of the lines crossing each other?
The utility graph K3,3
K3,3 drawn with only one crossing.
Many problems have to do with various ways of coloring graphs, for example:
1) The four-color theorem: In mathematics, the four color theorem, or the four color map theorem states
that, given any separation of a plane into contiguous regions, producing a figure called a map, no more
than four colors are required to color the regions of the map so that no two adjacent regions have the
same color.
2) The strong perfect graph theorem: In graph theory, a perfect graph is a graph in which the chromatic
number of every induced subgraph equals the size of the largest clique of that subgraph. Perfect graphs
are the same as the Berge graphs, graphs that have no odd-length induced cycle or induced complement
of an odd cycle.
The chromatic polynomial counts the number of ways a graph can be colored using no more than a
given number of colors.
The Paley graph of order 9, colored with
three colors and showing a clique of three
vertices. In this graph and each of its
induced subgraphs the chromatic number
equals the clique number,
so it is a perfect graph.
3) The total coloring conjecture (unsolved): In graph theory, total coloring is a type of
coloring on the vertices and edges of a graph. When used without any qualification, a
total coloring is always assumed to be proper in the sense that no adjacent vertices,
no adjacent edges, and no edge and its endvertices are assigned the same color. The
total chromatic number χ″(G) of a graph G is the least number of colors needed in
any total coloring of G.
4) The Erdős–Faber–Lovász conjecture (unsolved)
5) The list coloring conjecture (unsolved)
6) The Hadwiger conjecture (graph theory) (unsolved)
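The results above concern optimal colorings, but the mechanics of coloring are easy to see in a greedy heuristic: scan the vertices in some order and give each the smallest color not used by an already-colored neighbour. This uses at most Δ+1 colors (Δ = maximum degree), not necessarily the chromatic number. A sketch on an invented example, the 4-cycle:

```scala
// Greedy colouring sketch: each vertex gets the smallest colour
// not already used by one of its coloured neighbours.
def greedyColor(adj: Map[Int, Set[Int]]): Map[Int, Int] =
  adj.keys.toList.sorted.foldLeft(Map.empty[Int, Int]) { (col, v) =>
    val used = adj(v).flatMap(col.get)              // colours of coloured neighbours
    col + (v -> Iterator.from(0).filterNot(used).next()) // smallest unused colour
  }

// The 4-cycle 0-1-2-3-0 is bipartite, so two colours suffice here.
val c4 = Map(0 -> Set(1, 3), 1 -> Set(0, 2), 2 -> Set(1, 3), 3 -> Set(0, 2))
val colours = greedyColor(c4)
```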
4.1. Hamiltonian path and cycle problems:
Hamiltonian path problem and the Hamiltonian cycle problem are problems of determining whether
a Hamiltonian path or a Hamiltonian cycle exists in a given graph
 Hamiltonian path: a Hamiltonian path is a path in an undirected graph that visits each vertex
exactly once. A Hamiltonian cycle (or Hamiltonian circuit) is a Hamiltonian path that is a cycle
4.2. Minimum spanning tree: In an undirected graph, a spanning tree of that graph is a
subgraph that is a tree and connects all the vertices together; a minimum spanning tree is a
spanning tree whose total edge weight is as small as possible.
4.3. Route inspection problem : route inspection problem is to find a shortest closed path or circuit
that visits every edge of a (connected) undirected graph
4.4. Seven Bridges of Königsberg: The problem was to find a walk through the
city that would cross each bridge once and only once
4.5. Shortest path problem: the shortest path problem is the problem of finding a path between
two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is
minimized.
4.6. Steiner tree: problem in combinatorial optimization, which may be formulated in a number of
settings, with the common part being that it is required to find the shortest interconnect for a given
set of objects
4.7. Three-cottage problem
4.8. Traveling salesman problem : Given a list of cities and their pairwise distances, the task is to
find the shortest possible route that visits each city exactly once and returns to the origin city
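Of the route problems above, the shortest path problem (4.5) has a classic efficient algorithm; a minimal Dijkstra sketch on a small hypothetical weighted digraph (the graph `g` and its weights are invented for illustration):

```scala
import scala.collection.mutable

// Dijkstra's algorithm: g maps each vertex to its (target, weight) edge list.
def dijkstra(g: Map[Int, List[(Int, Int)]], src: Int): Map[Int, Int] = {
  val dist = mutable.Map(src -> 0)
  // Min-heap on the tentative distance (first tuple component).
  val pq = mutable.PriorityQueue((0, src))(Ordering.by[(Int, Int), Int](_._1).reverse)
  while (pq.nonEmpty) {
    val (d, u) = pq.dequeue()
    if (dist(u) == d) // skip stale queue entries
      for ((v, w) <- g.getOrElse(u, Nil) if d + w < dist.getOrElse(v, Int.MaxValue)) {
        dist(v) = d + w
        pq.enqueue((d + w, v))
      }
  }
  dist.toMap
}

// Hypothetical digraph: the indirect route 0->2->1 (cost 3) beats the direct
// edge 0->1 (cost 4).
val g = Map(0 -> List((1, 4), (2, 1)), 2 -> List((1, 2)), 1 -> List((3, 5)))
val shortest = dijkstra(g, 0)
```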

Graph pattern matching is often defined in terms of subgraph isomorphism, an NP-complete
problem. To lower its complexity, various extensions of graph simulation
have been considered instead. Given a pattern graph Q and a data graph G, the goal is to
find all subgraphs of G that match Q.
[Figure: feature matching on input images: detected features, one-shot matching
(26 true matches) versus progressive matching (159 true matches).]
1. Isomorphism:
In graph theory, an isomorphism of graphs G and Q is a bijection between the vertex
sets of G and Q,
f : V(G) → V(Q),
such that any two vertices u and v of G are adjacent in G if and only if f(u) and f(v) are
adjacent in Q.
 A bijection (or bijective function or one-to-one correspondence) is a function giving
an exact pairing of the elements of two sets.
[Figure: Graph G and Graph Q with an isomorphism f:
f(a) = 1, f(b) = 6, f(c) = 8, f(d) = 3, f(g) = 5, f(h) = 2, f(i) = 4, f(j) = 7.]
As observed, isomorphism is often too restrictive to catch sensible matches, as it
requires matches to have exactly the same topology as the pattern graph. This hinders
its applicability in emerging applications such as social networks and crime detection.
 Simple simulation:
denoted by Q ≺ G, with a binary match relation S ⊆ VQ × V, where VQ and V are the
sets of nodes in Q and G respectively, such that
1. for each (u, v) ∈ S, u and v have the same label;
2. for each node u in Q, there exists v in G such that
   a) (u, v) ∈ S,
   b) for each edge (u, u′) in Q, there exists an edge (v, v′) in G such that
      (u′, v′) ∈ S (same children).
[Figure: pattern Q over labels TE, ST, Book and data graph G with nodes 1-5;
the simple-simulation match relation is TE → {1}, ST → {2, 3}, Book → {4, 5}.]
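The match relation above can be computed by fixpoint refinement: start `sim(u)` with all like-labelled data nodes, then repeatedly drop w from `sim(u)` if some pattern edge (u, u′) has no witness edge (w, w′) with w′ in `sim(u′)`. A self-contained sketch, using hypothetical label and successor maps loosely modeled on the TE/ST/Book example:

```scala
// Pattern Q: 1(TE) -> 2(ST) -> 3(Book). Data G: 10(TE) -> 20(ST) -> 30(Book),
// plus an extra 40(ST) with no Book child. All four maps are hypothetical.
val qLabel = Map(1 -> "TE", 2 -> "ST", 3 -> "Book")
val qPost  = Map(1 -> Set(2), 2 -> Set(3), 3 -> Set.empty[Int])
val gLabel = Map(10 -> "TE", 20 -> "ST", 30 -> "Book", 40 -> "ST")
val gPost  = Map(10 -> Set(20), 20 -> Set(30), 30 -> Set.empty[Int], 40 -> Set.empty[Int])

// Initialise sim(u) with every data node carrying u's label.
var sim: Map[Int, Set[Int]] =
  qLabel.map { case (u, l) => u -> gLabel.collect { case (w, `l`) => w }.toSet }

// Refine until no candidate violates the "same children" condition.
var changed = true
while (changed) {
  changed = false
  for (u <- qPost.keys; uc <- qPost(u); w <- sim(u))
    if ((gPost(w) & sim(uc)).isEmpty) { sim += u -> (sim(u) - w); changed = true }
}
// Node 40 (an ST with no Book child) is removed from sim(2).
```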
 Dual simulation:
denoted by Q ≺D G, if
1. Q ≺ G with a binary match relation S ⊆ VQ × V, and
2. for each pair (u, v) ∈ S and each edge (u2, u) in EQ, there exists an edge
   (v2, v) in E with (u2, v2) ∈ S (same children and same parents).
[Figure: the same pattern Q and data graph G; under dual simulation the match
relation shrinks to TE → {1}, ST → {2, 3}, Book → {5}.]

More Examples of Simple and Dual Simulation
[Figure: a pattern Q over labels A and B and a data graph G with nodes 0-9;
the simple-simulation match relation is A → {0, 8, 9}, B → {1, 2, 3, 4, 5},
contrasted with the dual-simulation match on the same graphs.]
More Examples of Simple and Dual Simulation
[Figure: a pattern Q over labels A-F and a data graph G with nodes 1-11; the
simple-simulation match relation is A → {1}, B → {2}, C → {3}, D → {4, 5},
E → {7, 8, 11}, F → {6, 9, 10}.]
 Strong simulation:
Strong simulation is defined by enforcing two conditions on simulation: duality and locality.
Balls. For a node v in a graph G and a non-negative integer r, the ball with center v and
radius r is a subgraph of G, denoted by Ĝ[v, r], such that
1. for all nodes v′ in Ĝ[v, r], the shortest distance dist(v, v′) ≤ r, and
2. it has exactly the edges that appear in G over the same node set.
Strong simulation, denoted by Q ≺DL G, holds if there exist a node v in G and a connected
subgraph Gs of G such that
1. Q ≺D Gs, with the maximum match relation S;
2. Gs is exactly the match graph w.r.t. S;
3. Gs is contained in the ball Ĝ[v, dQ], where dQ is the diameter of Q.
[Figure: a pattern Q with nodes labeled P and a data graph G; the
strong-simulation match relation maps the pattern nodes to {1, 2, 3, 4}.]
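The locality condition is built on balls; a sketch of extracting Ĝ[v, r] from an edge set, taking distances over undirected edges as in the definition above (the example path graph is invented for illustration):

```scala
// Build the ball Ĝ[v, r]: all nodes of G within shortest distance r of the
// centre, plus every edge of G between those nodes.
def ball(edges: Set[(Int, Int)], centre: Int, r: Int): (Set[Int], Set[(Int, Int)]) = {
  // Undirected neighbourhood, used only for the distance computation.
  def nbrs(v: Int): Set[Int] = edges.collect {
    case (`v`, w) => w
    case (w, `v`) => w
  }
  var inside = Set(centre)
  var frontier = Set(centre)
  for (_ <- 1 to r) { // r rounds of BFS expansion
    frontier = frontier.flatMap(nbrs) -- inside
    inside ++= frontier
  }
  (inside, edges.filter { case (x, y) => inside(x) && inside(y) })
}

// On the hypothetical directed path 1->2->3->4->5, the ball around 3 with
// radius 1 keeps nodes {2, 3, 4} and the edges among them.
val (nodes, ballEdges) = ball(Set((1, 2), (2, 3), (3, 4), (4, 5)), 3, 1)
```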

More Examples of Simple and Dual Simulation
[Figure: a pattern Q alternating labels A and B and a data graph G with nodes
0-6, comparing the simple-, dual-, and strong-simulation match relations.]
Example 1:
the Bio has to be recommended by:
a) an HR person;
b) an SE, i.e., the Bio has experience working with SEs; the SE is also
   recommended by an HR person;
c) a data mining specialist (DM), as data mining techniques are required for
   the job; there is an artificial intelligence expert (AI) who recommends the
   DM and is recommended by a DM.
[Figure: pattern Q with nodes HR, SE, Bio, DM, AI, and a data graph with nodes
HR1, HR2, SE1, SE2, Bio1-Bio4, DM1, DM2, AI1, AI2, and chains AI1, DM1, …, AIk, DMk.]
We next present optimization techniques for algorithm Match, by means of
 Query minimization
 Dual simulation filtering
 Connectivity pruning
1. Query minimization: We say that two pattern graphs Q and Q’ are equivalent,
denoted by Q ≡ Q’, if they return the same result on any data graph. A pattern
graph Q is minimum if it has the least size |Q| (the number of nodes and the
number of edges) among all equivalent pattern graphs.
[Figure: a pattern graph with nodes R, A, B1, B2, C1, C2, D1, D2 and its
minimized equivalent with nodes R, A, B1, C1, D1.]
2. Dual simulation filtering: Our second optimization technique aims to avoid
redundant checking of balls in the data graph. Most graph simulation algorithms
recursively refine the match relation by identifying and removing false matches.
So we compute the match relation of dual simulation first, and then project that
match relation onto each ball to compute strong simulation. This both reduces
the initial match set sim(v) for each node v in Q and reduces the number of
balls. Indeed, if a node v in G does not match any node in Q, then there is no
need to consider the ball centered at v.
 The removal process on a ball only needs to deal with its border nodes and their
affected nodes.
[Figure: pattern Q with nodes P and P′ and data graph G with nodes P1-P4,
illustrating which ball candidates are filtered out by dual simulation.]
3. Connectivity pruning: In a ball, only the connected component containing the
ball center v needs to be considered. Hence, those nodes not reachable from v
can be pruned early.
[Figure: pattern Q with nodes A1, B1, A2, B2 and data graph G with nodes
A1, B1, C, A2, B2; nodes unreachable from the ball center are pruned.]
// Graph is assumed to provide: vertices, label(v), post(v) (successors), pre(v) (predecessors).
import scala.collection.mutable

def hhk(g: Graph, q: Graph): mutable.HashMap[Int, mutable.Set[Int]] = {
  val sim = mutable.HashMap[Int, mutable.Set[Int]]()
  // Initialise sim(u) with every data vertex that carries u's label.
  for (u <- q.vertices)
    sim += u -> g.vertices.filter(w => g.label(w) == q.label(u)).to(mutable.Set)
  var flag = true
  while (flag) {
    flag = false
    // Drop w from sim(u) if some child v of u has no match among w's children...
    for (u <- q.vertices; w <- sim(u).toSet; v <- q.post(u) if (g.post(w) & sim(v)).isEmpty) {
      sim(u) -= w
      flag = true
    }
    // ...and likewise for parents (note g.pre(w) here, not g.post(w)).
    for (u <- q.vertices; w <- sim(u).toSet; v <- q.pre(u) if (g.pre(w) & sim(v)).isEmpty) {
      sim(u) -= w
      flag = true
    }
  } //while
  sim
}
The refinement can also be organised with remove sets:
For all v ∈ Q:
    if post(v) = ∅ then
        sim(v) := { u ∈ G | label(u) = label(v) }
    else
        sim(v) := { u ∈ G | label(u) = label(v) and post(u) ≠ ∅ }
    remove(v) := pre(G) − pre(sim(v))
While there is v ∈ Q with remove(v) ≠ ∅:
    for all u ∈ pre(v):
        for all w ∈ remove(v):
            if w ∈ sim(u) then
                sim(u) := sim(u) − {w}
                for all w′ ∈ pre(w):
                    if post(w′) ∩ sim(u) = ∅ then remove(u) := remove(u) ∪ {w′}
    remove(v) := ∅
Worked trace on the example graphs:
    sim(D) = {D1, D2}
    remove(D) = pre(G) − pre(sim(D)) = {A1, B1, C1, D1, C2, C3} − {C2, C3, A1, B1} = {C1, D1}
    Pick v = D; then u ranges over pre(D) = {C, A} and w over remove(D) = {C1, D1}:
        w = C1 ∈ sim(C) = {C1, C2, C3, A1}, so sim(C) := sim(C) − {C1}
        for w′ ∈ pre(C1) = {A1}:
            post(A1) ∩ sim(C) = {C2, C3} ≠ ∅ (false), so A1 is not added to remove(C)
    remove(D) := ∅
[Figure: pattern Q with nodes A, B, C, D and data graph G with nodes
A1, B1, C1, C2, C3, D1, D2.]
Homework:
Pattern Q is looking for papers on social networks (SN) cited by papers on
databases (DB), which in turn cite papers on graph theory (Graph). Find the
pattern graph, and all isomorphism, simple simulation, dual simulation, and
strong simulation match graphs of it with the given graph G.
[Figure: data graph G with nodes DB1-DB3, SN1-SN4, Graph1, Graph2.]


The balance constraint:
◦ Balance computational load such that each processor has the same execution
time
◦ Balance storage such that each processor has the same storage demands
Minimum edge cut:
◦ Minimize communication volume between subdomains, along the edges of the
mesh
[Figure: an example 3-way partition with edge-cut = 9; the two subdomain
boundaries shown cut 4 and 5 edges.]
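The edge-cut objective above can be computed directly from a partition assignment; a small sketch (the example graph and 2-way partition are hypothetical):

```scala
// Edge-cut: the number of edges whose endpoints land in different blocks
// of the partition assignment part(v) -> block id.
def edgeCut(edges: Set[(Int, Int)], part: Map[Int, Int]): Int =
  edges.count { case (u, v) => part(u) != part(v) }

// 2-way partition of the 4-cycle 0-1-2-3-0 into blocks {0, 1} and {2, 3}:
// the edges (1, 2) and (3, 0) cross the cut.
val cut = edgeCut(Set((0, 1), (1, 2), (2, 3), (3, 0)),
                  Map(0 -> 0, 1 -> 0, 2 -> 1, 3 -> 1))
```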
We now define the graph pattern matching problem in a distributed setting. Given pattern
graph Q, and fragmented graph F = (F1, . . ., Fk) of data graph G, in which each fragment
Fi = (G[Vi], Bi) (i ∈ [1, k]) is placed at a separate machine Si, the distributed graph pattern
matching problem is to find the maximum match in G for Q, via graph simulation.
F1 = (G[V1], {BPM1 , BSA1 }), V1 = {PM1, BA1}
BPM1 = {BA1 : 2}, BSA1 = {SD1 : 2},
F2 = (G[V2], ∅),
V2 = {SA1, ST1},
F3 = (G[V3], {BPM2}),
V3 = {PM2,BA2,UD1},
BPM2 = {SA2 : 4} and BSA2 = {SDh : 5},
F4 = (G[V4], {BSA2 }),
V4 = {SA2},
F5 = (G[V5], ∅),
V5 = {SD1, ST1, . . . , SDh, STh},
F1
F3
F2
PM2
PM
SA
BA
SD
F4
PM1
BA1
SA1
SD1
SA2
UD
BA2
UD1
F5
ST
SD1
ST1
SDn
STn
Partial match. A binary relation R ⊆ VQ × Vi is said to be a partial match if
 (1) for each (u, v) ∈ R, u and v have the same label;
 (2) for each edge (u, u′) in EQ, either
◦ (a) v is a boundary node and there exists a node v′ ∈ Bv in Bi having the same label as u′, or
◦ (b) there exists an edge (v, v′) in G[Vi] such that (u′, v′) ∈ R.
Pair (SA, SA1) is in the maximum partial match PM1 in fragment F1 for Q. However, it does not
belong to the maximum match M in G for Q. Consider pattern graph Q1 and data graph G1, and
the partial match results.
(1) For node SA1, its only child SD1 is located in fragment F2. The partial match SD1 is empty. Hence,
a false match decision is sent back to machine S1, and this further helps determine that (SA,SA1) is a
false match.
(2) For node SA2, its only child SDn is located in fragment F5. The subgraph F5 contains no boundary
nodes, and SDn belongs to F5. Hence, a true match decision is sent back to machine S4, and this further
helps determine that (SA,SA2) is a true match. After these are done, fragment F3 is the only part of G
that needs to be further evaluated. To check the matches in F3, we simply ship fragment F4 to machine
S3.
[Figure: the pattern graph over labels PM, BA, SA, SD, ST, UD and the data
graph G fragmented into F1-F5, with boundary nodes, as used in the
partial-match example above.]

Go to each vertex with a matched label and create the ball, with d = 4 (L = 2).
[Figure: pattern Q over labels A-F and data graph G with nodes 1-12; balls
of radius 2 are built around each vertex whose label matches D, E, or F.]
 Correct, highly scalable systems.
 Fault-tolerant systems that self-heal.
 Truly scalable systems.
………. using state-of-the-art tools.
 ….
Simpler
 Concurrency
 Scalability
 Fault Tolerance
with a single unified
 Programming Model
 Runtime Service

Finance
• Stock trend analysis and simulation.
• Event-driven messaging systems.
 Betting and Gaming
• Massive multiplayer online gaming.
• High-throughput and transactional betting.
 Telecom
• Streaming media network gateways.
 Simulation
• 3D simulation engines.
 E-commerce
• Social media community sites.

In computer science, the Actor model is a
mathematical model of concurrent computation that
treats "actors" as the universal primitives of
concurrent digital computation: in response to a
message that it receives, an actor can make local
decisions, create more actors, send more messages,
and determine how to respond to the next message
received.



AKKA is a toolkit and runtime for building highly concurrent, distributed,
and fault-tolerant event-driven applications on the JVM.
Parallelism
Concurrency
Event-driven
Actor
Behavior
State
case object Tick

class Counter extends Actor {
  var counter = 0
  def receive = {
    case Tick =>
      counter += 1
      println(counter)
  }
}

val counter = actorOf[Counter]  // counter is an ActorRef
counter ! Tick
val future = actor !!! Message
future.await
val result = future.result
class SomeActor extends Actor {
  def receive = {
    case User(name) =>
      self.reply("Hi " + name)
  }
}

self become {
  case NewMessage =>
    …….
}
There are four different types of message dispatchers:
1. Thread-based
2. Event-based
3. Priority event-based
4. Work-stealing

The ‘ThreadBasedDispatcher’ binds a dedicated OS thread to
each specific Actor. The messages are posted to a
‘LinkedBlockingQueue’ which feeds the messages to the
dispatcher one by one. A ‘ThreadBasedDispatcher’ cannot be
shared between actors. This dispatcher has worse
performance and scalability than the event-based dispatcher,
but works great for creating “daemon” Actors that consume
messages at a low frequency and are allowed to go off and do
their own thing for longer periods of time. Another
advantage of this dispatcher is that Actors do not block
threads for each other.


The ‘ExecutorBasedEventDrivenDispatcher’ binds a set of
Actors to a thread pool backed up by a ‘BlockingQueue’. This
dispatcher is highly configurable and supports a fluent
configuration API to configure the ‘BlockingQueue’ (type of
queue, max items etc.) as well as the thread pool.
The event-driven dispatchers must be shared between
multiple Actors. One best practice is to let each top-level
Actor, e.g. the Actors you define in the declarative supervisor
config, get its own dispatcher, but reuse that dispatcher for
each new Actor that the top-level Actor creates. You can
also share a dispatcher between multiple top-level Actors.
import akka.actor.Actor
import akka.dispatch.Dispatchers

class MyActor extends Actor {
  self.dispatcher = Dispatchers.newExecutorBasedEventDrivenDispatcher(name)
    .withNewThreadPoolWithLinkedBlockingQueueWithCapacity(100)
    .setCorePoolSize(16)
    .setMaxPoolSize(128)
    .setKeepAliveTimeInMillis(60000)
    .build
  ...
}

It’s useful to be able to specify the priority order of
messages; that is done by using the
PriorityExecutorBasedEventDrivenDispatcher.
import akka.dispatch._
import akka.actor._

// Create a new PriorityGenerator; a lower value means more important
val gen = PriorityGenerator {
  case 'highpriority => 0   // 'highpriority messages should be treated first if possible
  case 'lowpriority  => 100 // 'lowpriority messages should be treated last if possible
  case otherwise     => 50  // we default to 50
}

// Create a new Actor that just prints out what it processes
val a = Actor.actorOf(new Actor {
  def receive = {
    case x => println(x)
  }
})

// Create a new priority dispatcher and seed it with the priority generator
a.dispatcher = new PriorityExecutorBasedEventDrivenDispatcher("foo", gen)
a.start // Start the Actor

The ‘ExecutorBasedEventDrivenWorkStealingDispatcher’ is a variation of
the ‘ExecutorBasedEventDrivenDispatcher’ in which Actors of the same
type can be set up to share this dispatcher; during execution, actors with
fewer messages to process will steal messages from busier actors. This can
be a great way to improve throughput at the cost of slightly higher latency.

Scratch data
 Static data
• Supplied at boot time.
• Supplied by other components.
 Dynamic data
• Data possible to recompute.
• Data from other sources.

Akka is an implementation of the Actor Model for both
Java and Scala.

An Actor encapsulates mutable state with the guarantee
of processing one message at a time.
[Diagram: assign each child a label-matched ball, then union all matches.]










Graph theory http://en.wikipedia.org/wiki/Graph_theory
Capturing Topology in Graph Pattern Matching
http://vldb.org/pvldb/vol5/p310_shuaima_vldb2012.pdf
Making a Move of Graphs via Probabilistic Voting
http://cv.snu.ac.kr/publication/conf/2012/ProgGM_CVPR2012.pdf
GPS: A Graph Processing System http://ilpubs.stanford.edu:8090/1039/5/full_paper.pdf
Distributed Graph Pattern Matching
http://www2012.wwwconference.org/proceedings/proceedings/p949.pdf
Pregel: A System for Large-Scale Graph Processing
new-chinese-chess-engine.googlecode.com/svnhistory/r21/trunk/search_engine/doc/arch/arch/pregel_paper.pdf
Akka 2.0: Scaling Up & Out With Actors
https://www.youtube.com/watch?v=3jbqTxstlC4&feature=relmfu
Apache Giraph: distributed graph processing in the cloud
https://www.youtube.com/watch?feature=endscreen&v=BmRaejKGeDM&NR=1
MapReduce Used on Large Data Sets https://www.youtube.com/watch?v=N8FHXgPJEfQ



The Actor model adopts the philosophy that everything is an actor. This is similar to the
everything is an object philosophy used by some object-oriented programming languages, but
differs in that object-oriented software is typically executed sequentially, while the Actor
model is inherently concurrent.
With the Actor Model, instead of manually creating threads or event loops, you create an
object that has state and associated logic; this logic is invoked by only one thread at
a time, and the object communicates with the outside world through messages.
An actor is a computational entity that, in response to a message it receives, can concurrently:
 send a finite number of messages to other actors;
 create a finite number of new actors;
 designate the behavior to be used for the next message it receives.

Recipients of messages are identified by address, sometimes called "mailing address". Thus
an actor can only communicate with actors whose addresses it has. It can obtain those from a
message it receives, or if the address is for an actor it has itself created.


Create

case object Tick

class Counter extends Actor {
  var counter = 0
  def receive = {
    case Tick =>
      counter += 1
      println(counter)
  }
}

Create an instance of the Counter actor in the system and get a reference handle back:
val counter = system.actorOf(Props[Counter], name = "counter")
(Props[Counter] defines how to create the actor; name is the actor's name in the hierarchy.)
Or, if we are inside a parent and want to create a child:
val counter = context.actorOf(….)

To stop:
system.stop(counter)



Send message
counter ! Tick
(sends the Tick message to the counter actor, putting it in its mailbox)

Reply
class SomeActor extends Actor {
  def receive = {
    case User(name) => sender ! ("Hi " + name)
  }
}

To change behaviour
context.become {
  case NewMessage => …..
}

Failure strategy
class MySupervisor extends Actor {
  def supervisorStrategy = OneForOneStrategy({
    case _: ActorKilledException => Stop
    case _: ArithmeticException  => Resume
    case _: Exception            => Restart
  }, maxNrOfRetries = None, withinTimeRange = None)
  def receive = {
    case NewUser(name) =>
      ….. = context.actorOf(Props[User], name)
  }
}

Remoting
Remote actors have a different kind of ActorRef.
akka {
  actor {
    provider = akka.remote.RemoteActorRefProvider
    deployment {
      counter {
        remote = akka:[email protected]:255
      }
    }
  }
}




https://www.youtube.com/watch?v=UY3fuHebRMI
