### Visualization of Graph Data

```Visualization of Graph Data
CS 4390/5390 Data Visualization
Shirley Moore, Instructor
October 6, 2014
Graphs and Trees
• graph – a set of nodes (vertices) connected by links (edges)
• Links can be directed or undirected.
• Two edges are adjacent if they share a common node.
• Nodes and links can both have attributes.
• A path from node a to node b is a sequence of adjacent edges from
a to b.
• A cycle is a path that begins and ends at the same node.
• A graph is connected if there exists a path between any two nodes.
• A tree is a connected acyclic graph.
• If there are n nodes, what is the maximum number of links
– in a directed graph?
– in an undirected graph?
– in a directed tree?
– in an undirected tree?
Graph Analytics
Slide courtesy of John Feo, PNNL
Scientific Grids vs. Data Informatics Graphs
Slide courtesy of John Feo, PNNL
Slide courtesy of Mathieu Bastian
National Security
Slide courtesy of Mathieu Bastian
Public Health
Small Graphs
Medium Graphs
Large Graphs
• http://snap.stanford.edu/data/index.html
Implicit vs. Explicit
Graph Analytics
Idiom Choices
• What: Tree dataset
• Why: Hierarchical relationships, topology analysis tasks
• How: Vertical spatial position shows depth in the tree; horizontal
spatial position is an artifact of the layout algorithm
• Scale: A few dozen nodes
• What: Tree dataset
• How:
– Depth encoded as distance from center of circle
– Links drawn as smoothly curving splines
– Reingold-Tilford layout algorithm
• Scale: A few hundred nodes
• Example written in D3:
– http://bl.ocks.org/mbostock/4063550
D3 Tree Layout
• http://www.d3noob.org/2014/01/tree-diagrams-in-d3js_11.html
• https://github.com/mbostock/d3/wiki/TreeLayout
• Representative of the D3 hierarchy layout
– https://github.com/mbostock/d3/wiki/HierarchyLayout
• Produces node-link diagrams of trees using the
Reingold-Tilford “tidy” algorithm
• Accepts input data in JSON (JavaScript Object
Notation) format
Brainstorming Exercise 1
• How could we scale tree layouts to more than
a few hundred nodes?
– Possible strategy: use 3D
• Why or why not?
Collapsible Tree Layout
• Example in D3
– http://bl.ocks.org/mbostock/4339083
Treemap
Examples:
• http://bl.ocks.org/mbostock/4063582
• http://bost.ocks.org/mike/treemap/
General Graph Layouts
• Also called network layouts
• Do not directly use spatial position to encode
attribute values
• Layout algorithms try to minimize number of
edge crossings and node overlaps.
• May use size and color encodings for node and
Force-Directed Placement
• Widely used for node-link network layout
• Position network elements according to a simulation of
physical forces – e.g.,
– Nodes push away from each other
– Links act like springs that draw their endpoints closer
• Can start by placing nodes randomly and iterating to
• Limitations:
– Clusters may be artifacts of the algorithm
– Layout may be nondeterministic
– May get stuck in a local minimum energy configuration
– Doesn’t scale past a few hundred nodes
What-Why-How for Force-Directed Placement
Scalable Force-Directed Placement (sfdp)
• Multilevel approach that transforms network into hierarchy of
successively simpler networks
• Algorithm: Lay out the coarsest network first, then refine the layout
on successively more detailed versions
• Examples: http://yifanhu.net/GALLERY/GRAPHS/index.html
• Graphviz software: http://www.graphviz.org/
Example: http://bost.ocks.org/mike/miserables/
and Matrix Views
Brainstorming Exercise 2
• Which graph analysis tasks are better
supported by the node-link view, and which
are better supported by the matrix view?
• How does the above answer change with
increasing size of the graph?
Graph Visualization Tools
• Sigma.js, a JavaScript library
• Gephi, an open-source graph visualization platform
• Many more!
Preparation for Next Class
• Keep working with D3; use the tutorials on the
D3.js wiki
• Implement interaction in your parallel
coordinates visualization for Lab 3
• Decide which datasets to use for Lab 3