Persons Through Groups
2-mode networks
Overview
Breiger: Duality of Persons and Groups
•Argument
•Method
•Sociology Examples
•Moody: Coauthorship
•Methods:
•Finish ego-networks
•Working w. 2-mode data
•Constructing a PTG network
•Constructing a GTP network
•(Bipartite graphs)
Persons Through Groups
2-mode networks
Breiger: 1974 - Duality of Persons and Groups
Argument:
Metaphor: people intersect through their
associations, which defines (in part) their
individuality.
Duality implies that relations among groups imply relations among individuals.
Persons Through Groups
2-mode networks
An Example:
[Figure: the interpersonal network among persons A-F (matrix 4.3) and the intergroup network among groups 1-5 (matrix 4.4)]
Problem: These two representations, though clearly related, are not easily compared.
Persons Through Groups
2-mode networks
An Example:
To compare them, construct a person-to-group adjacency matrix:
A =
       1  2  3  4  5
   A   0  0  0  0  1
   B   1  0  0  0  0
   C   1  1  0  0  0
   D   0  1  1  1  1
   E   0  0  1  0  0
   F   0  0  1  1  0
Each column is a group,
each row a person, and
the cell = 1 if the person in
that row belongs to that
group.
You can tell how many
groups two people both
belong to by comparing
the rows: Identify every
place that both rows = 1,
sum them, and you have
the overlap.
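The row comparison just described can be sketched in a few lines of Python. This is only an illustration, not the SAS/IML code used in the course; the dictionary `A` holds the example membership rows and `overlap` is a hypothetical helper name.

```python
# Membership rows from the example matrix A (columns are groups 1-5).
A = {
    "A": [0, 0, 0, 0, 1],
    "B": [1, 0, 0, 0, 0],
    "C": [1, 1, 0, 0, 0],
    "D": [0, 1, 1, 1, 1],
    "E": [0, 0, 1, 0, 0],
    "F": [0, 0, 1, 1, 0],
}


def overlap(row_i, row_j):
    """Count the positions where both rows equal 1 (shared groups)."""
    return sum(1 for x, y in zip(row_i, row_j) if x == 1 and y == 1)


print(overlap(A["A"], A["F"]))  # persons A and F
print(overlap(A["D"], A["F"]))  # persons D and F
```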
Persons Through Groups
2-mode networks
An Example:
Compare persons A and F
A =
       1  2  3  4  5
   A   0  0  0  0  1
   B   1  0  0  0  0
   C   1  1  0  0  0
   D   0  1  1  1  1
   E   0  0  1  0  0
   F   0  0  1  1  0
        1  2  3  4  5    Σ
   A    0  0  0  0  1   = 1
   F    0  0  1  1  0   = 2
   AF   0  0  0  0  0   = 0
Or persons D and F
        1  2  3  4  5    Σ
   D    0  1  1  1  1   = 4
   F    0  0  1  1  0   = 2
   DF   0  0  1  1  0   = 2
Person A is in 1
group, Person F is
in two groups,
and they are in no
groups together.
Person D is in 4
groups, Person F
is in two groups,
and they are in 2
groups together.
Persons Through Groups
2-mode networks
An Example:
A =
       1  2  3  4  5
   A   0  0  0  0  1
   B   1  0  0  0  0
   C   1  1  0  0  0
   D   0  1  1  1  1
   E   0  0  1  0  0
   F   0  0  1  1  0
Similarly for Groups:
        1  2  1·2
   A    0  0   0
   B    1  0   0
   C    1  1   1
   D    0  1   0
   E    0  0   0
   F    0  0   0
   Σ    2  2   1
Group 1 has 2 members, group 2 has 2 members, and they overlap by 1 member (C).
Persons Through Groups
2-mode networks
In general, you can get the overlap for any pair of
groups / persons by summing the multiplied
elements of the corresponding rows/columns of the
persons-to-groups adjacency matrix. That is:
Persons-to-Persons
$$P_{ij} = \sum_{k=1}^{g} A_{ik} A_{jk}$$
Groups-to-Groups
$$G_{ij} = \sum_{k=1}^{p} A_{ki} A_{kj}$$
Persons Through Groups
2-mode networks
One can get these easily with a little matrix multiplication.
First define A^T as the transpose of A (simply swap the rows and columns). If A is of size P x G, then A^T will be of size G x P.
$$A^{T}_{ij} = A_{ji}$$
A =
       1  2  3  4  5
   A   0  0  0  0  1
   B   1  0  0  0  0
   C   1  1  0  0  0
   D   0  1  1  1  1
   E   0  0  1  0  0
   F   0  0  1  1  0
A^T =
       A  B  C  D  E  F
   1   0  1  1  0  0  0
   2   0  0  1  1  0  0
   3   0  0  0  1  1  1
   4   0  0  0  1  0  1
   5   1  0  0  1  0  0
Persons Through Groups
2-mode networks
With A (6x5) and A^T (5x6) as defined on the previous slide:
P = A(A^T)     (6x5)(5x6) = (6x6)
       A  B  C  D  E  F
   A   1  0  0  1  0  0
   B   0  1  1  0  0  0
   C   0  1  2  1  0  0
   D   1  0  1  4  1  2
   E   0  0  0  1  1  1
   F   0  0  0  2  1  2
G = A^T(A)     (5x6)(6x5) = (5x5)
       1  2  3  4  5
   1   2  1  0  0  0
   2   1  2  1  1  1
   3   0  1  3  2  1
   4   0  1  2  2  1
   5   0  1  1  1  2
See: Breiger_ex.sas for an IML example.
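For readers without SAS/IML, the same two products can be sketched in plain Python. This is a minimal illustration, not the Breiger_ex.sas program; `transpose` and `matmul` are hypothetical helper names, and the triple loop is only sensible for small matrices like this one.

```python
# Person-by-group matrix A from the example (rows A-F, columns groups 1-5).
A = [
    [0, 0, 0, 0, 1],  # A
    [1, 0, 0, 0, 0],  # B
    [1, 1, 0, 0, 0],  # C
    [0, 1, 1, 1, 1],  # D
    [0, 0, 1, 0, 0],  # E
    [0, 0, 1, 1, 0],  # F
]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Row-by-column matrix product using plain lists."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols] for row in x]


AT = transpose(A)
P = matmul(A, AT)  # 6x6 person-to-person overlap
G = matmul(AT, A)  # 5x5 group-to-group overlap

for row in P:
    print(row)
for row in G:
    print(row)
```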
Persons Through Groups
2-mode networks
Theoretically, these two equations define what Breiger means by duality:
“With respect to the membership network,…, persons who are
actors in one picture (the P matrix) are with equal legitimacy viewed as
connections in the dual picture (the G matrix), and conversely for groups.”
(p.87)
The resulting networks:
1) Are always symmetric.
2) Have a diagonal that tells you how many groups a person belongs to (in P) or how many persons a group has (in G).
In practice, most network software (UCINET, PAJEK) will do all of these operations. It is also simple to do the matrix multiplication in programs like SAS or SPSS.
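Both properties follow from the definitions P = A(A^T) and G = (A^T)A. A short derivation, assuming A is binary (every cell is 0 or 1) with g groups:

```latex
% Symmetry: transposing a product reverses the order of the factors.
P^{T} = (A A^{T})^{T} = (A^{T})^{T} A^{T} = A A^{T} = P

% Diagonal: since A_{ik} \in \{0, 1\}, we have A_{ik}^{2} = A_{ik}.
P_{ii} = \sum_{k=1}^{g} A_{ik} A_{ik} = \sum_{k=1}^{g} A_{ik}
```

The same argument applied to (A^T)A gives the symmetry of G and shows that its diagonal counts the members of each group.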
Name                         Health  Fam  Devlp  Ineq
Alessandro Tarozzi             0      0     1     1
Alexander Pfaff-Talikoff       1      0     0     0
Amar Hamoudi                   1      1     0     0
Anatoli Yashin                 1      0     0     1
Angela M ORand                 1      0     0     0
Anna Gassman-Pines             0      1     1     1
Asia Maselko                   1      0     0     0
Avshalom Caspi                 1      0     1     0
Charlie Cloffelter             0      0     0     1
Christina M. Gibson-Davis      0      1     1     1
Duncan Thomas                  1      1     1     1
Elizabeth Frankenberg          1      1     0     1
Elizabeth Oltmans Ananat       0      1     0     1
Frank A. Sloan                 1      0     0     0
Jacob L. Vigdor                0      0     1     1
James Moody                    0      0     1     1
James S Clark                  1      0     0     0
James W. Vaupel                1      0     0     0
Jennan Read                    1      0     0     1
Jerry Reiter                   1      0     0     0
Kim Blankenship                1      0     0     0
Kathleen Sikkema               1      0     1     0
Keith E Whitfield              1      0     0     0
Kenneth A Dodge                1      1     1     1
Kenneth C Land                 1      0     1     1
Linda K George                 1      0     0     0
Linda M Burton                 1      1     1     1
Lisa A Keister                 0      0     0     1
M. Giovanna Merli              1      0     0     0
Manoj Mohanan                  1      0     0     0
Marie Lynn Miranda             1      0     1     0
Marjorie B McElroy             0      1     0     0
P. J. Eric Stallard            1      0     0     0
Patrick Bayer                  0      0     0     1
Peter Arcidiacono              0      1     0     1
Phil Morgan                    0      1     0     0
Philip J. Cook                 0      0     0     1
Philip R Costanzo              1      0     1     0
Rachel Kranton                 0      0     0     1
Sabrendu Pattanayak            1      0     0     0
Seth Gary Sanders              1      1     1     1
Sherman James                  1      0     0     1
Terrie E Moffitt               0      0     1     0
V. Joseph Hotz                 0      1     0     1
William "Sandy" Darity         0      0     0     1
Zeng Yi                        1      1     0     0
Persons Through Groups
DuPRI Example
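A = the faculty-by-area membership table
G = (A^T)A
            Health  Fam  HDev  Ineq
   Health     29     7     9     9
   Fam         7    14     6    10
   HDev        9     6    15    10
   Ineq        9    10    10    23
Area Overlap Among DuPRI Faculty
[Figure: network of the four areas (Health, Family, Human Dev, Inequality)]
P = A(A^T)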
Persons Through Groups
Sociology Example
Or consider ties formed by sharing membership on a student committee (MA, exams, etc).
(all committee memberships, line thickness proportional to number of joint appearances)
Persons Through Groups
Sociology Example
Or consider ties formed by sharing membership on a student committee (MA, exams, etc).
Duke English Department
(all committee memberships, line thickness proportional to number of joint appearances)
Persons Through Groups
Sociology Coauthorship
Sociology Coauthorship Networks
Persons Through Groups
Sociology Coauthorship
[Figure: the coauthorship network shown as a 2-mode network and as its 1-mode projection]
Persons Through Groups
Sociology Coauthorship
3 degrees of Lynn Smith-Lovin (LSL)
LSL reaches 533 people in 3 steps.
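A count like "reaches 533 people in 3 steps" is a breadth-first search on the 1-mode coauthorship network that stops after three steps. A minimal sketch, assuming the network is stored as an adjacency dictionary; `reach_within` is a hypothetical helper and the five-node chain is made-up toy data, not the sociology network.

```python
from collections import deque


def reach_within(adj, source, max_steps):
    """Return the nodes reachable from source in at most max_steps ties."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_steps:
            continue  # do not expand past the step limit
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return {n for n in dist if n != source}


# Hypothetical toy coauthorship chain: a - b - c - d - e
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
print(sorted(reach_within(toy, "a", 3)))
```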
Persons Through Groups
Sociology Coauthorship
The likelihood of coauthorship varies by type of work
Persons Through Groups
Sociology Coauthorship
Largest Bicomponent, n = 29,462
[Figure: largest bicomponent of the coauthorship network; legend scale 0.04, 0.27, 0.50, 0.73, 0.96]
Persons Through Groups
Director Interlocks
Val Burris – Interlocks & Political Cohesion
Persons Through Groups
Director Interlocks
Val Burris – Interlocks & Political Cohesion
[Figure: Effect size of indirect ties, by dependent variable (Party Contribution, Presidential Match, Presidential Correlation), for direct, 2-step, 3-step, 4-step, 5-step and 6-step ties; vertical axis from 0 to 0.16]
Persons Through Groups
Ecology Co-authorship
Persons Through Groups
Physician Networks
Construct networks of physicians who share patients. Note that we sampled patients from 5 states; shown here are the resulting physicians from all the PA patients.
Table 1. Network Sample Construction
   Year    Patient Visits   Unique Patients   Unique Physicians
   2008      12,263,448         922,189            138,375
   2009      12,977,008         924,387            134,863
   2010      12,167,013         871,993            135,212
   Total     37,407,477         963,899            190,785
Persons Through Groups
Constructing large 2-mode nets
• The direct matrix multiplication approach is (highly) inefficient for large 2-mode networks.
  - We couldn’t even hold the physician indicator matrix in memory.
The solution is to construct the bipartite list, then construct edges as a summary over that. For example:
   Obs   rid   auid
     1     1   60242
     2     2    1961
     3     2   16006
     4     2   47741
     5     3   50009
     6     3   51417
     7     4   30417
     8     4   49612
     9     5    8396
    10     5   11500
In SAS, I then transpose the data by the mode I want to link by. So here, if I want an author-to-author network, I transpose by papers (rid).
Then write a loop to construct the edge-parts
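data edges;
  set auplev;                  /* transposed data: one row per paper (rid), author ids in col1-col82 */
  array aus(82) col1-col82;
  do i=1 to 81;
    if aus(i)=. then leave;    /* author ids are packed to the left, so a missing value ends the row */
    do j=i+1 to 82;
      if aus(j)=. then leave;
      snd=min(aus(i),aus(j));  /* store each dyad with the smaller id first */
      rcv=max(aus(i),aus(j));
      val=1;
      output;                  /* one record per coauthor pair per paper */
    end;
  end;
  keep snd rcv rid val;
run;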
This produces all the edge parts, then sum by dyad to get the
valued network.
proc means data=edges noprint;
class snd rcv;
var val;
output out=edgesum (where=(_type_=3)) sum=;
run;
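The same list-then-summarize idea can be sketched outside SAS. The Python below is an illustrative stand-in, not a translation of the course program: it groups the (rid, auid) rows from the bipartite list shown earlier by paper, emits every author pair within a paper, and sums by dyad. The names `rows`, `authors_by_paper` and `edge_weights` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# (rid, auid) rows copied from the example bipartite list.
rows = [
    (1, 60242),
    (2, 1961), (2, 16006), (2, 47741),
    (3, 50009), (3, 51417),
    (4, 30417), (4, 49612),
    (5, 8396), (5, 11500),
]

# Group authors by paper, the mode we link through.
authors_by_paper = defaultdict(set)
for rid, auid in rows:
    authors_by_paper[rid].add(auid)

# Emit each author pair once per paper (smaller id first), then sum by dyad.
edge_weights = Counter()
for authors in authors_by_paper.values():
    for snd, rcv in combinations(sorted(authors), 2):
        edge_weights[(snd, rcv)] += 1

for (snd, rcv), val in sorted(edge_weights.items()):
    print(snd, rcv, val)
```

Only the pairs that actually co-occur are ever stored, which is why this scales where the full indicator matrix does not.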
Persons Through Groups
Constructing large 2-mode nets
In PAJEK, you can define an input graph as bipartite as:
*Vertices 8 3
1 "Actor 1"
2 "Actor 2"
3 "Actor 3"
4 "Event 1"
5 "Event 2"
6 "Event 3"
7 "Event 4"
8 "Event 5"
*Edges
1 4
1 5
2 4
2 5
2 6
2 8
3 4
3 7
3 8
So the first line has two numbers: the total number of nodes (8) and the number in the first “row” mode (3). Then the edges all run from mode 1 to mode 2.
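A file in this layout can be generated from a membership list. The sketch below assumes actor and event names are distinct strings and follows only the layout described on this slide; `to_pajek_two_mode` is a hypothetical helper, not part of Pajek.

```python
def to_pajek_two_mode(actors, events, memberships):
    """Return Pajek .net text for a 2-mode network.

    actors, events: lists of distinct names (first mode, then second mode).
    memberships: iterable of (actor, event) pairs.
    """
    # Number the first mode 1..len(actors), then continue with the second mode.
    index = {name: i + 1 for i, name in enumerate(list(actors) + list(events))}
    lines = [f"*Vertices {len(index)} {len(actors)}"]
    lines += [f'{i} "{name}"' for name, i in index.items()]
    lines.append("*Edges")
    lines += [f"{index[a]} {index[e]}" for a, e in memberships]
    return "\n".join(lines)


actors = ["Actor 1", "Actor 2", "Actor 3"]
events = ["Event 1", "Event 2", "Event 3", "Event 4", "Event 5"]
memberships = [
    ("Actor 1", "Event 1"), ("Actor 1", "Event 2"),
    ("Actor 2", "Event 1"), ("Actor 2", "Event 2"),
    ("Actor 2", "Event 3"), ("Actor 2", "Event 5"),
    ("Actor 3", "Event 1"), ("Actor 3", "Event 4"), ("Actor 3", "Event 5"),
]
pajek_text = to_pajek_two_mode(actors, events, memberships)
print(pajek_text)
```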
Persons Through Groups
Bipartite “Two-Mode” graphs
It is possible to construct a network that links people and their groups directly
in a single network. In this case, the nodes are of 2 types: person and groups.
Consider the classic example of the Southern Women’s data:
Persons Through Groups
Bipartite “Two-Mode” graphs
The classic treatment of this network would create a person to person or a group to
group network:
Persons Through Groups
Bipartite “Two-Mode” graphs
Instead, you could analyze the network as a joint network, with two types of nodes:
Persons Through Groups
Bipartite “Two-Mode” graphs
               1  2  3  4  5  6  7  8
               ----------------------
Actor 1   1.   0  0  0  1  1  0  0  0
Actor 2   2.   0  0  0  1  1  1  0  1
Actor 3   3.   0  0  0  1  0  0  1  1
Event 1   4.   1  1  1  0  0  0  0  0
Event 2   5.   1  1  0  0  0  0  0  0
Event 3   6.   0  1  0  0  0  0  0  0
Event 4   7.   0  0  1  0  0  0  0  0
Event 5   8.   0  1  1  0  0  0  0  0
It is always possible to arrange a 2-mode network so that the adjacency matrix has all zeros in the diagonal blocks (the actor-by-actor and event-by-event cells).
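One way to see why the diagonal blocks are empty: the square bipartite matrix is just the rectangular actor-by-event matrix placed in the upper-right block and its transpose in the lower-left. A minimal Python sketch under that reading; `bipartite_adjacency` is a hypothetical helper name.

```python
def bipartite_adjacency(a):
    """Build the square (rows + cols) adjacency matrix [[0, a], [a^T, 0]]."""
    n_rows, n_cols = len(a), len(a[0])
    size = n_rows + n_cols
    b = [[0] * size for _ in range(size)]
    for i in range(n_rows):
        for j in range(n_cols):
            b[i][n_rows + j] = a[i][j]  # actor -> event block
            b[n_rows + j][i] = a[i][j]  # event -> actor block (transpose)
    return b


# Actor-by-event memberships from the example (3 actors, 5 events).
A = [
    [1, 1, 0, 0, 0],  # Actor 1
    [1, 1, 1, 0, 1],  # Actor 2
    [1, 0, 0, 1, 1],  # Actor 3
]
B = bipartite_adjacency(A)
for row in B:
    print(row)
```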
Persons Through Groups
Bipartite “Two-Mode” graphs
Galois Lattices
A new way to think about bipartite networks is as a collection of ordered sets, and then to use some of the tools from discrete mathematics to map the collection of sets. For example, consider the set of all subsets of {1,2,3}. This can be represented in a network as:
[Figure: the lattice of subsets of {1,2,3}]
This is known as a Galois Lattice.
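The {1,2,3} example can be generated directly. The sketch below builds only the subset lattice described here (every subset is a node, with a tie whenever one set is obtained from another by adding a single element); it is not a general Galois-lattice routine for actor-by-event data, and the variable names are illustrative.

```python
from itertools import combinations

items = [1, 2, 3]

# Every subset of the items, from the empty set up to the full set.
subsets = [
    frozenset(c)
    for size in range(len(items) + 1)
    for c in combinations(items, size)
]

# Lattice ties: t sits directly above s when it adds exactly one element.
edges = [
    (s, t)
    for s in subsets
    for t in subsets
    if s < t and len(t) - len(s) == 1
]

for s, t in edges:
    print(sorted(s), "->", sorted(t))
```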
Persons Through Groups
Bipartite “Two-Mode” graphs
Galois Lattices
Imagine you had the following data on actors and events:
Persons Through Groups
Bipartite “Two-Mode” graphs
Galois Lattices
The Davis data in Lattice form:
Topic / Text Models
To uncover topics, we apply a similar process across papers and words. Basically, a corpus is nothing more than a big two-mode network of papers containing words:
              Paper 1  Paper 2  Paper 3  Paper 4
   Obedient      5       10        0        0
   Loyal         6        5        1        0
   Friendly      8        9        0        0
   Aloof         0        1        9       15
   Proud         0        0        5        4
   Dog           2        1        0        0
   Cat           0        0        1        1
Comparing across columns tells us whether two papers are similar in the words they use.
Similarity matrix:
              Paper 1  Paper 2  Paper 3  Paper 4
   Paper 1      --       Hi       Low      Low
   Paper 2      Hi       --       Low      Low
   Paper 3      Low      Low      --       Hi
   Paper 4      Low      Low      Hi       --
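The Hi/Low pattern can be reproduced with any column-similarity score. The sketch below uses cosine similarity, which is an assumption on my part (the slide does not name the measure), applied to the word counts from the example table.

```python
import math

# Word counts per paper, in the order:
# Obedient, Loyal, Friendly, Aloof, Proud, Dog, Cat
counts = {
    "Paper 1": [5, 6, 8, 0, 0, 2, 0],
    "Paper 2": [10, 5, 9, 1, 0, 1, 0],
    "Paper 3": [0, 1, 0, 9, 5, 0, 1],
    "Paper 4": [0, 0, 0, 15, 4, 0, 1],
}


def cosine(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


papers = list(counts)
for p in papers:
    print(p, [round(cosine(counts[p], counts[q]), 2) for q in papers])
```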
Topic / Text Models
Key differences are:
a) We typically need to parse the text first, to drop unimportant words and keep the parts of speech or other particular features we care about.
b) We weight words differently based on their importance in the corpus.
   - Most common is the tf-idf formulation, which gives higher weight to rare words.
c) We then define a similarity score rather than a simple count/volume of overlap.
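A small sketch of the weighting in step (b), using the word-by-paper counts from the earlier example. There are several tf-idf variants; the one below (raw count times the natural log of N over the number of papers containing the word) is just one common choice and is not necessarily what SAS applies.

```python
import math

papers = ["Paper 1", "Paper 2", "Paper 3", "Paper 4"]
counts = {
    "Obedient": [5, 10, 0, 0],
    "Loyal": [6, 5, 1, 0],
    "Friendly": [8, 9, 0, 0],
    "Aloof": [0, 1, 9, 15],
    "Proud": [0, 0, 5, 4],
    "Dog": [2, 1, 0, 0],
    "Cat": [0, 0, 1, 1],
}

n_papers = len(papers)

# Inverse document frequency: words found in fewer papers get a larger weight.
idf = {}
for word, row in counts.items():
    doc_freq = sum(1 for c in row if c > 0)
    idf[word] = math.log(n_papers / doc_freq)

# Weighted counts: term frequency times inverse document frequency.
tfidf = {word: [c * idf[word] for c in row] for word, row in counts.items()}

for word in counts:
    print(word, round(idf[word], 3), [round(x, 2) for x in tfidf[word]])
```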
Topic / Text Models
[Figures: Tgparse output (linked) and the resulting term “key”; weighting applied by tmutil]
These are all “under the hood” in the SAS “TextMiner” application (linked).
Background
Mining Science Products: Topic structure
To uncover topics, we apply a similar process across papers:
Example: One-step neighborhood of “More information, better jobs?”
Background
Mining Science Products: Topic structure
Network Ecology Topic Map
Borrett, Stuart R., James Moody & Achim Edelmann. 2014. “The Rise of Network Ecology: Maps of the topic diversity and scientific collaboration.” Ecological Modelling (DOI: 10.1016/j.ecolmodel.2014.02.019)
Man Made Pathogen Debate
Community of Science Foundations
Topic Structures
The collaboration space is based on published papers and we’re curious how the papers
are topically clustered.
Here we used the Latent Dirichlet allocation (LDA) topic modeling routine on the full
corpus of papers. LDA does not assign papers to topics exactly, but rather provides a
degree of association based on the topic loadings depending on the paper’s distribution of
terms.
Community of Science Foundations
Topic Structures
We settled on an eight topic solution:
Paper similarity matrix, sorted by topic loadings
Community of Science Foundations
Topic Structures
Titles of the papers with the top five topic loadings on each topic
top1     Title: Virology (emphasis on Influenza)
0.99363  Growth of H5N1 influenza A viruses in the upper respiratory tracts of mice
0.99274  Transmission of Influenza Virus in a Mammalian Host Is Increased by PB2 Amino Acids 627K or 627E/701N
0.99236  The M Segment of the 2009 New Pandemic H1N1 Influenza Virus Is Critical for Its High Transmission Efficiency in the Guinea Pig Model
0.99137  Insertion of a multibasic cleavage site in the haemagglutinin of human influenza H3N2 virus does not increase pathogenicity in ferrets
0.99127  Reverse genetics demonstrates that proteolytic processing of the Ebola virus glycoprotein is not essential for replication in cell culture.
top2     Title: Evolutionary Genetics
0.99457  Identifying Signatures of Selection in Genetic Time Series
0.99441  A spatially explicit model of sex ratio evolution in response to sex-biased dispersal
0.99431  The magnitude of local adaptation under genotype-dependent dispersal
0.99427  The advantages of segregation and the evolution of sex.
0.99414  DISENTANGLING THE EFFECTS OF EVOLUTIONARY, DEMOGRAPHIC, AND ENVIRONMENTAL FACTORS INFLUENCING GENETIC STRUCTURE OF NATURAL POPULATIONS: ATLANTIC HERRING AS A CASE STUDY
top3     Title: Genetic Sequencing
0.99445  Sequence and organization of coelacanth neurohypophysial hormone genes: evolutionary history of the vertebrate neurohypophysial hormone gene locus
0.99357  Characterization of the neurohypophysial hormone gene loci in elephant shark and the Japanese lamprey: origin of the vertebrate neurohypophysial hormone genes
0.99308  Sequence Data from New Plastid and Nuclear COSII Regions Resolves Early Diverging Lineages in Coffea (Rubiaceae)
0.99267  Sequence characterization and comparative analysis of three Plasmids isolated from environmental Vibrio spp.
0.99244  Large Linear Plasmids of Borrelia Species That Cause Relapsing Fever
Community of Science Foundations
Topic Structures
Titles of the papers with the top five topic loadings on each topic
top4     Title: Immunology
0.99368  Cholinergic agonists regulate JAK2/STAT3 signaling to suppress endothelial cell activation
0.99295  CD4 expression on activated NK cells: Ligation of CD4 induces cytokine expression and cell migration
0.99281  Reduced DEAF1 function during type 1 diabetes inhibits translation in lymph node stromal cells by suppressing Eif4g3
0.99259  Persistent expression of Pax3 in the neural crest causes cleft palate and defective osteogenesis in mice
0.99252  Critical Role of the Tumor Suppressor Tuberous Sclerosis Complex 1 in Dendritic Cell Activation of CD4 T Cells by Promoting MHC Class II Expression via IRF4 and CIITA
top5     Title: Public Health (emphasis on HIV)
0.99639  Opportunities for health promotion education in child care.
0.99515  To Fund or Not to Fund Development of a Decision-Making Framework for the Coverage of New Health Technologies
0.99445  Community-based research in AIDS-service organizations: what helps and what doesn't?
0.99418  Sustaining chronic disease management in primary care: Lessons from a demonstration project
0.99418  Strengthening biostatistics resources in sub-Saharan Africa: Research collaborations through U.S. partnerships
top6     Title: Biochemistry (cellular)
0.99315  Functional and structural roles of the N-terminal extension in Methanosarcina acetivorans protoglobin
0.99288  The effects of an ideal beta-turn on beta-2 microglobulin fold stability
0.99267  The Juxtamembrane Linker of Full-length Synaptotagmin 1 Controls Oligomerization and Calcium-dependent Membrane Binding.
0.99236  Structure, conformational stability, and enzymatic properties of acylphosphatase from the hyperthermophile Sulfolobus solfataricus.
0.99220  The Escherichia coli Lpt Transenvelope Protein Complex for Lipopolysaccharide Export Is Assembled via Conserved Structurally Homologous Domains
Community of Science Foundations
Topic Structures
Titles of the papers with the top five topic loadings on each topic
top7     Title: HIV Vaccines & Drugs
0.99418  Efficacy of zidovudine compared to stavudine, both in combination with lamivudine and indinavir, in human immunodeficiency virus-infected nucleoside-experienced patients with no prior exposure to lamivudine, stavudine, or protease inhibitors (novavir trial).
0.99302  Stavudine, nevirapine and ritonavir in stable antiretroviral therapy-experienced children with human immunodeficiency virus infection.
0.98871  Effect of HIV Infection Status and Anti-Retroviral Treatment on Quantitative and Qualitative Antibody Responses to Pneumococcal Conjugate Vaccine in Infants
0.98816  Prior meningococcal A/C polysaccharide vaccine does not reduce immune responses to conjugate vaccine in young adults.
0.98403  Long-Term Efficacy and Safety of Raltegravir Combined with Optimized Background Therapy in Treatment-Experienced Patients with Drug-Resistant HIV Infection: Week 96 Results of the BENCHMRK 1 and 2 Phase III Trials
top8     Title: Social Aspects of Health Care
0.99572  Impact of admission hyperglycemia on hospital mortality in various intensive care unit populations.
0.99553  High prevalence of chronic kidney disease in population-based patients diagnosed with type 2 diabetes in downtown Shanghai
0.99488  Does socioeconomic status affect mortality subsequent to hospital admission for community acquired pneumonia among older persons?
0.99461  F-18-FDG PET/CT Identifies Patients at Risk for Future Vascular Events in an Otherwise Asymptomatic Cohort with Neoplastic Disease
0.99457  Preinjury warfarin use among elderly patients with closed head injuries in a trauma center.
Community of Science Foundations
Topic Structures
Red-blue scale is size, circle is proportional to distribution in 2d space
Community of Science Foundations
Topics predict Debate Side
Assign each node to the area they write in most:
[Figure panels, one per area: Virology/Influenza, Evolutionary Genetics, Genetic Sequencing, Immunology, Public Health, Cellular BioChem, HIV/Drugs, Social Aspects of Health]
Extending Text beyond “bag of words”
Key issue with text models is that they “chop up” language – subtle differences get lost:
“country music problem”
Solutions:
• link words (k-word phrases). This adds in a little localized context
• sentiment models: add a content-specific weight to each term, based on prior
knowledge
• Implication models: the goal here is to link terms/concepts to each other by the narrative implication in the sentence/corpus.
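The first solution, linking words into k-word phrases, is easy to sketch. The tokenizer below just lowercases and splits on whitespace, which is a simplifying assumption; `k_word_phrases` is a hypothetical helper and the sentence is made up for illustration.

```python
def k_word_phrases(text, k):
    """Return every run of k consecutive words in the text, in order."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)]


sentence = "the dog is not friendly"
print(k_word_phrases(sentence, 1))  # single words lose the negation
print(k_word_phrases(sentence, 2))  # two-word phrases keep "not friendly" together
```

With single words, "friendly" counts the same whether or not it was negated; the two-word phrases retain that bit of local context.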
Extending Text beyond “bag of words”
Blocking the Future
Bearman, Peter S., Robert Farris, and James Moody. “Blocking the Future: New Solutions for Old Problems in Historical Social Science.”
Social Science History 23: 501-535.
Extending Text beyond “bag of words”
Blocking the Future
One villager's life story
Extending Text beyond “bag of words”
Blocking the Future
Combined narratives from multiple interviews
Methods: Review Ego-Networks.
1) Go over network drawing programs
2) Go over ego-network creation programs
3) Go over ego-network measures programs
4) Go over persons-through-groups creation programs
