### Session1 - Carnegie Mellon University

```Tetrad: Machine Learning and
Graphical Causal Models
Richard Scheines
Joe Ramsey
Carnegie Mellon University
Peter Spirtes, Clark Glymour
1
Goals
1) Convey rudiments of graphical causal models
2) Basic working knowledge of Tetrad IV
2
Tetrad IV: Complete Causal Modeling Tool
3
1) Main website: http://www.phil.cmu.edu/projects/tetrad/
3) Data files:
4
Topic Outline
1) Motivation
2) Representing/Modeling Causal Systems
3) Estimation and Updating
4) Model Search
5) Linear Latent Variable Models
6) Case Study: fMRI
5
Statistical Causal Models: Goals
1) Policy, Law, and Science: How can we use data to answer
a) subjunctive questions (effects of future policy interventions), or
b) counterfactual questions (what would have happened had things
been done differently?) (law), or
c) scientific questions (what mechanisms run the world)?
2) Rumsfeld Problem: Do we know what we do and don’t know? Can we
tell when there is or is not enough information in the data to answer
causal questions?
6
Causal Inference Requires More than Probability
Prediction from Observation ≠ Prediction from Intervention
P(Lung Cancer 1960 = y | Tar-stained fingers 1950 = no)
≠
P(Lung Cancer 1960 = y | Tar-stained fingers 1950 set = no)
In general: P(Y=y | X=x, Z=z) ≠ P(Y=y | Xset=x, Z=z)
Causal Prediction vs. Statistical Prediction:
[Diagram with nodes: Non-experimental data (observational study); Background Knowledge; P(Y,X,Z); Causal Structure; P(Y=y | X=x, Z=z); P(Y=y | Xset=x, Z=z)]
7
Foundations of Causal Epistemology
Some Causal Structures can parameterize the
same set of probability distributions, some cannot
[Figure: four causal graphs over X, Y, Z, grouped under the distributions P1(X,Y,Z) and P2(X,Y,Z)]
8
Causal Search
Causal Search:
1. Find/compute all the causal models that are
indistinguishable given background knowledge and data
2. Represent features common to all such models
Multiple Regression is often the wrong tool for Causal Search:
Example: Foreign Investment & Democracy
9
Foreign Investment
Does Foreign Investment in 3rd World Countries
inhibit Democracy?
Timberlake, M. and Williams, K. (1984). Dependence, political
exclusion, and government repression: Some cross-national
evidence. American Sociological Review 49, 141-146.
N = 72
PO   degree of political exclusivity
CV   lack of civil liberties
EN   energy consumption per capita (economic development)
FI   level of foreign investment
10
Foreign Investment
Correlations
         fi       en       cv
po     -.175    -.480    0.868
fi              0.330    -.391
en                       -.430
11
Case Study 1: Foreign Investment
Regression Results
po =   .227*fi   - .176*en   + .880*cv
SE     (.058)     (.059)      (.060)
t      3.941      -2.99       14.6
Interpretation: foreign investment
increases political repression
12
Case Study 1: Foreign Investment
Alternatives
[Figure: three path diagrams over En, FI, CV, and PO, one labeled "Regression"; edge coefficients shown: .31, -.23, .217, .88, -.176, -.48, .86]
Fit: df = 2, χ² = 0.12, p-value = .94
There is no model with testable constraints (df > 0)
in which FI has a positive effect on PO that is not
rejected by the data.
Outline
1) Motivation
2) Representing/Modeling Causal Systems
   1) Causal Graphs
   2) Standard Parametric Models
      1) Bayes Nets
      2) Structural Equation Models
   3) Other Parametric Models
      1) Generalized SEMs
      2) Time Lag models
14
Causal Graphs
Causal Graph G = {V,E}
Each edge X → Y represents a direct causal claim:
X is a direct cause of Y relative to V
[Figure: two causal graphs: (a) Years of Education → Income; (b) Years of Education → Skills and Knowledge → Income]
15
Causal Graphs
Not Cause Complete
[Figure: graph over Education, Income, Happiness, with Omitted Causes]
Common Cause Complete
[Figure: graph over Education, Income, Happiness, with Omitted Common Causes]
16
Modeling Ideal Interventions
Interventions on the Effect
[Figure: "Pre-experimental System" and "Post" panels over Room Temperature and Sweaters On]
17
Modeling Ideal Interventions
Interventions on the Cause
[Figure: "Pre-experimental System" and "Post" panels over Room Temperature and Sweaters On]
18
Interventions & Causal Graphs
Model an ideal intervention by adding an “intervention” variable
outside the original system as a direct cause of its target.
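Pre-intervention graph
[Figure: graph over Education, Income, Taxes]
Intervene on Income
“Hard” Intervention
[Figure: graph over Education, Income, Taxes, with intervention variable I]
“Soft” Intervention
[Figure: graph over Education, Income, Taxes, with intervention variable I]
19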
Build and Save an acyclic causal graph:
1) with 3 measured variables, no latents
2) with at least 3 measured variables, and at least 1 latent
20
Parametric Models
21
Causal Bayes Networks
The Joint Distribution Factors According to the Causal Graph,
P(V) = ∏_{X ∈ V} P(X | Direct_causes(X))
[Figure: Smoking [0,1] → Yellow Fingers [0,1]; Smoking [0,1] → Lung Cancer [0,1]]
P(S, YF, LC) = P(S) P(YF | S) P(LC | S)
22
Causal Bayes Networks
The Joint Distribution Factors According to the Causal Graph,
P(V) = ∏_{X ∈ V} P(X | Direct_causes(X))
[Figure: Smoking [0,1] → Yellow Fingers [0,1]; Smoking [0,1] → Lung Cancer [0,1]]
P(S) P(YF | S) P(LC | S) = f(θ)
θ = {θ1, θ2, θ3, θ4, θ5}
All variables binary [0,1]:
P(S = 0) = θ1
P(S = 1) = 1 - θ1
P(YF = 0 | S = 0) = θ2
P(YF = 1 | S = 0) = 1 - θ2
P(YF = 0 | S = 1) = θ3
P(YF = 1 | S = 1) = 1 - θ3
P(LC = 0 | S = 0) = θ4
P(LC = 1 | S = 0) = 1 - θ4
P(LC = 0 | S = 1) = θ5
P(LC = 1 | S = 1) = 1 - θ5
23
24
Structural Equation Models
Causal Graph
[Figure: Education → Income; Education → Longevity]
• Structural Equations
For each variable X ∈ V, an assignment equation:
X := fX(immediate-causes(X), eX)
• Exogenous Distribution: joint distribution over the exogenous vars: P(e)
25
Linear Structural Equation Models
Causal Graph
[Figure: Education → Income; Education → Longevity]
Path diagram
[Figure: Education → Income (β1); Education → Longevity (β2); error terms eEducation, eIncome, eLongevity]
Equations:
Education := eEducation
Income := β1 Education + eIncome
Longevity := β2 Education + eLongevity
Exogenous Distribution:
P(eEducation, eIncome, eLongevity)
- ∀ i≠j: ei ⊥ ej (pairwise independence)
- no variance is zero
Structural Equation Model:
V = BV + E
E.g.
(eEducation, eIncome, eLongevity) ~ N(0, Σ²)
- Σ² diagonal,
- no variance is zero
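Worked example (a sketch; the coefficients and variances are assumed for illustration and are not from the slides):
Let the coefficient on Education be 0.5 in the Income equation and 0.3 in the Longevity equation,
and let each error term have variance 1, with all errors independent.
Var(Education) = 1
Var(Income) = 0.5² × 1 + 1 = 1.25
Var(Longevity) = 0.3² × 1 + 1 = 1.09
Cov(Education, Income) = 0.5
Cov(Education, Longevity) = 0.3
Cov(Income, Longevity) = 0.5 × 0.3 × Var(Education) = 0.15
In matrix form, V = BV + E gives V = (I − B)^(-1) E, so Cov(V) = (I − B)^(-1) Cov(E) (I − B)^(-T).
Under these assumptions, Income and Longevity covary only through their common cause, Education.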
26
1) Interpret your causal graph with 3 measured variables with at
least 2 parametric models:
a) Bayes Parametric Model
b) SEM Parametric Model
2) Interpret your other graph with a parametric model of your
choice
27
Instantiated Models
28
1) Instantiate at least one Bayes PM with a Bayes IM
2) Instantiate at least one SEM PM with a SEM IM
3) Instantiate at least one SEM PM with a Standardized SEM IM
4) Generate two data sets (N= 50, N=5,000) for each
29
Outline
1) Motivation
2) Representing/Modeling Causal Systems
   1) Causal Graphs
   2) Standard Parametric Models
      1) Bayes Nets
      2) Structural Equation Models
   3) Other Parametric Models
      1) Generalized SEMs
      2) Time Lag models
30
Generalized SEM
1) The Generalized SEM is a generalization of the linear SEM model.
2) Allows for arbitrary connection functions
3) Allows for arbitrary distributions
4) Simulation from cyclic models supported.
Hands On
1) Create a DAG.
2) Parameterize it as a Generalized SEM.
3) Open the Generalized SEM and select Apply Templates from the
4) Apply the default template to variables, which will make them all
linear functions.
5) For errors, select a non-Gaussian distribution, such as U(0, 1).
6) Save.
Time Series Simulation (Hands On)
1) Tetrad includes support for doing time series simulations.
2) First, one creates a time series graph.
3) Then one parameterizes the time series graph as a SEM.
4) Then one instantiates the SEM.
5) Then one simulates data from the SEM Instantiated Model.
Time Series Simulation
• One can, e.g., calculate a vector auto-regression for it. (One can do this as well from time series data loaded in.)
   • Attach a data manipulation box to the data.
   • Select vector auto-regression.
• One can create staggered time series data.
   • Attach a data manipulation box.
   • Select create time series data.
   • Should give the time lag graph with some extra edges in the highest lag.
Estimation
35
1) Estimate one Bayes PM for which you have an IM and data
2) Estimate one SEM PM for which you have an IM and data
3) Import data from charity.txt, and build and estimate two
models on those data
36
Hypothesis 1
Hypothesis 2
37
Updating
38