An Introduction to
Multivariate Models
Sarah Medland
SGDP Summer School
July 2010
Structural Equation Modelling (SEM)
Twin Model: Hypothesised Sources of Variation
•Univariate Model
•Bivariate Model
•Multivariate Models
Model specified through: Model Equations, Path Diagrams, Matrix Algebra, Path Tracing Rules
Predicted Var/Cov from Model is compared with Observed Var/Cov from Data
Topics of Discussion:
Extensions to multiple variables (3 or more)
Choosing between:
Cholesky Decomposition
Independent Pathway model
Common Pathway model
Multivariate analysis
Univariate analysis: genetic and environmental
influences on the variance of one trait
Bivariate analysis: genetic and environmental
influences on the covariance between two traits
Multivariate analysis: genetic and environmental
basis of the covariance between multiple traits
Multiple phenotypes
Comorbid phenotypes
Diagnostic subtypes (e.g. anxiety: panic, social, separation)
Different dimensions (e.g. cognitive abilities)
Different raters
Self report, Mother-report, Teacher-report, Observational
Longitudinal data
Time-point 1, Time-point 2, Time-point 3, Time-point 4
Multivariate models
[Figure: Venn diagram of Trait 1, Trait 2 and Trait 3; overlapping regions are common variance, non-overlapping regions are specific variance]
Why do these phenotypes covary?
Multivariate models
Different models make different assumptions about the
nature of shared causes among multiple phenotypes
Cholesky Decomposition (Correlated factors solution)
Genetic and environmental factors on different variables
correlate
Independent pathway model
Specific and Common genetic and environmental causes
Common pathway model
Latent Psychometric factor mediates common genetic and
environmental effects
Cholesky Decomposition
[Path diagram: Twin 1 and Twin 2, each with four measures (Mum, Teacher, Examiner, Child); every measure has its own A, C and E factors]
Number of A, C and E factors = nvar
The A Structure
[Path diagram: A factors and their paths to Mum, Teacher, Examiner and Child in Twin 1 and Twin 2]
Number of A paths: [ nvar*(nvar+1) ] / 2
(4*5)/2 = 10
The C Structure
[Path diagram: C factors and their paths to Mum, Teacher, Examiner and Child in Twin 1 and Twin 2]
Number of C paths: [ nvar*(nvar+1) ] / 2
(4*5)/2 = 10
The E Structure
[Path diagram: E factors and their paths to Mum, Teacher, Examiner and Child in Twin 1 and Twin 2]
Number of E paths: [ nvar*(nvar+1) ] / 2
(4*5)/2 = 10
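The path counts on the A, C and E structure slides can be checked with a few lines of base R. This is a minimal sketch; the helper name `n_chol_paths` is made up for illustration and is not part of OpenMx.

```r
# Number of free paths in one lower-triangular Cholesky factor (A, C or E).
# n_chol_paths is a hypothetical helper, not an OpenMx function.
n_chol_paths <- function(nvar) nvar * (nvar + 1) / 2

nvar <- 4                           # Mum, Teacher, Examiner, Child
per_source  <- n_chol_paths(nvar)   # paths for one source of variance
total_paths <- 3 * per_source       # A, C and E each get a lower triangle

# Cross-check: count the cells on and below the diagonal of an nvar x nvar matrix.
lower_cells <- sum(lower.tri(matrix(0, nvar, nvar), diag = TRUE))

print(c(per_source = per_source, total = total_paths, lower_cells = lower_cells))
```

The same count applies to any number of variables, which is why the Cholesky model becomes parameter-heavy quickly as measures are added.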
Cholesky Decomposition
[Path diagram: full Cholesky decomposition for Mum, Teacher, Examiner and Child in Twin 1 and Twin 2]
Cross-twin correlation between A factors: rMZ = 1; rDZ = 0.5
Cross-twin correlation between C factors: rMZ/DZ = 1
Correlated factors solution
am=am2
A
E
A
aT
Mum
A
C
C
am
E
Teacher
rA (M-T)
rA (M-E)
rA (M-C)
E
A
C
aE
Examiner
E
C
aC
Child
rA (T-E)
rA (T-C)
rA (E-C)
Correlated factors solution
cm=cm2
A
E
C
A
E
C
cm
Mum
A
E
C
rC (M-T)
rC (M-E)
rC (M-C)
E
C
cE
cT
Teacher
A
Examiner
cC
Child
rC (T-E)
rC (T-C)
rC (E-C)
Correlated factors solution
em=em2
A
E
C
A
E
C
em
Mum
A
E
A
C
rE (M-T)
rE (M-E)
rE (M-C)
C
eE
eT
Teacher
E
Examiner
eC
Child
rE (T-E)
rE (T-C)
rE (E-C)
Correlated factors solution
Assumptions
1. Each variable (e.g. Mother-rating) is influenced
by a set of genetic, shared and non-shared
environmental factors
2. The factors associated with each variable are
allowed to correlate with each other through rA, rC
and rE
3. Correlations among phenotypes are a function of
rA, rC and rE and the standardized A, C and E
paths connecting them
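Assumption 3 can be made concrete with a small base R sketch: the phenotypic correlation between two measures is the sum of three chains, each a standardized path times a factor correlation times a standardized path. All numbers below are hypothetical and chosen only for illustration; they are not estimates from the E-risk data.

```r
# HYPOTHETICAL standardized variance components for two measures.
std_var <- rbind(mum     = c(a2 = 0.5, c2 = 0.2, e2 = 0.3),
                 teacher = c(a2 = 0.6, c2 = 0.1, e2 = 0.3))

# Standardized paths are the square roots of the standardized components.
paths <- sqrt(std_var)

# HYPOTHETICAL genetic, shared and non-shared environmental correlations.
r <- c(rA = 0.6, rC = 0.5, rE = 0.1)

# Contribution of each source: path(Mum) * correlation * path(Teacher).
contrib <- paths["mum", ] * r * paths["teacher", ]
names(contrib) <- c("A", "C", "E")

# Phenotypic correlation between Mum and Teacher ratings.
r_phen <- sum(contrib)

print(round(contrib, 3))
print(round(r_phen, 3))
```

Dividing each contribution by `r_phen` gives the proportion of the phenotypic correlation attributable to A, C and E.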
Independent & Common
Partition variance between variables:
1) Common variance:
variance that is shared by all measured variables
2) Specific variance:
variance that is not shared by the measured variables
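[Figure: Venn diagram of Trait 1, Trait 2 and Trait 3; overlapping regions marked C (common), non-overlapping regions marked S (specific)]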
What might common and specific variance represent?
1) Comorbid phenotypes: (e.g. anxiety subtypes)
Common variance = general liability to emotional reactivity
Specific variance = symptom-specific risks
2) Different raters: (e.g. mother, teacher, child reports)
Common variance = pervasive liability to reported behaviour
Specific variance = situation-specific behaviour
Independent pathway model
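[Path diagram: in each twin, common A, C and E factors load on all four measures (Mum, Teacher, Examiner, Child); each measure also has its own specific A, C and E factors]
Cross-twin correlation between A factors: rMZ = 1, rDZ = 0.5
Cross-twin correlation between C factors: rMZ / DZ = 1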
Independent pathway model
Assumptions
1. Each variable (e.g. mother rating) has variation
that is shared with other variables
“Common” genetic and environmental factors
2. Each variable is also influenced by unique
variance not shared with other variables
“Specific” genetic and environmental factors
3. Covariation among phenotypes may be due to
the same genetic or environmental causes
Independent pathway model
Example
To examine the etiology of comorbidity: e.g. the
separate symptom clusters of anxiety and
depression are influenced by the same genetic
factors
Conclusion: genes act largely in a non-specific
way to influence the overall level of psychiatric
symptoms. Separable anxiety and depression
symptom clusters in the general population are
largely the result of environmental factors
(Kendler KS et al., Arch Gen Psych, 1987)
Common pathway model
[Path diagram: in each twin, A, C and E factors load on a single latent factor, which loads on Mum, Teacher, Examiner and Child through fmum, fteacher, fexaminer and fchild; each measure also has its own specific A, C and E factors]
Cross-twin correlation between A factors: rMZ = 1, rDZ = 0.5
Cross-twin correlation between C factors: rMZ / DZ = 1
Common pathway model
Assumptions
1. Each variable (e.g. mother rating) has variation
that is shared with other variables and variation
that is specific
2. Common genetic and environmental variance is
captured by a latent psychometric factor (e.g.
pervasive or situation independent behaviour;
general liability to anxiety)
3. Covariation among phenotypes is due to the
effects of the common psychometric factor on
each variable
Common pathway model
Example
To study the etiology of comorbidity: e.g.
conduct disorder, ADHD, substance
experimentation and novelty seeking, used as
indices of a latent behavioral disinhibition trait
→ h² = 0.84
Conclusion: a variety of adolescent problem
behaviours may share a common underlying
genetic risk
[Young et al., Am. J. Med. Genet. (Neuropsychiatric Genet.), 2000].
Observed Statistics and df
• (Theoretical) Observed Summary Statistics
  - Maximum Likelihood Analysis using summary matrices as input
  - Theoretical degrees of freedom = N observed summary statistics – N estimated parameters
• (Actual) Observed Statistics
  - Full Maximum Likelihood Analysis using raw data as input
  - Number of available data points
  - (Actual) degrees of freedom = N observed statistics – N estimated parameters
Exercise 1
We will run these models on the 4 antisocial measures
1. How many observed summary statistics will there be?
a. Consider size of observed variance-covariance matrix
b. Consider number of observed means
Note: there are MZ and DZ pairs
2. How many parameters are estimated for
a. Cholesky Decomposition?
b. Independent pathway model?
c. Common pathway model?
Note: don’t forget means are estimated in each model too
3. How many theoretical degrees of freedom will each
model have?
Note: The variance of the common latent factor is constrained to be 1
Note: Df = observed statistics – estimated parameters
Variance - Covariance Matrix
8 x 8 matrix (lower triangle shown; the matrix is symmetric)
        T1V1  T1V2  T1V3  T1V4  T2V1  T2V2  T2V3  T2V4
T1V1    var
T1V2    cov   var
T1V3    cov   cov   var
T1V4    cov   cov   cov   var
T2V1    cov   cov   cov   cov   var
T2V2    cov   cov   cov   cov   cov   var
T2V3    cov   cov   cov   cov   cov   cov   var
T2V4    cov   cov   cov   cov   cov   cov   cov   var
Variance - Covariance Matrix
MZ Twins
8 x 8 symmetrical matrix
= (8x9)/2= 36 summary statistics
DZ Twins
8 x 8 symmetrical matrix
= (8x9)/2= 36 summary statistics
Total number = 72
Observed means
1 x 8 matrix
T1V1   T1V2   T1V3   T1V4   T2V1   T2V2   T2V3   T2V4
Mean1  Mean2  Mean3  Mean4  Mean1  Mean2  Mean3  Mean4
MZ Twins
1 x 8 Full matrix = 8 summary statistics
DZ Twins
1 x 8 Full matrix = 8 summary statistics
Total number = 16
Total number of Summary statistics
Summary statistics = variance-covariances + means
Summary statistics = 72 + 16 = 88
Parameter estimates and DF
Model                 Estimated parameters   Observed summary statistics   Theoretical df
Cholesky              ??                     88
Independent pathway                          88
Common pathway                               88
Cholesky Decomposition
[Path diagram: Cholesky decomposition for Mum, Teacher, Examiner and Child in both twins]
Paths: (4x5)/2 = 10 each for A, C and E = 30
Means (equated across birth order and zygosity) = 4
Total = 34 estimated parameters
Parameter estimates and DF
Model                 Estimated parameters   Observed summary statistics   Theoretical df
Cholesky              34                     88                            54
Independent pathway   ??                     88
Common pathway                               88
Independent pathway model
[Path diagram: independent pathway model for Mum, Teacher, Examiner and Child in Twin 1 and Twin 2]
Common factor paths: 4 each for A, C and E = 12
Specific factor paths: 4 each for A, C and E = 12
+ 4 means
Total = 28 estimated parameters
Parameter estimates and DF
Model                 Estimated parameters   Observed summary statistics   Theoretical df
Cholesky              34                     88                            54
Independent pathway   28                     88                            60
Common pathway        ??                     88
Common pathway model
[Path diagram: common pathway model for Mum, Teacher, Examiner and Child in Twin 1 and Twin 2; rMZ = 1, rDZ = 0.5 between A factors; rMZ / DZ = 1 between C factors]
Paths from A, C and E to the latent factor = 3
Factor loadings (fmum, fteacher, fexaminer, fchild) = 4
Specific factor paths: 4 each for A, C and E = 12
+ 4 means
Total = 23 estimated parameters
NB. There is a constraint – how is it accounted for in Mx?
Parameter estimates and DF
Model                 Estimated parameters   Observed summary statistics   Theoretical df
Cholesky              34                     88                            54
Independent pathway   28                     88                            60
Common pathway        23                     89 (88+1)                     66
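The counts in this table can be reproduced with base R arithmetic. The sketch below follows the counting rules on the preceding slides (the constraint on the latent factor variance is counted as one extra observed statistic for the common pathway model); the helper name `n_cov` is made up for illustration.

```r
nvar <- 4   # measures per twin
nf   <- 1   # common factors

# Unique elements of a symmetric k x k covariance matrix (hypothetical helper).
n_cov <- function(k) k * (k + 1) / 2

# Two zygosity groups, each with an 8 x 8 covariance matrix and 8 means.
obs_stats <- 2 * (n_cov(2 * nvar) + 2 * nvar)

n_means <- nvar   # means equated across birth order and zygosity

params <- c(
  cholesky    = 3 * n_cov(nvar) + n_means,                 # 3 lower triangles
  independent = 3 * nvar * nf + 3 * nvar + n_means,        # common + specific paths
  common      = 3 * nf + nvar * nf + 3 * nvar + n_means    # latent paths + loadings + specifics
)

# The variance constraint adds one observed statistic to the common pathway model.
stats <- c(cholesky = obs_stats, independent = obs_stats, common = obs_stats + 1)

df <- stats - params
print(rbind(params, stats, df))
```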
Comparing models
Correlated Factors → Independent pathway → Common pathway
Moving to the right: most restricted, most parsimonious, fewest parameters
OpenMx Script for
Independent Pathway model
Note: Scripts are at the end of these slides in your handout
Script
nvar <- 4    # number of variables
nf <- 1      # number of factors
ACE_Independent_Model <- mxModel("ACE_Independent",
    mxModel("ACE",
        # Paths from the common A, C and E factors to each variable (nvar x nf)
        mxMatrix( type="Full", nrow=nvar, ncol=nf, free=TRUE,
                  values=.6, name="ac" ),
        mxMatrix( type="Full", nrow=nvar, ncol=nf, free=TRUE,
                  values=.6, name="cc" ),
        mxMatrix( type="Full", nrow=nvar, ncol=nf, free=TRUE,
                  values=.6, name="ec" ),
[Path diagram: common A factor with paths acv1, acv2, acv3, acv4 to V1-V4 in Twin 1 and Twin 2; rMZ = 1, rDZ = 0.5 between A factors; rMZ / DZ = 1 between C factors]
Matrix ac = additive genetic path coefficients of the common A factor
ac (Full 4x1):
              Ac
Variable 1    acv1
Variable 2    acv2
Variable 3    acv3
Variable 4    acv4
[Path diagram: common C factor with paths ccv1, ccv2, ccv3, ccv4 to V1-V4 in Twin 1 and Twin 2; rMZ = 1, rDZ = 0.5 between A factors; rMZ / DZ = 1 between C factors]
Matrix cc = shared environmental path coefficients of the common C factor
cc (Full 4x1):
              Cc
Variable 1    ccv1
Variable 2    ccv2
Variable 3    ccv3
Variable 4    ccv4
[Path diagram: common E factor with paths ecv1, ecv2, ecv3, ecv4 to V1-V4 in Twin 1 and Twin 2; rMZ = 1, rDZ = 0.5 between A factors; rMZ / DZ = 1 between C factors]
Matrix ec = non-shared environmental path coefficients of the common E factor
ec (Full 4x1):
              Ec
Variable 1    ecv1
Variable 2    ecv2
Variable 3    ecv3
Variable 4    ecv4
Script
        # Paths from the specific A, C and E factors (one per variable, on the diagonal)
        mxMatrix( type="Diag", nrow=nvar, ncol=nvar, free=TRUE,
                  values=.4, name="as" ),
        mxMatrix( type="Diag", nrow=nvar, ncol=nvar, free=TRUE,
                  values=.4, name="cs" ),
        mxMatrix( type="Diag", nrow=nvar, ncol=nvar, free=TRUE,
                  values=.5, name="es" ),
Matrix as = genetic path coefficients of the specific A factors
as (Diag 4x4):
              AV1    AV2    AV3    AV4
Variable 1    asV1   0      0      0
Variable 2    0      asV2   0      0
Variable 3    0      0      asV3   0
Variable 4    0      0      0      asV4
[Path diagram: specific A factors with paths asV1-asV4 to V1-V4 in each twin]
Cross-twin correlations between A and C factors omitted
Matrix cs = shared environment path coefficients of the specific C factors
cs (Diag 4x4):
              CV1    CV2    CV3    CV4
Variable 1    csV1   0      0      0
Variable 2    0      csV2   0      0
Variable 3    0      0      csV3   0
Variable 4    0      0      0      csV4
[Path diagram: specific C factors with paths csV1-csV4 to V1-V4 in each twin]
Matrix es = non-shared environment path coefficients of the specific E factors
es (Diag 4x4):
              EV1    EV2    EV3    EV4
Variable 1    esV1   0      0      0
Variable 2    0      esV2   0      0
Variable 3    0      0      esV3   0
Variable 4    0      0      0      esV4
[Path diagram: specific E factors with paths esV1-esV4 to V1-V4 in each twin]
Script
mxAlgebra(ac %*% t(ac) + as %*% t(as), name="A" ),
mxAlgebra(cc %*% t(cc) + cs %*% t(cs), name="C" ),
mxAlgebra(ec %*% t(ec) + es %*% t(es), name="E" ),
Matrix A = variance components of the common A factor
plus variance components of the specific A factors
ac %*% t(ac) + as %*% t(as)
ac (Full 4x1) holds the common-factor paths acV1, acV2, acV3, acV4 for Variables 1 to 4.
ac %*% t(ac) = 4x1 * 1x4 = 4x4:
$$
ac\,ac^{\top} =
\begin{pmatrix}
ac_{V1}^2 & ac_{V1}ac_{V2} & ac_{V1}ac_{V3} & ac_{V1}ac_{V4} \\
ac_{V2}ac_{V1} & ac_{V2}^2 & ac_{V2}ac_{V3} & ac_{V2}ac_{V4} \\
ac_{V3}ac_{V1} & ac_{V3}ac_{V2} & ac_{V3}^2 & ac_{V3}ac_{V4} \\
ac_{V4}ac_{V1} & ac_{V4}ac_{V2} & ac_{V4}ac_{V3} & ac_{V4}^2
\end{pmatrix}
$$
ac %*% t(ac) + as %*% t(as)
as (Diag 4x4) holds the specific-factor paths asV1, asV2, asV3, asV4 on the diagonal and zeros elsewhere.
as %*% t(as) = 4x4 * 4x4 = 4x4:
$$
as\,as^{\top} =
\begin{pmatrix}
as_{V1}^2 & 0 & 0 & 0 \\
0 & as_{V2}^2 & 0 & 0 \\
0 & 0 & as_{V3}^2 & 0 \\
0 & 0 & 0 & as_{V4}^2
\end{pmatrix}
$$
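Matrix A = variance components of the common A factor
plus variance components of the specific A factors
ac %*% t(ac) + as %*% t(as)
$$
A =
\begin{pmatrix}
ac_{V1}^2 + as_{V1}^2 & ac_{V1}ac_{V2} & ac_{V1}ac_{V3} & ac_{V1}ac_{V4} \\
ac_{V2}ac_{V1} & ac_{V2}^2 + as_{V2}^2 & ac_{V2}ac_{V3} & ac_{V2}ac_{V4} \\
ac_{V3}ac_{V1} & ac_{V3}ac_{V2} & ac_{V3}^2 + as_{V3}^2 & ac_{V3}ac_{V4} \\
ac_{V4}ac_{V1} & ac_{V4}ac_{V2} & ac_{V4}ac_{V3} & ac_{V4}^2 + as_{V4}^2
\end{pmatrix}
\quad (4 \times 4)
$$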
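Matrix C = variance components of the common C factor
plus variance components of the specific C factors
cc %*% t(cc) + cs %*% t(cs)
$$
C =
\begin{pmatrix}
cc_{V1}^2 + cs_{V1}^2 & cc_{V1}cc_{V2} & cc_{V1}cc_{V3} & cc_{V1}cc_{V4} \\
cc_{V2}cc_{V1} & cc_{V2}^2 + cs_{V2}^2 & cc_{V2}cc_{V3} & cc_{V2}cc_{V4} \\
cc_{V3}cc_{V1} & cc_{V3}cc_{V2} & cc_{V3}^2 + cs_{V3}^2 & cc_{V3}cc_{V4} \\
cc_{V4}cc_{V1} & cc_{V4}cc_{V2} & cc_{V4}cc_{V3} & cc_{V4}^2 + cs_{V4}^2
\end{pmatrix}
\quad (4 \times 4)
$$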
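Matrix E = variance components of the common E factor
plus variance components of the specific E factors
ec %*% t(ec) + es %*% t(es)
$$
E =
\begin{pmatrix}
ec_{V1}^2 + es_{V1}^2 & ec_{V1}ec_{V2} & ec_{V1}ec_{V3} & ec_{V1}ec_{V4} \\
ec_{V2}ec_{V1} & ec_{V2}^2 + es_{V2}^2 & ec_{V2}ec_{V3} & ec_{V2}ec_{V4} \\
ec_{V3}ec_{V1} & ec_{V3}ec_{V2} & ec_{V3}^2 + es_{V3}^2 & ec_{V3}ec_{V4} \\
ec_{V4}ec_{V1} & ec_{V4}ec_{V2} & ec_{V4}ec_{V3} & ec_{V4}^2 + es_{V4}^2
\end{pmatrix}
\quad (4 \times 4)
$$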
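The algebra for A (and likewise C and E) can be tried out in base R without OpenMx. The path values below are hypothetical, chosen only so the arithmetic is easy to follow.

```r
# HYPOTHETICAL path coefficients for four variables.
ac <- matrix(c(0.6, 0.5, 0.4, 0.3), nrow = 4, ncol = 1)  # common A factor (Full 4x1)
as <- diag(c(0.4, 0.3, 0.5, 0.2))                        # specific A factors (Diag 4x4)

common_part   <- ac %*% t(ac)   # 4x4: products of common paths
specific_part <- as %*% t(as)   # 4x4: squared specific paths on the diagonal only

A <- common_part + specific_part

# Diagonal: common plus specific genetic variance for each variable.
# Off-diagonal: genetic covariance, which comes from the common factor alone.
print(round(A, 3))
```

Because `as` is diagonal, the specific factors contribute nothing to the covariances between variables, which is the defining feature of the independent pathway model.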
Script
mxAlgebra(A+C+E, name="V" ),
mxMatrix( type="Iden", nrow=nvar, ncol=nvar,
name="I"),
mxAlgebra( solve(sqrt(I*V)), name="iSD"),
mxMatrix( type="Full", nrow=1, ncol=nvar,
free=TRUE, values= 0, name="M" ),
mxAlgebra( cbind(M,M), name="expMean"),
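Matrix V =
• Total variance/covariance matrix of measured variables
• V = A+C+E
• Variances sum the squared common and specific paths:
$V_{11} = ac_{V1}^2 + as_{V1}^2 + cc_{V1}^2 + cs_{V1}^2 + ec_{V1}^2 + es_{V1}^2$
• Covariances involve only the common-factor paths:
$V_{21} = ac_{V2}ac_{V1} + cc_{V2}cc_{V1} + ec_{V2}ec_{V1}$
$V_{31} = ac_{V3}ac_{V1} + cc_{V3}cc_{V1} + ec_{V3}ec_{V1}$
...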
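iSD
mxMatrix( type="Iden", nrow=nvar, ncol=nvar,
name="I"),
mxAlgebra( solve(sqrt(I*V)), name="iSD"),
• Multiplying the var/cov matrix elementwise by an identity matrix yields a matrix with variances on the diagonal and zeros on the off-diagonals
• Taking the square root yields standard deviations on the diagonal
• The inverse of this yields a matrix with 1/SD on the diagonal and zeros on the off-diagonals
$$
I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
V = \begin{pmatrix} var_1 & cov_{12} \\ cov_{12} & var_2 \end{pmatrix}, \quad
I * V = \begin{pmatrix} var_1 & 0 \\ 0 & var_2 \end{pmatrix}, \quad
iSD = \begin{pmatrix} 1/sd_1 & 0 \\ 0 & 1/sd_2 \end{pmatrix}
$$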
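The iSD expression behaves the same way in base R, where `*` is also elementwise multiplication and `solve()` inverts a matrix. The covariance values below are hypothetical.

```r
# HYPOTHETICAL 2 x 2 variance/covariance matrix.
V <- matrix(c(4.0, 1.2,
              1.2, 9.0), nrow = 2, byrow = TRUE)
I <- diag(2)

variances <- I * V          # elementwise product keeps only the diagonal
sds       <- sqrt(variances)   # standard deviations on the diagonal
iSD       <- solve(sds)        # inverse: 1/SD on the diagonal

# Pre- and post-multiplying V by iSD converts covariances to correlations.
R <- iSD %*% V %*% iSD

print(iSD)
print(R)
```

The same matrix is what turns raw path estimates into standardized ones later in the slides (`ACE.iSD %*% ACE.a`).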
Script
mxAlgebra( rbind ( cbind(A+C+E , A+C),
cbind(A+C , A+C+E)),
name="expCovMZ" ),
mxAlgebra( rbind ( cbind(A+C+E , 0.5%x%A+C),
cbind(0.5%x%A+C , A+C+E)),
name="expCovDZ" )
),
Total Common variance/covariance:
A+C+E
ACEvarv1   ACEcov12   ACEcov13   ACEcov14
ACEcov21   ACEvarv2   ACEcov23   ACEcov24
ACEcov31   ACEcov32   ACEvarv3   ACEcov34
ACEcov41   ACEcov42   ACEcov43   ACEvarv4
Common covariance between MZ twins:
A+C
ACvarv1    ACcov12    ACcov13    ACcov14
ACcov21    ACvarv2    ACcov23    ACcov24
ACcov31    ACcov32    ACvarv3    ACcov34
ACcov41    ACcov42    ACcov43    ACvarv4
Common covariance between DZ twins:
.5A+C
.5ACvarv1  .5ACcov12  .5ACcov13  .5ACcov14
.5ACcov21  .5ACvarv2  .5ACcov23  .5ACcov24
.5ACcov31  .5ACcov32  .5ACvarv3  .5ACcov34
.5ACcov41  .5ACcov42  .5ACcov43  .5ACvarv4
Expected covariance matrix for MZ pairs (8 x 8), in 4 x 4 blocks:
               T1V1-T1V4    T2V1-T2V4
T1V1-T1V4      A+C+E        A+C
T2V1-T2V4      A+C          A+C+E
Diagonal blocks: within-twin variances (ACEvarV1 ... ACEvarV4) and covariances (ACEcov)
Off-diagonal blocks: cross-twin variances (ACvarV1 ... ACvarV4) and covariances (ACcov)
Expected covariance matrix for DZ pairs (8 x 8), in 4 x 4 blocks:
               T1V1-T1V4    T2V1-T2V4
T1V1-T1V4      A+C+E        .5A+C
T2V1-T2V4      .5A+C        A+C+E
Diagonal blocks: within-twin variances (ACEvarV1 ... ACEvarV4) and covariances (ACEcov)
Off-diagonal blocks: cross-twin variances (.5ACvarV1 ... .5ACvarV4) and covariances (.5ACcov)
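The block structure of the two expected covariance matrices can be reproduced in base R with `rbind`, `cbind` and the Kronecker operator `%x%`, mirroring the mxAlgebra statements. The sketch uses two variables instead of four and hypothetical A, C and E matrices to keep the output small.

```r
# HYPOTHETICAL 2 x 2 variance component matrices (two variables, not four).
A <- matrix(c(0.50, 0.20, 0.20, 0.40), 2, 2)
C <- matrix(c(0.20, 0.05, 0.05, 0.10), 2, 2)
E <- matrix(c(0.30, 0.02, 0.02, 0.50), 2, 2)

# MZ pairs share all of A and all of C.
expCovMZ <- rbind(cbind(A + C + E, A + C),
                  cbind(A + C,     A + C + E))

# DZ pairs share half of A and all of C.
# %x% binds tighter than +, so this is (0.5 %x% A) + C.
expCovDZ <- rbind(cbind(A + C + E,     0.5 %x% A + C),
                  cbind(0.5 %x% A + C, A + C + E))

print(expCovMZ)
print(expCovDZ)
```

Only the off-diagonal blocks differ between zygosity groups, and that difference is what identifies A separately from C.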
Script
    mxModel("MZ",
        mxData( observed=mzData, type="raw" ),
        mxFIMLObjective( covariance="ACE.expCovMZ",
                         means="ACE.expMean", dimnames=selVars )
    ),
Script
    mxModel("DZ",
        mxData( observed=dzData, type="raw" ),
        mxFIMLObjective( covariance="ACE.expCovDZ",
                         means="ACE.expMean", dimnames=selVars )
    ),
OpenMx Script for
Common pathway model
Note: Scripts are at the end of these slides in your handout
Script
ACE_Common_Model <- mxModel("ACE_Common",
    mxModel("ACE",
        # Paths from A, C and E to the common latent factor (nf x nf)
        mxMatrix( type="Lower", nrow=nf, ncol=nf,
                  free=TRUE, values=.6, name="al" ),
        mxMatrix( type="Lower", nrow=nf, ncol=nf,
                  free=TRUE, values=.6, name="cl" ),
        mxMatrix( type="Lower", nrow=nf, ncol=nf,
                  free=TRUE, values=.6, name="el" ),
[Path diagram: A factor with path al to the latent factor in each twin; rMZ = 1, rDZ = 0.5 between A factors; rMZ / DZ = 1 between C factors; latent factor variance fixed to 1]
Matrix al = path coefficient from A to the common latent factor
al (1x1):
                 A
Latent factor    al
[Path diagram: C and E factors with paths cl and el to the latent factor in each twin; rMZ = 1, rDZ = 0.5 between A factors; rMZ / DZ = 1 between C factors; latent factor variance fixed to 1]
Matrix cl = path coefficient from C to the common latent factor
cl (1x1):
                        C
Common latent factor    cl
Matrix el = path coefficient from E to the common latent factor
el (1x1):
                        E
Common latent factor    el
Script
mxAlgebra( al %*% t(al) + cl %*% t(cl) + el %*% t(el),
name="CovarLP" ),
mxAlgebra( diag2vec(CovarLP), name="VarLP" ),
mxMatrix( type="Unit", nrow=nf, ncol=1, name="Unit"),
mxConstraint (VarLP == Unit),
CovarLP = variance of the latent factor
al %*% t(al) + cl %*% t(cl) + el %*% t(el)
al (1x1) holds the path from A to the common latent factor, so
al %*% t(al) = 1x1 * 1x1 = $al^2$ (1x1)
CovarLP = $al^2 + cl^2 + el^2$ (1x1)
mxAlgebra( diag2vec(CovarLP), name="VarLP" )
VarLP = $al^2 + cl^2 + el^2$ (1x1)
mxConstraint (VarLP == Unit),
VarLP = $al^2 + cl^2 + el^2 = 1$
Script
mxMatrix( type="Diag", nrow=nvar, ncol=nvar,
free=TRUE, values=.4, name="as" ),
mxMatrix( type="Diag", nrow=nvar, ncol=nvar,
free=TRUE, values=.4, name="cs" ),
mxMatrix( type="Diag", nrow=nvar, ncol=nvar,
free=TRUE, values=.5, name="es" ),
Matrix as = genetic path coefficients of the specific A factors
as (Diag 4x4): asV1, asV2, asV3, asV4 on the diagonal, zeros elsewhere
Matrix cs = shared environment path coefficients of the specific C factors
cs (Diag 4x4): csV1, csV2, csV3, csV4 on the diagonal, zeros elsewhere
Matrix es = non-shared environment path coefficients of the specific E factors
es (Diag 4x4): esV1, esV2, esV3, esV4 on the diagonal, zeros elsewhere
[Path diagrams: specific A, C and E factors with one path to each of V1-V4 in each twin; cross-twin correlations between A and C factors omitted]
Script
mxMatrix( type="Full", nrow=nvar, ncol=nf,
free=TRUE, values=1, name="f" ),
f = Loadings that partition the proportion of variance /
covariance due to the latent factor
f (Full 4x1):
              Common latent factor
Variable 1    fv1
Variable 2    fv2
Variable 3    fv3
Variable 4    fv4
[Path diagram: latent factor with loadings fv1, fv2, fv3, fv4 on V1-V4 in each twin; each variable also has specific A, C and E factors]
Script
mxAlgebra( f %&% (al %*% t(al)) + as %*% t(as),
name="A" ),
mxAlgebra( f %&% (cl %*% t(cl)) + cs %*% t(cs),
name="C" ),
mxAlgebra( f %&% (el %*% t(el)) + es %*% t(es),
name="E" ),
Matrix A = variance components of the common A factor
plus variance components of the specific A factors
f %&% (al %*% t(al)) + as %*% t(as)
f (Full 4x1) holds the loadings fv1, fv2, fv3, fv4 on the common latent factor; al (1x1) holds the path from the latent A factor.
f %&% (al %*% t(al)) = f * al * al * t(f) = 4x4:
$$
\begin{pmatrix}
f_{v1}^2 al^2 & f_{v1}f_{v2}al^2 & f_{v1}f_{v3}al^2 & f_{v1}f_{v4}al^2 \\
f_{v2}f_{v1}al^2 & f_{v2}^2 al^2 & f_{v2}f_{v3}al^2 & f_{v2}f_{v4}al^2 \\
f_{v3}f_{v1}al^2 & f_{v3}f_{v2}al^2 & f_{v3}^2 al^2 & f_{v3}f_{v4}al^2 \\
f_{v4}f_{v1}al^2 & f_{v4}f_{v2}al^2 & f_{v4}f_{v3}al^2 & f_{v4}^2 al^2
\end{pmatrix}
$$
as (Diag 4x4) holds asV1, asV2, asV3, asV4 on the diagonal, so as %*% t(as) = 4x4 * 4x4 is diagonal:
$$
as\,as^{\top} =
\begin{pmatrix}
as_{V1}^2 & 0 & 0 & 0 \\
0 & as_{V2}^2 & 0 & 0 \\
0 & 0 & as_{V3}^2 & 0 \\
0 & 0 & 0 & as_{V4}^2
\end{pmatrix}
$$
Adding the two parts:
$$
A =
\begin{pmatrix}
f_{v1}^2 al^2 + as_{V1}^2 & f_{v1}f_{v2}al^2 & f_{v1}f_{v3}al^2 & f_{v1}f_{v4}al^2 \\
f_{v2}f_{v1}al^2 & f_{v2}^2 al^2 + as_{V2}^2 & f_{v2}f_{v3}al^2 & f_{v2}f_{v4}al^2 \\
f_{v3}f_{v1}al^2 & f_{v3}f_{v2}al^2 & f_{v3}^2 al^2 + as_{V3}^2 & f_{v3}f_{v4}al^2 \\
f_{v4}f_{v1}al^2 & f_{v4}f_{v2}al^2 & f_{v4}f_{v3}al^2 & f_{v4}^2 al^2 + as_{V4}^2
\end{pmatrix}
$$
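Matrix C = variance components of the common C factor
plus variance components of the specific C factors
f %&% (cl %*% t(cl)) + cs %*% t(cs)
$$
C =
\begin{pmatrix}
f_{v1}^2 cl^2 + cs_{V1}^2 & f_{v1}f_{v2}cl^2 & f_{v1}f_{v3}cl^2 & f_{v1}f_{v4}cl^2 \\
f_{v2}f_{v1}cl^2 & f_{v2}^2 cl^2 + cs_{V2}^2 & f_{v2}f_{v3}cl^2 & f_{v2}f_{v4}cl^2 \\
f_{v3}f_{v1}cl^2 & f_{v3}f_{v2}cl^2 & f_{v3}^2 cl^2 + cs_{V3}^2 & f_{v3}f_{v4}cl^2 \\
f_{v4}f_{v1}cl^2 & f_{v4}f_{v2}cl^2 & f_{v4}f_{v3}cl^2 & f_{v4}^2 cl^2 + cs_{V4}^2
\end{pmatrix}
$$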
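Matrix E = variance components of the common E factor
plus variance components of the specific E factors
f %&% (el %*% t(el)) + es %*% t(es)
$$
E =
\begin{pmatrix}
f_{v1}^2 el^2 + es_{V1}^2 & f_{v1}f_{v2}el^2 & f_{v1}f_{v3}el^2 & f_{v1}f_{v4}el^2 \\
f_{v2}f_{v1}el^2 & f_{v2}^2 el^2 + es_{V2}^2 & f_{v2}f_{v3}el^2 & f_{v2}f_{v4}el^2 \\
f_{v3}f_{v1}el^2 & f_{v3}f_{v2}el^2 & f_{v3}^2 el^2 + es_{V3}^2 & f_{v3}f_{v4}el^2 \\
f_{v4}f_{v1}el^2 & f_{v4}f_{v2}el^2 & f_{v4}f_{v3}el^2 & f_{v4}^2 el^2 + es_{V4}^2
\end{pmatrix}
$$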
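The common pathway algebra can also be checked in base R. OpenMx provides `%&%` as a quadratic product inside mxAlgebra; base R has no such operator, so the sketch below defines a stand-in with the same meaning (x %*% y %*% t(x)). All path values are hypothetical, with al, cl and el chosen so that the latent factor variance is 1, as the constraint requires.

```r
# Base-R stand-in for the quadratic product used in the script: x %&% y = x y x'.
`%&%` <- function(x, y) x %*% y %*% t(x)

# HYPOTHETICAL loadings of four variables on the latent factor (Full 4x1).
f <- matrix(c(0.8, 0.7, 0.6, 0.5), nrow = 4, ncol = 1)

# HYPOTHETICAL paths from A, C and E to the latent factor (each 1x1).
al <- matrix(sqrt(0.6), 1, 1)
cl <- matrix(sqrt(0.1), 1, 1)
el <- matrix(sqrt(0.3), 1, 1)

# Variance of the latent factor; the constraint requires this to equal 1.
VarLP <- al %*% t(al) + cl %*% t(cl) + el %*% t(el)

# HYPOTHETICAL specific A paths (Diag 4x4).
as <- diag(c(0.3, 0.3, 0.4, 0.2))

# Genetic variance/covariance: part routed through the latent factor plus specifics.
A <- f %&% (al %*% t(al)) + as %*% t(as)

print(VarLP)
print(round(A, 3))
```

Every genetic covariance here is a product of two loadings and al², so A, C and E covariances are forced to be proportional to each other; this is the extra restriction relative to the independent pathway model.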
Script
mxAlgebra( A+C+E, name="V" ),
mxMatrix( type="Iden", nrow=nvar, ncol=nvar,
name="I"),
mxAlgebra( solve(sqrt(I*V)), name="iSD"),
mxMatrix( type="Full", nrow=1, ncol=nvar, free=TRUE,
values= 0, name="M" ),
mxAlgebra( cbind(M,M), name="expMean"),
Script
mxAlgebra( rbind ( cbind(A+C+E , A+C),
cbind(A+C , A+C+E)),
name="expCovMZ" ),
mxAlgebra( rbind ( cbind(A+C+E , 0.5%x%A+C),
cbind(0.5%x%A+C , A+C+E)),
name="expCovDZ" )
),
Script
mxModel("MZ",
mxData( observed=mzData, type="raw" ),
mxFIMLObjective( covariance="ACE.expCovMZ",
means="ACE.expMean", dimnames=selVars )
),
Script
mxModel("DZ",
mxData( observed=dzData, type="raw" ),
mxFIMLObjective( covariance="ACE.expCovDZ",
means="ACE.expMean", dimnames=selVars )
),
Applications of multivariate genetic
modelling
Note: instructions and answer sheets for the practical
session are at the end of these slides in your handout
Sample characteristics
Aims of E-risk longitudinal study project:
To study the etiology of antisocial behaviour in 5-year-old twins (E-risk longitudinal study) using 4 independent sources of
information
Sample:
451 MZ twin pairs, 389 same-sex DZ twin pairs
Data:
Mothers, Teachers, Examiner-observer reports
(e.g. CBCL and DSM-IV items)
Child self-report (Berkeley Puppet Interview)
Twin 1 variables: Mr1 Tr1 Er1 Sr1
Twin 2 variables: Mr2 Tr2 Er2 Sr2
Research questions
Previous studies
Early childhood antisocial behaviour is a strong
prognostic indicator for poor adult mental
health. However, its genetic etiology is unknown
Current aims:
To examine the heritability of AB in children
using multiple informants to account for bias
Current analyses:
Test 3 multivariate models to examine the factor
structure common to the 4 measures of AB
Task 1: Preliminary analyses
1. Within-twin cross-measure correlations
2. Twin correlations within-measure
3. Cross-twin cross-measure correlations
Task 2: Model-fitting analyses
1. Run the saturated model (to obtain -2LL statistics
of multivariate models)
2. Test all three multivariate models; at the end of the
common pathway model, run the model
comparison syntax and select the model of best fit
(from comparison of sub-models)
3. Fill in the parameter estimates for each model in
path diagrams
4. Run confidence intervals for only the selected
elements (see handout)
Goodness-of-Fit measures
• AIC (Akaike Information Criterion) = χ² - 2df
• AIC is founded on ideas from information theory and gives a GoF
measure that penalises models for increasing complexity.
• AIC can be used for non-nested models.
• In comparing two models ∆AIC = AICi – AICmin, where AICi is the
AIC value for model i, and AICmin is the AIC value of the ‘best’
model (Burnham and Anderson, 2002). As a rule of thumb:
∆AIC     Evidence
< 2      substantial evidence for model i
3 − 7    model i has considerably less support
> 10     model i is very unlikely
Goodness-of-Fit measures
• BIC (Bayesian Information Criterion) = χ² - [df*ln(n)]
• Takes sample size into account, with increasingly negative values
corresponding to increasingly better fitting models.
• BIC can be used to compare non-nested models.
• In comparing two models, the difference in BIC gives an
estimate of the strength of evidence in favour of the model with
the smaller BIC value (Raftery, 1995)
Grades of Evidence Corresponding to BIC difference between
Model 1 and Model 2:
BIC Difference   Evidence
0 − 2            Weak
2 − 6            Positive
6 − 10           Strong
> 10             Very Strong
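A short base R sketch shows how the two criteria and their differences are computed from the formulas above. The χ² values are hypothetical and exist only to make the arithmetic visible; the df values are the theoretical df from the parameter table, and the sample size is an assumption for illustration.

```r
# HYPOTHETICAL chi-square fit statistics; df taken from the parameter table.
chisq <- c(cholesky = 60, independent = 70, common = 75)
df    <- c(cholesky = 54, independent = 60, common = 66)
n     <- 840   # assumed number of twin pairs, for illustration only

aic <- chisq - 2 * df         # AIC = chi-square - 2df
bic <- chisq - df * log(n)    # BIC = chi-square - df*ln(n)

# Differences relative to the best (lowest) model on each criterion.
delta_aic <- aic - min(aic)
delta_bic <- bic - min(bic)

best_aic <- names(which.min(aic))
best_bic <- names(which.min(bic))

print(rbind(aic, delta_aic))
print(round(rbind(bic, delta_bic), 1))
```

With these made-up inputs both criteria favour the same model; with real data they can disagree, because BIC rewards each saved parameter by ln(n) rather than 2.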
Overall Goodness of Fit
• Mx also produces a sample size adjusted BIC and Deviance
Information Criterion.
• AIC, BIC and DIC are useful for testing non-nested models.
For example comparing a model with genetic dominance
(ADE) to one with shared environment effects (ACE).
• Models with lower values are preferred.
• Markon & Krueger carried out a simulation study that
indicated that BIC was preferable to AIC (especially with
large samples and complex models).
• One hopes that the various fit statistics will converge to
select the same best fitting model.
References
• Raftery, AE. (1995). Bayesian model selection in
social research. Sociological Methodology (ed. PV.
Marsden), Oxford, U.K.: Blackwells, pp. 111-196.
• Burnham, KP & Anderson, DR. (2002). Model
Selection and Multimodel Inference: a practical
information-theoretic approach, 2nd edition.
Springer-Verlag, New York.
• Markon, KE & Krueger, RF. (2004). Beh Gen, 34, pp.
593-610.
Standardized Estimates
e.g. IP model
### Generate Multivariate Cholesky ACE Output ###
parameterSpecifications(ACE_Cholesky_Fit)
expectedMeansCovariances(ACE_Cholesky_Fit)
tableFitStatistics(ACE_Cholesky_Fit)
# Raw path matrices, iSD, and standardized paths (iSD %*% path matrix)
ACEpathMatrices <- c("ACE.a","ACE.c","ACE.e","ACE.iSD",
    "ACE.iSD %*% ACE.a","ACE.iSD %*% ACE.c","ACE.iSD %*% ACE.e")
ACEpathLabels <- c("pathEst_a","pathEst_c","pathEst_e","sd",
    "stPathEst_a","stPathEst_c","stPathEst_e")
formatOutputMatrices(ACE_Cholesky_Fit,ACEpathMatrices,ACEpathLabels,Vars,4)
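What `ACE.iSD %*% ACE.a` produces can be seen with a small base R sketch. The Cholesky path matrices below are hypothetical two-variable examples, not output from a fitted model; the names `a_path`, `c_path` and `e_path` are chosen here for illustration.

```r
# HYPOTHETICAL lower-triangular Cholesky paths for two variables
# (matrix() fills column by column, so the upper-right element is 0).
a_path <- matrix(c(0.7, 0.3, 0.0, 0.5), 2, 2)
c_path <- matrix(c(0.4, 0.2, 0.0, 0.3), 2, 2)
e_path <- matrix(c(0.5, 0.1, 0.0, 0.6), 2, 2)

# Total covariance and the inverse-SD matrix, as in the scripts.
V   <- a_path %*% t(a_path) + c_path %*% t(c_path) + e_path %*% t(e_path)
iSD <- solve(sqrt(diag(2) * V))

# Standardized paths: each row is divided by that variable's SD.
st_a <- iSD %*% a_path
st_c <- iSD %*% c_path
st_e <- iSD %*% e_path

# Squared standardized paths in a row sum to 1 across A, C and E.
prop <- rowSums(st_a^2) + rowSums(st_c^2) + rowSums(st_e^2)

print(round(st_a, 3))
print(prop)
```

The row sums of `st_a^2` are the standardized heritabilities of each variable under these made-up values.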
References
Hewitt JK, Silberg JL, Neale MC, Eaves LJ, Erickson M. (1992).
The analysis of parental ratings of children’s behavior using
LISREL. Behavior Genetics, 22, 293-317.
van den Oord EJCG, Boomsma DI, & Verhulst FC (2000). A
study of genetic and environmental effects on the co-occurrence of problem behaviors in three-year-old twins.
Journal of Abnormal Psychology, 109, 360-372.
Arseneault, L., Moffitt, T.E., Caspi, A., Taylor, A., Rijsdijk, F.V.,
Jaffee, S., Ablow, J.C., & Measelle, J.R. (2003). Strong genetic
effects on cross-situational antisocial behavior among 5-year-old children according to mothers, teachers, examiner-observers, and twins’ self-reports. Journal of Child
Psychology and Psychiatry, 44, 832-848.
