Log-linear and Canonical Correlation Analysis

Association, log-linear analysis and canonical correlation analysis
Chapter 9
Statistics for Marketing & Consumer Research
Copyright © 2008 - Mario Mazzocchi
Association between qualitative variables
• Association is a generic term referring to the relationship between
two variables
• Correlation measures strictly refer to quantitative variables
• Thus, association generally refers to qualitative variables
• Two qualitative variables are said to be associated when
changes in one variable lead to changes in the other
variable (i.e. they are not independent). For example,
education is generally associated with job position.
• Association measures for categorical variables are based on
tables of frequencies, also termed contingency tables
Contingency tables
• Frequency tables show the joint frequencies of two
categorical variables
• The marginal totals, that is the row and column
totals of the contingency table, represent the
univariate frequency distribution for each of the
two variables
• If these variables are independent, one would
expect the distribution of frequencies across
the internal cells of the contingency table to
depend only on the marginal totals and the sample size
Contingency table (frequencies)
Food & non-alcoholic beverage (Binned) * Anonymised hhold inc + allowances (Banded) Crosstabulation

Rows: Food & non-alcoholic beverage (Binned)
Columns: Anonymised hhold inc + allowances (Banded)

                                   Low      Medium-low   Medium-high   High
                                   income   income       income        income   Total
£ 20 or less        Count          47       19           18            4        88
                    % of Total     9.4%     3.8%         3.6%          .8%      17.6%
From £ 20 to £ 40   Count          57       48           24            22       151
                    % of Total     11.4%    9.6%         4.8%          4.4%     30.2%
From £ 40 to £ 60   Count          17       31           45            40       133
                    % of Total     3.4%     6.2%         9.0%          8.0%     26.6%
More than £ 60      Count          4        27           38            59       128
                    % of Total     .8%      5.4%         7.6%          11.8%    25.6%
Total               Count          125      125          125           125      500
                    % of Total     25.0%    25.0%        25.0%         25.0%    100.0%
Independent variables
• In probability terms, two events are regarded as independent
when their joint probability is the product of the probabilities
of the two individual events
Prob(X=a,Y=b)=Prob(X=a)Prob(Y=b)
• Similarly, two categorical variables are independent when the
joint probability of two categorical outcomes is equal to the
product of the probabilities of the individual outcomes for
each variable
Expected frequencies under independence
• Thus, the frequencies within the contingency table should
not be too different from these expected values:
$$ n^*_{ij} = \frac{n_{i0} \cdot n_{0j}}{n_{00}}, \qquad f^*_{ij} = \frac{f_{i0} \cdot f_{0j}}{f_{00}} = f_{i0} \cdot f_{0j} $$
where
• nij and fij are the absolute and relative frequencies,
respectively (the asterisk marks values expected under independence)
• ni0 and n0j (or fi0 and f0j) are the marginal totals for row i
and column j, respectively
• n00 is the sample size (hence the total relative frequency
f00 equals one).
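As a minimal sketch of the formula above, the expected counts under independence can be computed from the marginal totals alone. The counts are those of the food expenditure by income contingency table in this deck; the function name `expected_counts` is only illustrative.

```python
# Observed counts: rows are food expenditure bands, columns are income bands
# (taken from the contingency table shown earlier in these slides).
observed = [
    [47, 19, 18, 4],    # £20 or less
    [57, 48, 24, 22],   # from £20 to £40
    [17, 31, 45, 40],   # from £40 to £60
    [4, 27, 38, 59],    # more than £60
]


def expected_counts(table):
    """Return n*_ij = n_i0 * n_0j / n_00 for every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for obs_row, exp_row in zip(observed, expected):
    print(obs_row, [round(v, 2) for v in exp_row])
```

Because every income band holds 125 households, each row's expected count is simply the row total divided by four; the observed counts are visibly far from that pattern.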
Independence and association – testing
• The further the empirical frequencies are from the
frequencies expected under independence, the more
the two categorical variables are associated.
• Thus, a synthetic measure of association is
given by
$$ \chi^2 = \sum_{i,j} \frac{(f_{ij} - f^*_{ij})^2}{f^*_{ij}} $$
• With relative frequencies, the sum is multiplied by the sample
size n00 to obtain the test statistic (equivalently, the formula is
applied to the absolute frequencies nij and n*ij)
The Chi-square statistic
• The more distant the actual joint frequencies are from the
expected ones, the larger the Chi-square statistic
• Under the independence assumption, the Chi-square statistic
has a known probability distribution, so that its empirical
values can be associated with a probability value to test
independence
• The observed frequency values may differ from the expected
values fij* because of random errors, so the discrepancy
can be tested using a statistical tool, the Chi-square
distribution
• As usual, the basic principle is to measure the probability
that the discrepancy between the expected and observed values
is due to randomness only
• If this probability value (from the theoretical Chi-square
distribution) is very low (below the significance threshold),
then one rejects the null hypothesis of independence between
the two variables and proceeds assuming some association
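The test can be sketched in a few lines of plain Python. The counts below are the chicken type by marital status table used in the SPSS example of this deck; the helper names are illustrative, and the p-value helper relies on the closed-form upper tail of the Chi-square distribution, which only holds for an even number of degrees of freedom (here 6).

```python
import math

# Chicken type (rows) by marital status (columns: single, married, other).
observed = [
    [14, 31, 5],     # 'Value' chicken
    [64, 133, 36],   # 'Standard' chicken
    [8, 35, 5],      # 'Organic' chicken
    [15, 59, 9],     # 'Luxury' chicken
]


def pearson_chi_square(table):
    """Pearson Chi-square on absolute frequencies, with its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            chi2 += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


def chi2_upper_tail_even_df(x, df):
    """P(Chi-square > x); closed form valid only when df is even."""
    if df % 2 != 0:
        raise ValueError("this shortcut needs an even number of degrees of freedom")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


chi2, df = pearson_chi_square(observed)
p_value = chi2_upper_tail_even_df(chi2, df)
print(round(chi2, 3), df, round(p_value, 3))
```

The result should agree with the Pearson Chi-Square row of the SPSS output (8.517 with 6 degrees of freedom, significance .203), so independence is not rejected at the 5% level.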
Other association measures
• Contingency coefficient: ranges from zero (independence)
to values close to one for strong association, but its value
depends on the shape (number of rows and columns) of the
contingency table
• Cramer's V: bounded between zero and one, it does not suffer
from the above shortcoming (but strong associations may
translate into relatively low values, below 0.5)
• Goodman and Kruskal's Lambda: for strictly nominal
variables, it compares predictions obtained for one of the
variables using two different methods, one which only
considers the marginal frequency distribution for that
variable, the other which picks the most likely values after
considering the distribution of the other variable.
Other association measures
• Uncertainty coefficient: similar to Goodman and Kruskal's
Lambda, but it considers the reduction in the prediction error
rather than the rate of correct predictions.
• Ordinal variables:
  • Gamma statistic (between minus one and one, zero indicates
  independence)
  • Somers' d statistic, an adjustment of the Gamma statistic to account
  for the direction of the relationship
  • Kendall's Tau b and Tau c statistics, for square and rectangular
  tables, respectively
• These statistics check all pairs of values assumed by the
two variables to see whether (a) a category increase in one
variable leads to a category increase in the second one
(positive association); (b) the opposite happens
(negative association); or (c) the ordering of one variable is
independent of the ordering of the other.
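The pair-checking logic can be illustrated with the Gamma statistic, using its usual definition (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs. This is a sketch, not the SPSS routine; the function name is illustrative and the data are the food expenditure by income table from this deck, where both variables are ordinal.

```python
def goodman_kruskal_gamma(table):
    """Gamma = (C - D) / (C + D) for a table with ordered rows and columns."""
    concordant = 0
    discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # cells in a higher row category
                for l in range(n_cols):
                    pairs = table[i][j] * table[k][l]
                    if l > j:                    # both variables increase
                        concordant += pairs
                    elif l < j:                  # one increases, the other decreases
                        discordant += pairs
    return (concordant - discordant) / (concordant + discordant)


# Food expenditure band (rows) by income band (columns), both ordered low to high.
food_by_income = [
    [47, 19, 18, 4],
    [57, 48, 24, 22],
    [17, 31, 45, 40],
    [4, 27, 38, 59],
]

print(round(goodman_kruskal_gamma(food_by_income), 3))
```

Pairs tied on either variable are ignored, which is what distinguishes Gamma from the Tau and Somers' d adjustments listed above.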
Directional vs. symmetric measures
• Directional measures (e.g. Somers' d)
assume that the change in one variable
depends on the change in the other variable
(there is a direction)
• Symmetric measures assume no direction
Association measures in SPSS
Click here to see the list of available
statistics
Chi-square test in SPSS
In a typical week, what type of fresh or frozen chicken do you buy for your household's home
consumption? * Marital status Crosstabulation

Contingency Table (Count)
                       Marital status
Type of chicken        single   married   other   Total
'Value' chicken        14       31        5       50
'Standard' chicken     64       133       36      233
'Organic' chicken      8        35        5       48
'Luxury' chicken       15       59        9       83
Total                  101      258       55      414

Chi-square test
Chi-Square Tests
                               Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square             8.517a    6    .203
Likelihood Ratio               8.739     6    .189
Linear-by-Linear Association   1.120     1    .290
N of Valid Cases               414
a. 0 cells (.0%) have expected count less than 5. The
minimum expected count is 6.38.

As the p-value is above 0.05, the hypothesis of independence cannot
be rejected at the 95% confidence level
Other association measures (symmetric)
In a typical week, what type of fresh or frozen chicken do you buy for your household's home consumption? * How would
you describe the financial situation of your household? Crosstabulation

Count
                       How would you describe the financial situation of your household?
                       Not very
Type of chicken        well off   Difficult   Modest   Reasonable   Well off   Total
'Value' chicken        8          7           20       11           4          50
'Standard' chicken     14         20          75       81           40         230
'Organic' chicken      0          6           12       19           11         48
'Luxury' chicken       1          7           21       40           14         83
Total                  23         40          128      151          69         411

Symmetric Measures
                                             Value   Approx. Sig.
Nominal by Nominal   Phi                     .268    .003
                     Cramer's V              .155    .003
                     Contingency Coefficient .259    .003
N of Valid Cases                             411
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null
hypothesis.
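The three symmetric measures above are all functions of the Pearson Chi-square. The sketch below recomputes them from the chicken type by financial situation counts using the usual textbook formulas (Phi = sqrt(chi2/n), Cramer's V = sqrt(chi2 / (n (min(r, c) - 1))), contingency coefficient = sqrt(chi2 / (chi2 + n))); variable and function names are illustrative.

```python
import math

# Chicken type (rows) by financial situation (columns), counts from the SPSS table.
observed = [
    [8, 7, 20, 11, 4],      # 'Value' chicken
    [14, 20, 75, 81, 40],   # 'Standard' chicken
    [0, 6, 12, 19, 11],     # 'Organic' chicken
    [1, 7, 21, 40, 14],     # 'Luxury' chicken
]


def pearson_chi_square(table):
    """Pearson Chi-square computed on absolute frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


n = sum(map(sum, observed))
chi2 = pearson_chi_square(observed)
smaller_dim = min(len(observed), len(observed[0]))

phi = math.sqrt(chi2 / n)
cramers_v = math.sqrt(chi2 / (n * (smaller_dim - 1)))
contingency_coefficient = math.sqrt(chi2 / (chi2 + n))

print(round(phi, 3), round(cramers_v, 3), round(contingency_coefficient, 3))
```

The association is statistically significant (.003) but weak: Cramer's V is well below 0.5, in line with the earlier remark that even relevant associations may translate into low values.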
Directional association measures
Directional Measures (Nominal by Nominal)
Chicken type = "In a typical week, what type of fresh or frozen chicken do you buy for your household's home consumption?"
Financial situation = "How would you describe the financial situation of your household?"

                                                           Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
Lambda                    Symmetric                        .020    .012                   1.622          .105
                          Chicken type Dependent           .000    .000                   .(c)           .(c)
                          Financial situation Dependent    .035    .021                   1.622          .105
Goodman and Kruskal tau   Chicken type Dependent           .017    .006                                  .044(d)
                          Financial situation Dependent    .016    .007                                  .010(d)
Uncertainty Coefficient   Symmetric                        .029    .009                   3.104          .002(e)
                          Chicken type Dependent           .033    .010                   3.104          .002(e)
                          Financial situation Dependent    .026    .008                   3.104          .002(e)
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
c. Cannot be computed because the asymptotic standard error equals zero.
d. Based on chi-square approximation
e. Likelihood ratio chi-square probability.
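The Lambda values in the directional output can be traced back to the counts. The sketch below uses the standard proportional-reduction-in-error formula for Lambda (sum of the largest count in each category of the predictor, minus the largest marginal total of the dependent variable, divided by the number of cases outside that largest marginal category); the function name is illustrative.

```python
def goodman_kruskal_lambda(table):
    """Lambda with the ROW variable as dependent and the column variable as predictor."""
    n = sum(sum(row) for row in table)
    largest_row_total = max(sum(row) for row in table)
    sum_of_column_maxima = sum(max(col) for col in zip(*table))
    return (sum_of_column_maxima - largest_row_total) / (n - largest_row_total)


# Chicken type (rows) by financial situation (columns), counts from the SPSS table.
chicken_by_finance = [
    [8, 7, 20, 11, 4],
    [14, 20, 75, 81, 40],
    [0, 6, 12, 19, 11],
    [1, 7, 21, 40, 14],
]
# Transposing the table swaps the roles of dependent variable and predictor.
finance_by_chicken = [list(col) for col in zip(*chicken_by_finance)]

lambda_chicken_dependent = goodman_kruskal_lambda(chicken_by_finance)
lambda_finance_dependent = goodman_kruskal_lambda(finance_by_chicken)
print(round(lambda_chicken_dependent, 3), round(lambda_finance_dependent, 3))
```

Lambda is exactly zero when chicken type is the dependent variable because 'Standard' chicken is the most frequent choice in every financial category, so knowing the financial situation never changes the prediction.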
More than two variables: three-way contingency tables

Count
                        Government
                        Completely                                           Completely
Gender   Country        distrust     2.00   3.00   Neither   5.00   6.00    trust        Total
Female   UK             8            6      11     22        12     15      7            81
         Italy          9            4      9      8         8      8       5            51
         Germany        5            3      3      17        14     20      9            71
         Netherlands    2            3      4      13        12     21      18           73
         France         11           4      6      16        13     4       9            63
         Total          35           20     33     76        59     68      48           339
Male     UK             1            1      1      1         1      1       2            8
         Italy          7            3      3      6         11     9       8            47
         Germany        0            4      1      7         4      6       6            28
         Netherlands    0            0      1      5         6      8       6            26
         France         9            2      4      6         6      2       5            34
         Total          17           10     10     25        28     26      27           143
Total                   52           30     43     101       87     94      75           482
Log-linear analysis
• The objective of log-linear analysis is to
explore the association between more than
two categorical variables, check whether
associations are significant and explore how
the variables are associated
• Log-linear analysis can be applied by
considering a general log-linear model
Log-linear analysis:
saturated model
• Consider the three variables of the three-way contingency table: Trust (in
government), Gender and Country
• The frequency of each cell in the table can be rewritten as:
$$ n_{ijk} = q \cdot u_G \cdot u_C \cdot u_T \cdot u_{GC} \cdot u_{GT} \cdot u_{CT} \cdot u_{GCT} $$
• where:
  nijk is the frequency for trust-level i, gender j and country k
  uG is the main effect of Gender (and similarly uT for Trust and uC for Country)
  uGT is the interaction effect of Gender and Trust (and similarly for uGC and uCT)
  uGCT is the interaction effect of Gender, Trust and Country
  q is a scale parameter which depends on the total number of observations
The frequency of each cell is fully explained when considering all of the
main and interaction effects (the model is saturated)
Interpretation of the model
nijk  q uGuCuTuGCuGTuCTuGCT
The u terms represent the main and interaction effects and can be
interpreted as the expected relative frequencies
• For example, in a two by two contingency table with no interaction,
one would have
nij=Nfi0f0j
• Instead, if there is dependence (relevant interaction), the
frequencies of a two by two contingency table are exactly explained
(this is in fact a saturated model) by
 f ij 
nij  Nf i 0 f 0 j 
 f f 
 i0 0 j 
where the term between brackets reflects the frequency explained
by the interaction term and is one under independence
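A small numerical check of the two by two decomposition may help. The counts below are hypothetical (they do not come from the slides); the point is only that multiplying the independence part by the interaction ratio rebuilds every cell exactly, which is what "saturated" means.

```python
# Hypothetical 2x2 counts, chosen only for illustration.
table = [[30, 10],
         [20, 40]]

N = sum(map(sum, table))
f = [[n / N for n in row] for row in table]    # relative frequencies f_ij
f_row = [sum(row) for row in f]                # marginal f_i0
f_col = [sum(col) for col in zip(*f)]          # marginal f_0j

rebuilt = []
for i in range(2):
    rebuilt_row = []
    for j in range(2):
        independence_part = N * f_row[i] * f_col[j]     # N * f_i0 * f_0j
        interaction = f[i][j] / (f_row[i] * f_col[j])   # equals 1 under independence
        rebuilt_row.append(independence_part * interaction)
        print(i, j, round(independence_part, 2), round(interaction, 3))
    rebuilt.append(rebuilt_row)
```

Interaction ratios above one mark cells that are more frequent than independence would predict, ratios below one mark cells that are less frequent.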
The log-linear model
• By taking the logarithms one moves to a linear
rather than multiplicative form
$$ \log(n_{ijk}) = \log(q) + \log(u_G) + \log(u_C) + \log(u_T) + \log(u_{GC}) + \log(u_{GT}) + \log(u_{CT}) + \log(u_{GCT}) $$
• The saturated model is not very useful, as it fits
the data perfectly and does not tell much about
the relevance of each of the effects
• Thus, log-linear analysis checks whether simplified
log-linear models are as good as the saturated
model in predicting the table frequencies
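For the two-way case the same step can be written out explicitly. Taking logarithms of the saturated two by two decomposition gives one additive term per effect, and the last term vanishes under independence (this is a worked restatement of the formulas in these slides, not an additional result):

```latex
\log(n_{ij}) = \underbrace{\log(N)}_{\text{scale}}
             + \underbrace{\log(f_{i0})}_{\text{row main effect}}
             + \underbrace{\log(f_{0j})}_{\text{column main effect}}
             + \underbrace{\log\!\left(\frac{f_{ij}}{f_{i0} f_{0j}}\right)}_{\text{interaction}},
\qquad
\log\!\left(\frac{f_{ij}}{f_{i0} f_{0j}}\right) = \log(1) = 0 \ \text{under independence.}
```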
What log-linear analysis does
1. Computes the main and interaction effects for
the saturated model
2. Simplifies the saturated model by deleting
(according to a given rule) some of the main and
interaction effects and obtains estimates for all
of the main and interaction effects left in the
simplified model
3. Compares the simplified model with the
benchmark model
  • If the simplified model performs well, it goes back to
  No. 2 and proceeds with attempts at further
  simplification
  • Otherwise it stops
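Step 3 can be illustrated on the simplest possible case, a two-way table, where the only simplification of the saturated model is to drop the interaction (the independence model). A common way to compare the two is the likelihood ratio statistic G2 = 2 * sum(n * log(n / m)), with m the fitted counts of the simplified model. The sketch below applies it to the chicken type by marital status counts from this deck; function names are illustrative.

```python
import math

# Chicken type (rows) by marital status (columns), counts from the SPSS example.
observed = [
    [14, 31, 5],
    [64, 133, 36],
    [8, 35, 5],
    [15, 59, 9],
]


def fitted_without_interaction(table):
    """Fitted counts of the log-linear model with main effects only."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def likelihood_ratio(table, fitted):
    """G2 against the saturated model, which reproduces the table exactly."""
    return 2 * sum(
        n * math.log(n / m)
        for obs_row, fit_row in zip(table, fitted)
        for n, m in zip(obs_row, fit_row)
        if n > 0                      # empty cells contribute nothing
    )


g2 = likelihood_ratio(observed, fitted_without_interaction(observed))
print(round(g2, 3))
```

The value should match the Likelihood Ratio row of the Chi-square test output for the same table (8.739, significance .189): the simplified model is not significantly worse than the saturated one, so the interaction can be dropped.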
Simplified models
• For example, suppose that the three-way interaction term
is omitted; the log-linear model becomes:
$$ \log(n_{ijk}) = \log(q) + \log(u_G) + \log(u_C) + \log(u_T) + \log(u_{GC}) + \log(u_{GT}) + \log(u_{CT}) + \varepsilon_{ijk} $$
• Now there is an error term
• The effects cannot be computed exactly, but can be
estimated through a regression-like model, where
  • the dependent variable is the (logarithm of the) cell frequency
  • the explanatory variables are a set of dummy variables with value
  one when a main effect or interaction is relevant to that cell of the
  contingency table and zero otherwise
  • the estimated coefficients are the (logarithms of the) corresponding
  main or interaction effects
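The dummy-variable structure can be made concrete on a two-way table with main effects only. This is a sketch under simplifying assumptions: the first row and first column are taken as reference categories (one possible coding, not necessarily the one SPSS uses), and the coefficients are written in closed form, which is only possible for this independence model; richer models need iterative estimation. Counts are the chicken type by marital status table from this deck.

```python
import math

observed = [
    [14, 31, 5],
    [64, 133, 36],
    [8, 35, 5],
    [15, 59, 9],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Fitted counts of the main-effects-only model.
fitted = [[r * c / n for c in col_totals] for r in row_totals]

# Coefficients on the log scale, relative to the reference row and column.
intercept = math.log(fitted[0][0])
row_effects = [math.log(r / row_totals[0]) for r in row_totals[1:]]
col_effects = [math.log(c / col_totals[0]) for c in col_totals[1:]]
coefficients = [intercept] + row_effects + col_effects


def dummies(i, j):
    """Constant, then one dummy per non-reference row and per non-reference column."""
    row_part = [int(i == a) for a in range(1, len(row_totals))]
    col_part = [int(j == b) for b in range(1, len(col_totals))]
    return [1] + row_part + col_part


# The linear predictor reproduces the log of every fitted count.
max_gap = max(
    abs(sum(d * b for d, b in zip(dummies(i, j), coefficients)) - math.log(fitted[i][j]))
    for i in range(len(row_totals))
    for j in range(len(col_totals))
)
print(dummies(2, 1), round(max_gap, 12))
```

Each cell switches on only the dummies for its own row and column, so its log fitted frequency is the intercept plus the matching effects, exactly as in a regression with categorical predictors.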
Hierarchical log-linear analysis
Proceeds hierarchically backward
• Delete the highest-order interaction first
(uGCT)
• Delete lower-order interactions (uGC, uCT ,
uGT), one by one, two together, three
altogether
• Delete main effects (uG, uT, uC), one by
one, two together, three altogether
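The first deletion step (dropping uGCT) can be sketched with iterative proportional fitting, a standard way to obtain fitted counts for hierarchical log-linear models. This is an illustration only and is not claimed to be the algorithm SPSS runs: the function names are hypothetical, the number of sweeps is fixed rather than governed by a convergence criterion, and the Female/Male ordering of the counts follows the three-way table in this deck.

```python
import math

# counts[gender][country][trust level], from the three-way contingency table.
counts = [
    [[8, 6, 11, 22, 12, 15, 7],      # Female: UK
     [9, 4, 9, 8, 8, 8, 5],          #         Italy
     [5, 3, 3, 17, 14, 20, 9],       #         Germany
     [2, 3, 4, 13, 12, 21, 18],      #         Netherlands
     [11, 4, 6, 16, 13, 4, 9]],      #         France
    [[1, 1, 1, 1, 1, 1, 2],          # Male:   UK
     [7, 3, 3, 6, 11, 9, 8],         #         Italy
     [0, 4, 1, 7, 4, 6, 6],          #         Germany
     [0, 0, 1, 5, 6, 8, 6],          #         Netherlands
     [9, 2, 4, 6, 6, 2, 5]],         #         France
]
G, C, T = 2, 5, 7


def fit_without_three_way(obs, sweeps=200):
    """Fitted counts of the model with all main effects and two-way interactions."""
    fit = [[[1.0] * T for _ in range(C)] for _ in range(G)]
    for _ in range(sweeps):
        for g in range(G):                       # match gender x country totals
            for c in range(C):
                scale = sum(obs[g][c]) / sum(fit[g][c])
                fit[g][c] = [v * scale for v in fit[g][c]]
        for g in range(G):                       # match gender x trust totals
            for t in range(T):
                scale = (sum(obs[g][c][t] for c in range(C))
                         / sum(fit[g][c][t] for c in range(C)))
                for c in range(C):
                    fit[g][c][t] *= scale
        for c in range(C):                       # match country x trust totals
            for t in range(T):
                scale = (sum(obs[g][c][t] for g in range(G))
                         / sum(fit[g][c][t] for g in range(G)))
                for g in range(G):
                    fit[g][c][t] *= scale
    return fit


def likelihood_ratio(obs, fit):
    """G2 of the simplified model against the saturated one (empty cells skipped)."""
    return 2 * sum(
        obs[g][c][t] * math.log(obs[g][c][t] / fit[g][c][t])
        for g in range(G) for c in range(C) for t in range(T)
        if obs[g][c][t] > 0
    )


fitted = fit_without_three_way(counts)
g2 = likelihood_ratio(counts, fitted)
print(round(g2, 3))
```

The fitted table reproduces every two-way margin of the observed table while smoothing away the three-way pattern; a small G2 relative to its degrees of freedom means the three-way interaction can be deleted.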
Hierarchical LLA in SPSS
Select the categorical variables and define their range of values
Select backward elimination for hierarchical LLA
Options
Provides the (exact) estimates for the saturated model
Provides the association table (useful for deleting terms)
Output
K-way effects:
K=1 MAIN EFFECTS
K=2 2-WAY INTERACTION
K=3 3-WAY INTERACTION

K-Way and Higher-Order Effects
                                              Likelihood Ratio      Pearson               Number of
                                    K   df    Chi-Square   Sig.     Chi-Square   Sig.     Iterations
K-way and Higher Order Effects(a)   1   69    277.095      .000     276.963      .000     0
                                    2   58    124.416      .000     115.059      .000     2
                                    3   24    17.155       .842     14.784       .927     4
K-way Effects(b)                    1   11    152.678      .000     161.904      .000     0
                                    2   34    107.262      .000     100.275      .000     0
                                    3   24    17.155       .842     14.784       .927     0
a. Tests that k-way and higher order effects are zero.
b. Tests that k-way effects are zero.

K-way and Higher Order Effects: test the effect of deleting that k-way order effect AND ALL EFFECTS OF A HIGHER ORDER
K-way Effects: test the effect of deleting that k-way order effect ONLY
The 3-way interaction can be omitted, but other effects seem to be relevant
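The df column of the K-way table follows directly from the number of categories of each variable: an effect involving a set of variables uses the product of (categories - 1) over that set. The sketch below reproduces the counts, assuming q50 is gender (2 categories), q64 is country (5) and q43j is trust in government (7), as suggested by the degrees of freedom in the SPSS output.

```python
from itertools import combinations
from math import prod

# Number of categories per variable (variable-to-label mapping is an assumption).
levels = {"q50": 2, "q64": 5, "q43j": 7}


def df_k_way(levels, k):
    """Degrees of freedom of all k-way effects taken together."""
    return sum(
        prod(n - 1 for n in combo)
        for combo in combinations(levels.values(), k)
    )


k_way = {k: df_k_way(levels, k) for k in (1, 2, 3)}
k_way_and_higher = {k: sum(k_way[j] for j in range(k, 4)) for k in (1, 2, 3)}
print(k_way)
print(k_way_and_higher)
```

The total for K=1 and higher is the number of cells minus one (2 x 5 x 7 - 1 = 69), which is the number of free parameters of the saturated model.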
Deletion of terms
• Now it is possible to look within a given k-way class (partial association table)
Partial Associations
                   Partial               Number of
Effect      df     Chi-Square    Sig.    Iterations
q50*q64     4      37.915        .000    2
q50*q43j    6      3.225         .780    2
q64*q43j    24     63.880        .000    2
q50         1      82.057        .000    2
q64         4      .752          .945    2
q43j        6      69.869        .000    2
Deletion of the terms with a high Sig. value (q50*q43j and q64) does not make
the prediction of the contingency table cells worse compared to the model
with the term
Specification search
Step Summary
                                                                                       Number of
Step(b)                          Effects                        Chi-Square(a)  df   Sig.   Iterations
0   Generating Class(c)          q50*q64*q43j                   .000           0    .
    Deleted Effect          1    q50*q64*q43j                   17.155         24   .842   4
1   Generating Class(c)          q50*q64, q50*q43j, q64*q43j    17.155         24   .842
    Deleted Effect          1    q50*q64                        37.915         4    .000   2
                            2    q50*q43j                       3.225          6    .780   2
                            3    q64*q43j                       63.880         24   .000   2
2   Generating Class(c)          q50*q64, q64*q43j              20.379         30   .906
    Deleted Effect          1    q50*q64                        39.036         4    .000   2
                            2    q64*q43j                       65.001         24   .000   2
3   Generating Class(c)          q50*q64, q64*q43j              20.379         30   .906
a. For 'Deleted Effect', this is the change in the Chi-Square after the effect is deleted from the model.
b. At each step, the effect with the largest significance level for the Likelihood Ratio Change is deleted, provided
the significance level is larger than .050.
c. Statistics are displayed for the best model at each step after step 0.

Step 1: the term q50*q43j (Sig. .780) can be eliminated
Step 2: no more terms can be eliminated
Further steps
• The hierarchical procedure stops when it
cannot eliminate all effects for a given
order
• However, the partial association table
showed that the main effect for country
might not be relevant
• It may be desirable to test another model
where that main effect is eliminated
Further steps
Select the variables here
Click here to define the model
The model
Specify the model by inserting those 2-way interaction terms retained
from hierarchical analysis and deleting the main effect for q64
Output
Goodness-of-Fit Tests(a,b)
                     Value    df   Sig.
Likelihood Ratio     20.379   30   .906
Pearson Chi-Square   18.410   30   .952
a. Model: Poisson
b. Design: Constant + q43j + q50 + q43j * q64 + q50 * q64
The model is still acceptable after deleting the country main effect
And the winner is...
log(nijk )  log(q )  log(uG ) + log(uC ) + log(uT ) + log(uGC ) + log(uTC ) + ijt
• This model explains the contingency table
cells almost as well as the saturated (exact)
model
• Thus, (a) the interaction among country,
trust level and gender; and (b) the
interaction between trust level and gender
are not relevant
Parameter estimates
• These can be regarded as effect sizes (how
relevant are these terms? Comparisons
are allowed!): check the Z statistic
• Check the SPSS output (click on OPTIONS to
ask for estimates)
• Odds ratios (the ratios between the Z estimates
for different cells) indicate the ratio
of the probabilities of ending up in a cell
compared to the one chosen as a benchmark
Odds-ratio example
• Compare UK females (Z=3.97) with German females
(Z=1.02)
• The ratio is about four (3.97/1.02 ≈ 3.9), which means that:
  • the interaction between being female and from the UK
  is about four times more important than the interaction
  between being female and from Germany in explaining
  departure from a flat distribution
  • the effect is positive (it increases frequencies)
• This would suggest that in contingency tables it is
more likely to find UK females than German
females after accounting for all other effects
Canonical Correlation Analysis (CCA) (1)
• This technique allows one to explore the
relationship between a set of dependent
variables and a set of explanatory variables.
• Multiple regression analysis can be seen as a
special case of canonical correlation analysis
where there is a single dependent variable
• CCA is applicable to both metric and non-metric variables.
CCA(2)
• Link with correlation analysis:
canonical correlation is the method which maximizes the
correlation between two sets of variables rather than
individual variables
Example: relation between attitudes towards chicken and
general food lifestyles in the Trust data-set
• Attitudes towards chicken are measured through a set of
variables which include taste, perceived safety, value for
money, safety, etc. (items in q12)
• Lifestyle measurement is based on agreement with
statements like “I purchase the best quality food I can
afford” or “I am afraid of things that I have never eaten
before” (items in q25)
Canonical variates and
canonical correlation
• CCA relates two sets of variables
• This technique also needs to combine variables
within each set to obtain two composite measures
which can be correlated
• In standard canonical correlation analysis this synthesis
consists of a linear combination of the original
variables for each set, leading to the estimation of
canonical variates or linear composites
• The bivariate correlation between the two
canonical variates is the canonical correlation
Canonical correlation equations
1) m dependent variables, y1, y2, …, ym
2) k independent variables, x1, x2, …, xk
The objective is to estimate several (say c) canonical variates as follows:
$$ YS_1 = \alpha_{11} y_1 + \alpha_{12} y_2 + \dots + \alpha_{1m} y_m \qquad XS_1 = \beta_{11} x_1 + \beta_{12} x_2 + \dots + \beta_{1k} x_k $$
$$ YS_2 = \alpha_{21} y_1 + \alpha_{22} y_2 + \dots + \alpha_{2m} y_m \qquad XS_2 = \beta_{21} x_1 + \beta_{22} x_2 + \dots + \beta_{2k} x_k $$
$$ \dots $$
$$ YS_c = \alpha_{c1} y_1 + \alpha_{c2} y_2 + \dots + \alpha_{cm} y_m \qquad XS_c = \beta_{c1} x_1 + \beta_{c2} x_2 + \dots + \beta_{ck} x_k $$
The (canonical) correlation between the canonical variates YS1 and XS1 is
the highest, followed by the correlation between YS2 and XS2, and so on.
Furthermore, the extracted canonical variates are not correlated with
each other, so that CORR(YSi, YSj)=0 and CORR(XSi, XSj)=0 for any i≠j,
which also implies CORR(YSi, XSj)=0.
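To make the equations less abstract, the sketch below computes the canonical correlations for a toy case with two variables in each set. It relies on a standard result that is not stated in these slides: the squared canonical correlations are the eigenvalues of Syy^-1 Syx Sxx^-1 Sxy, where S denotes covariance matrices. Restricting each set to two variables keeps the matrix algebra in plain Python; the data and all names are hypothetical.

```python
import math


def cov(a, b):
    """Sample covariance between two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def cov_matrix(A, B):
    return [[cov(a, b) for b in B] for a in A]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def canonical_correlations_2x2(Y, X):
    """Canonical correlations for two y variables and two x variables."""
    Syy, Sxx, Syx = cov_matrix(Y, Y), cov_matrix(X, X), cov_matrix(Y, X)
    Sxy = [list(col) for col in zip(*Syx)]
    M = matmul(matmul(inverse_2x2(Syy), Syx), matmul(inverse_2x2(Sxx), Sxy))
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = math.sqrt(max(trace * trace - 4 * det, 0.0))
    eigenvalues = [(trace + root) / 2, (trace - root) / 2]
    # Clamp tiny numerical overshoots before taking square roots.
    return [math.sqrt(min(max(e, 0.0), 1.0)) for e in eigenvalues]


# Hypothetical data: y1 is an exact linear combination of x1 and x2,
# so the first canonical correlation should be one.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y1 = [a + b for a, b in zip(x1, x2)]
y2 = [5, 3, 8, 1, 9, 2, 7, 4]

correlations = canonical_correlations_2x2([y1, y2], [x1, x2])
print([round(r, 3) for r in correlations])
```

With m = k = 2 there are at most two canonical functions, and they come out ordered from the strongest to the weakest correlation, mirroring the ordering of YS1/XS1, YS2/XS2 in the equations.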
Canonical functions
• The bivariate linear relationship between variates, YSi=f(XSi)
is the i-th canonical function
• The maximum number of canonical functions c (canonical
variates) is equal to m or k, whichever is smaller.
• CCA estimates the canonical coefficients α and β in a way
that they maximize the canonical correlation between the
two canonical variates
• The coefficients are usually normalized in a way that each
canonical variate has a variance of one.
• The method can be generalized to deal with partial
canonical correlation (controlling for other sets of
variables) and nonlinear canonical correlation (where the
canonical variates show a non-linear relationship).
Output elements
• Canonical loadings – linear correlations between each of
the original variables and their respective canonical variate
• Cross-loadings – correlations with the opposite canonical
variate.
• Eigenvalues (or canonical roots) – squared canonical
correlations; they represent how much of the variability
is shared by the two canonical variates of each
canonical correlation
• Canonical scores – value of the canonical function for each
of the observations, based on the canonical variates
• Canonical redundancy index – it measures how much of the
variance in one set of original variables is explained by
the canonical variate of the other set
Canonical correlation analysis
in SPSS
• There is no menu-driven routine for CCA
• A macro routine written through the
command (syntax) editor is necessary
Canonical correlation macro
Indicate here the path to the SPSS directory
Run the program
List the variables of the two sets
Output
Values of the canonical correlation
The first 3 correlations are different from 0 at a 95% confidence level
Canonical coefficients for the 1st set of canonical variates
$$ YS_1 = \alpha_{11} y_1 + \alpha_{12} y_2 + \dots + \alpha_{1m} y_m $$
Set 1 (α coefficients)
        Canonical variate
               1       2       3       4       5       6       7       8       9      10
q12a      -0.293   0.445   0.392  -0.261  -0.379   0.354  -0.226   0.048  -0.264   0.748
q12b       0.305  -0.362  -0.071  -0.591   0.237   0.267  -0.778   0.276   0.215   -0.25
q12c      -0.365   0.431  -0.267   -0.24  -0.187   0.209  -0.075    0.13   0.639   0.023
q12d       0.244  -0.211    -0.5   0.301   0.026   0.194   0.183   0.039  -0.201   0.026
q12e       0.217  -0.436   0.025   0.708  -0.453   0.398   0.257  -0.464   0.191   0.395
q12f      -0.407  -0.292  -0.005   -0.76   0.354  -0.041   0.706  -0.056   0.304  -0.704
q12g      -0.246   0.343  -0.049  -0.018  -0.452  -0.176  -0.363  -0.505  -0.957  -0.245
q12h       0.633    0.11   0.322  -0.083   0.406  -0.103   0.692   0.639   0.521   0.558
q12i       0.338   0.156  -0.211  -0.252   0.179   0.164   0.045  -0.837   0.072   0.437
q12j       0.349   0.469   0.286   0.099  -0.438    -0.2  -0.183  -0.211   0.313  -0.526
q12k       0.187   0.327  -0.039   0.079    0.23   0.721   0.306   0.117  -0.377  -0.346
q12l      -0.245   -0.23   0.607   0.278   0.344   0.055  -0.341  -0.141   0.165   0.071
Canonical coefficients for the 2nd set of canonical variates
$$ XS_1 = \beta_{11} x_1 + \beta_{12} x_2 + \dots + \beta_{1k} x_k $$
Set 2 (β coefficients)
        Canonical variate
               1       2       3       4       5       6       7       8       9      10
q25a       -0.55   0.283  -0.018  -0.215  -0.183   0.651  -0.759   0.116   0.431   0.526
q25b        0.14   0.222   0.038   0.135  -0.889  -0.084  -0.277   0.224  -0.533    0.59
q25c       0.323   0.103  -0.127   0.087   0.674  -0.465   -0.09  -0.536  -0.619   0.484
q25d       0.277   0.154   0.614  -0.315  -0.062  -0.258   0.212  -0.219   0.467    0.42
q25e      -0.142  -0.673   0.769   0.665  -0.089  -0.047  -0.487    0.15  -0.217  -0.417
q25f       0.295   0.682  -0.168  -0.425   -0.38   0.391   0.232  -0.258  -0.207  -0.796
q25g       0.322   0.057   -0.12  -0.438    0.39  -0.306  -0.979   0.058   0.421  -0.495
q25h       0.087  -0.504   -0.12  -0.725   0.022   0.326   0.083   0.315  -0.308   0.148
q25i      -0.175   0.329   0.444   0.135   0.476   0.284   0.092   0.766  -0.375  -0.032
q25j         0.5  -0.305  -0.115   0.428   0.025   0.847   0.206  -0.449   0.187    0.09
Loadings (correlations)
Set 1
        Canonical variate
               1       2       3       4       5       6       7       8       9      10
q12a       -0.06    -0.1   0.382  -0.384  -0.564   0.371   0.073   0.126  -0.174   0.305
q12b       0.443  -0.404   0.096  -0.512  -0.213   0.339  -0.371   0.177   0.001  -0.077
q12c      -0.354   0.496  -0.363   0.034  -0.018   0.262  -0.049   0.115   0.505   0.056
q12d       0.373  -0.209  -0.284    0.08  -0.326   0.265   0.188   0.215  -0.143   0.001
q12e       0.118  -0.499   0.232  -0.007  -0.604   0.404    0.24  -0.143   0.132    0.04
q12f      -0.085  -0.372   0.308  -0.528  -0.305   0.168   0.502  -0.136       0  -0.234
q12g       0.216   0.096   0.258  -0.357  -0.277   -0.06   0.167  -0.013  -0.452  -0.011
q12h       0.516   0.119   0.375  -0.281  -0.042   -0.09   0.372   0.294  -0.055   0.228
q12i       0.217   0.258  -0.217  -0.174   0.465  -0.017  -0.061  -0.714   0.113   0.246
q12j       0.494   0.219   0.362   0.031  -0.494  -0.026  -0.011  -0.134   0.248  -0.432
q12k       0.025   0.405   0.069   0.216    0.41   0.672   0.078    0.03   -0.18  -0.316
q12l      -0.175   0.038   0.592   0.325   0.506   0.211  -0.157  -0.169   0.115  -0.037

Set 2
        Canonical variate
               1       2       3       4       5       6       7       8       9      10
q25a      -0.616   0.164   0.178  -0.204   0.046   0.402  -0.372  -0.421  -0.038   0.197
q25b       0.577   0.148  -0.126   0.205  -0.494  -0.111  -0.233   0.409  -0.172   0.284
q25c      -0.108   0.114   0.153  -0.119    0.38   0.003  -0.221  -0.621  -0.567   0.193
q25d       0.286   0.109    0.77  -0.338  -0.048  -0.074   0.154  -0.181   0.283   0.233
q25e      -0.268  -0.317   0.624   0.001  -0.102    0.09  -0.278  -0.397   -0.33  -0.277
q25f        -0.1   0.346   0.305  -0.328   -0.16   0.271  -0.029  -0.492  -0.382  -0.426
q25g       0.678   0.055  -0.183  -0.005   0.128  -0.129  -0.478   0.392   0.271  -0.122
q25h       0.038   -0.56   0.087    -0.7   0.009   0.234   0.026   0.059  -0.348   0.079
q25i       0.305   0.395   0.229   0.184   0.395   0.235   0.071   0.663  -0.081   0.059
q25j       0.719   -0.09  -0.113   0.367   0.089   0.503   -0.02   0.071   0.205   0.141
Cross-loadings
Cross 1-2
        Canonical variate
               1       2       3       4       5       6       7       8       9      10
q12a      -0.027  -0.036   0.123  -0.106  -0.131   0.078   0.011   0.016  -0.014   0.012
q12b       0.195  -0.145   0.031  -0.142   -0.05   0.071  -0.055   0.023       0  -0.003
q12c      -0.156   0.178  -0.116    0.01  -0.004   0.055  -0.007   0.015   0.041   0.002
q12d       0.164  -0.075  -0.091   0.022  -0.076   0.056   0.028   0.028  -0.012       0
q12e       0.052  -0.179   0.074  -0.002   -0.14   0.085   0.036  -0.019   0.011   0.002
q12f      -0.037  -0.134   0.099  -0.146  -0.071   0.035   0.075  -0.018       0  -0.009
q12g       0.095   0.035   0.083  -0.099  -0.064  -0.013   0.025  -0.002  -0.037       0
q12h       0.227   0.043    0.12  -0.078   -0.01  -0.019   0.055   0.038  -0.004   0.009
q12i       0.095   0.093   -0.07  -0.048   0.108  -0.004  -0.009  -0.093   0.009    0.01
q12j       0.217   0.079   0.116   0.008  -0.115  -0.005  -0.002  -0.017    0.02  -0.017
q12k       0.011   0.146   0.022    0.06   0.095   0.141   0.012   0.004  -0.015  -0.012
q12l      -0.077   0.014    0.19    0.09   0.117   0.044  -0.023  -0.022   0.009  -0.001

Cross 2-1
        Canonical variate
               1       2       3       4       5       6       7       8       9      10
q25a      -0.271   0.059   0.057  -0.056   0.011   0.085  -0.055  -0.055  -0.003   0.008
q25b       0.253   0.053  -0.041   0.057  -0.115  -0.023  -0.035   0.053  -0.014   0.011
q25c      -0.047   0.041   0.049  -0.033   0.088   0.001  -0.033  -0.081  -0.046   0.007
q25d       0.126   0.039   0.247  -0.093  -0.011  -0.016   0.023  -0.024   0.023   0.009
q25e      -0.118  -0.114     0.2       0  -0.024   0.019  -0.041  -0.052  -0.027  -0.011
q25f      -0.044   0.124   0.098  -0.091  -0.037   0.057  -0.004  -0.064  -0.031  -0.017
q25g       0.298    0.02  -0.059  -0.001    0.03  -0.027  -0.071   0.051   0.022  -0.005
q25h       0.017  -0.201   0.028  -0.194   0.002   0.049   0.004   0.008  -0.028   0.003
q25i       0.134   0.142   0.073   0.051   0.092   0.049   0.011   0.086  -0.007   0.002
q25j       0.316  -0.032  -0.036   0.101   0.021   0.106  -0.003   0.009   0.017   0.005
Redundancy analysis
% OF VARIANCE EXPLAINED BY:
           Set 1 by Set 1   Set 1 by Set 2   Set 2 by Set 2   Set 2 by Set 1
CV1-1      0.093            0.018            0.196            0.038
CV1-2      0.096            0.012            0.077            0.01
CV1-3      0.105            0.011            0.125            0.013
CV1-4      0.091            0.007            0.098            0.007
CV1-5      0.158            0.008            0.061            0.003
CV1-6      0.091            0.004            0.064            0.003
CV1-7      0.058            0.001            0.058            0.001
CV1-8      0.066            0.001            0.176            0.003
CV1-9      0.054            0                0.093            0.001
CV1-10     0.047            0                0.051            0
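The redundancy figures can be linked back to the loadings and cross-loadings tables. Under the usual definitions (assumed here, since the slides do not spell them out), the share of a set's variance explained by its own canonical variate is the mean of the squared loadings, and the share explained by the opposite variate (the redundancy) is the mean of the squared cross-loadings. The sketch checks this for Set 2 and the first canonical variate, using the values reported in this deck.

```python
# Set 2 (q25a ... q25j), first canonical variate, from the loadings table.
loadings_set2_cv1 = [-0.616, 0.577, -0.108, 0.286, -0.268,
                     -0.1, 0.678, 0.038, 0.305, 0.719]
# Same variables, cross-loadings on the first canonical variate of Set 1.
cross_loadings_set2_cv1 = [-0.271, 0.253, -0.047, 0.126, -0.118,
                           -0.044, 0.298, 0.017, 0.134, 0.316]


def mean_square(values):
    """Average of the squared values."""
    return sum(v * v for v in values) / len(values)


variance_own_variate = mean_square(loadings_set2_cv1)        # Set 2 by Set 2
redundancy_opposite = mean_square(cross_loadings_set2_cv1)   # Set 2 by Set 1
print(round(variance_own_variate, 3), round(redundancy_opposite, 3))
```

Both numbers should match the first row of the redundancy table (0.196 and 0.038): even the strongest canonical function lets the chicken attitude items explain only about 4% of the variance in the lifestyle items.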