```Lecture 5
Crossed random effects models
Clustered versus non clustered random effects
(Chap 11, new edition)
• We have discussed higher-level hierarchical
models where units are classified by some
factors (for example schools) into top level
clusters at level L.
• The units in each top level cluster are then
(sub)classified by a further factor (for
example class) into clusters at level L-1.
• The factors defining the classifications are
nested in the sense that a lower-level
cluster can only belong to one higher level
cluster (for example a class can only belong
to one school)
Non hierarchical models but
random effects models
• We now discuss non hierarchical
models where units are cross-classified
by two or more factors, with each unit
potentially belonging to any combination
of levels of the different factors
Non Hierarchical Models
• So far, we have treated occasions as nested
within individuals
• However, if all individuals are affected
similarly by some events or characteristics
associated with the occasions, such as
weather conditions, strikes, or new legislation,
it seems reasonable to treat occasions
as crossed with individuals, or to consider a
“main effect” of time.
Non hierarchical models
• Factors are not always completely
crossed. For example, the high schools
and elementary schools attended by
students are not nested, but there are
many combinations of high school and
elementary school that do not occur in
practice, perhaps because the schools
are in different geographical regions.
A psychological experiment with two
potentially interacting factors (Gelman,
sec 13.5)
• Let y_{jk} denote the success rate of
pilots training on a flight simulator under
treatment j (j = 1, ..., 5) at airport k (k = 1, ..., 8).
• These 40 data points have two
groupings, treatments and airports, which are not nested
Non-nested random effects model
    y_{jk} = \mu + \gamma_j + \delta_k + \epsilon_{jk}
Treatment random effects:
    \gamma_j \sim N(0, \sigma_\gamma^2),   j = 1, 2, 3, 4, 5
Airport random effects:
    \delta_k \sim N(0, \sigma_\delta^2),   k = 1, ..., 8
Residual:
    \epsilon_{jk} \sim N(0, \sigma_y^2)
Estimates of the variance
components
• \hat\sigma_y^2 = 0.23 (residual)
• \hat\sigma_\gamma^2 = 0.04 (treatments)
• \hat\sigma_\delta^2 = 0.32 (airports)
The variance of the success rates
is large among airports, even
larger than among the individual
measurements, whereas there are almost no
differences across treatments.
How much do primary and secondary
schools affect attainment at age 16?
• Data are cross-classified by 148 primary schools
(elementary schools) and 19 secondary schools
(middle/high schools) (fife.dta)
• attain: attainment score at age 16 (y_{ijk})
• pid: identifier for primary school (up to age 12) (k)
• sid: identifier for secondary school (from age 12) (j)
• vrq: verbal reasoning score from a test taken in the last
year of primary school
• sex: gender (1: female; 0: male)
Data characteristics
• First, not every combination of primary and
secondary school exists.
• Second, many combinations of primary and
secondary schools occur multiple times
• For instance, students who attended primary school
1 ended up in 3 secondary schools (1, 9, 18)
• There are at most 6 secondary schools per primary
school, and for 90% of the primary schools there are
at most 3 secondary schools per primary school
• There are between 7 and 32 primary schools per
secondary school, the median being between 13 and
14
Model
Attainment score at age 16 for student i who went
to secondary school j and primary school k:
    y_{ijk} = \beta_1 + \zeta_{1j} + \zeta_{2k} + \epsilon_{ijk}
where \beta_1 is the average score and \zeta_{1j}, \zeta_{2k} are random effects:
    \zeta_{1j} \sim N(0, \psi_1)        (\psi_1: variance across secondary schools)
    \zeta_{2k} \sim N(0, \psi_2)        (\psi_2: variance across primary schools)
    \epsilon_{ijk} \sim N(0, \theta)    (\theta: residual variance)
Estimation using xtmixed
• The estimated standard deviation of the primary
school random effect (\sqrt{\psi_2}) is 1.06, which is
considerably larger than the estimated standard
deviation of the secondary school random effect,
0.59 (\sqrt{\psi_1})
• Therefore primary schools appear to be more
variable in their effects than secondary schools.
However, neither of these estimates is precise
• The standard deviation of the residual \epsilon_{ijk} is estimated as
2.85 (\sqrt{\theta}). This number reflects both any interaction
between primary and secondary schools (deviations of the
combination means from the means implied by the additive
effects) and the variability within groups of children
belonging to the same combination of primary and secondary school
Including a random interaction
• For many combinations of primary and secondary
school, we have several observations because more
than one child attended that combination of schools
    y_{ijk} = \beta_1 + \zeta_{1j} + \zeta_{2k} + \zeta_{3jk} + \epsilon_{ijk}
    \zeta_{1j} \sim N(0, \psi_1)
    \zeta_{2k} \sim N(0, \psi_2)
    \zeta_{3jk} \sim N(0, \psi_3)
    \epsilon_{ijk} \sim N(0, \theta)
The random interaction term
• The interaction term takes on a different value
for each combination of secondary and
primary school to allow the assumption of
additive random effects to be relaxed.
• For example, some secondary schools might
be more beneficial to children who attended
particular elementary schools, perhaps
because of similar instructional practices
• We could not include interaction terms in the
pilot example, because there we have only
one observation for each treatment-airport
combination
Intraclass correlations
IC among children from the same primary school but
different secondary schools:
    \rho(primary) = cor(y_{ijk}, y_{i'j'k}) = \frac{\psi_2}{\psi_1 + \psi_2 + \psi_3 + \theta}
IC among children from the same secondary school
but different primary schools:
    \rho(secondary) = cor(y_{ijk}, y_{i'jk'}) = \frac{\psi_1}{\psi_1 + \psi_2 + \psi_3 + \theta}
IC among children from the same primary and
secondary schools:
    \rho(primary, secondary) = cor(y_{ijk}, y_{i'jk}) = \frac{\psi_1 + \psi_2 + \psi_3}{\psi_1 + \psi_2 + \psi_3 + \theta}
Given the secondary school, the IC among
children who had the same primary school:
    \rho(primary | secondary) = cor(y_{ijk}, y_{i'jk} | j) = \frac{\psi_2 + \psi_3}{\psi_2 + \psi_3 + \theta}
Diagnostics
• We can obtain the empirical Bayes
estimates of both primary and
secondary school random effects. If the
model is correct, the EB estimates
should have a normal distribution
• We assess the normality of the EB
estimates using a QQ plot
```
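
The intraclass correlations on the slides are simple ratios of variance components, so they can be checked by hand or with a few lines of code. The sketch below is only an illustration in Python (the lecture itself uses Stata's xtmixed). The class and function names are hypothetical. The variances are the squared standard deviations reported for the additive model (0.59, 1.06, 2.85), and the interaction variance psi3 is set to 0 as an assumption, because the slides give no estimate for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarianceComponents:
    """Variance components of the crossed model (names are illustrative)."""
    psi1: float   # variance across secondary schools
    psi2: float   # variance across primary schools
    psi3: float   # variance of the primary-by-secondary interaction
    theta: float  # residual variance

    @property
    def total(self) -> float:
        # Total variance of an attainment score
        return self.psi1 + self.psi2 + self.psi3 + self.theta


def intraclass_correlations(v: VarianceComponents) -> dict:
    """Return the four intraclass correlations defined on the slides."""
    return {
        # same primary school, different secondary schools
        "primary": v.psi2 / v.total,
        # same secondary school, different primary schools
        "secondary": v.psi1 / v.total,
        # same primary and same secondary school
        "primary_and_secondary": (v.psi1 + v.psi2 + v.psi3) / v.total,
        # same primary school, conditional on the secondary school
        "primary_given_secondary": (v.psi2 + v.psi3)
        / (v.psi2 + v.psi3 + v.theta),
    }


# Squared standard deviations from the additive fit; psi3 = 0 is an
# assumption made here, not an estimate reported in the lecture.
additive_fit = VarianceComponents(
    psi1=0.59 ** 2, psi2=1.06 ** 2, psi3=0.0, theta=2.85 ** 2
)
icc = intraclass_correlations(additive_fit)

if __name__ == "__main__":
    for name, value in icc.items():
        print(f"{name}: {value:.3f}")
```

With a fitted interaction model, the estimated interaction variance would replace the placeholder psi3 = 0, and the same function applies unchanged.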