Multifactor Experiments

November 26, 2013
Gui Citovsky, Julie Heymann, Jessica Sopp,
Jin Lee, Qi Fan, Hyunhwan Lee, Jinzhu Yu,
Lenny Horowitz, Shuvro Biswas
Outline
• Two-Factor Experiments with Fixed Crossed
Factors
• 2^k Factorial Experiments
• Other Selected Types of Two-Factor Experiments
Two-Factor
Experiments with Fixed
Crossed Factors
First, single factor
• Comparison of two or more treatments (groups)
• Single treatment factor
• Example: A study to compare the average flight
distances for three types of golf balls differing in the
shape of dimples on them: circular, fat elliptical, thin
elliptical
• Treatments
circular, fat elliptical, thin elliptical
• Treatment factor
type of ball
Single factor continued
Two-Factor Experiments With Fixed
Crossed Factors
• Two fixed factors, A with a ≥ 2 levels and B with b ≥
2 levels
• ab treatment combinations
• If there are n observations obtained under each
treatment combination (n replicates), then there is a
total of abn experimental units
Two-Factor Experiments With Fixed
Crossed Factors
• Example: Heat treatment experiment to evaluate the
effects of a quenching medium (two levels: oil and
water) and quenching temperature (three levels: low,
medium, high) on the surface hardness of steel
• 2 x 3 = 6 treatment combinations
• If 3 steel samples are treated for each combination,
we have N = 18 observations
Model and Estimates of its Parameters
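Let $y_{ijk}$ be the $k$th observation on the $(i,j)$th treatment combination, $i = 1, 2, \ldots, a$, $j = 1, 2, \ldots, b$, and $k = 1, 2, \ldots, n$.
Let the random variable $Y_{ijk}$ correspond to the observed outcome $y_{ijk}$.
Basic Model:
$Y_{ijk} \sim N(\mu_{ij}, \sigma^2)$ and independent; equivalently,
$Y_{ijk} = \mu_{ij} + \epsilon_{ijk}$, where the $\epsilon_{ijk}$ are i.i.d. $N(0, \sigma^2)$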
Table format
Parameters
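Grand Mean: $\mu = \bar{\mu}_{\cdot\cdot} = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b} \mu_{ij}}{ab}$
ith Row Average: $\bar{\mu}_{i\cdot} = \frac{\sum_{j=1}^{b} \mu_{ij}}{b}$
jth Column Average: $\bar{\mu}_{\cdot j} = \frac{\sum_{i=1}^{a} \mu_{ij}}{a}$
ith Row Main Effect: $\tau_i = \bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot}$
jth Column Main Effect: $\beta_j = \bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot\cdot}$
(i,j)th Row Column Interaction: $(\tau\beta)_{ij} = \mu_{ij} - \mu - \tau_i - \beta_j = \mu_{ij} - \bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot j} + \bar{\mu}_{\cdot\cdot}$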
Least Squares Estimates
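$\hat{\mu} = \bar{y}_{\cdot\cdot\cdot}$
$\hat{\tau}_i = \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}$
$\hat{\beta}_j = \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}$
$\widehat{(\tau\beta)}_{ij} = \bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}$
$\hat{\mu}_{ij} = \hat{\mu} + \hat{\tau}_i + \hat{\beta}_j + \widehat{(\tau\beta)}_{ij} = \bar{y}_{\cdot\cdot\cdot} + (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}) = \bar{y}_{ij\cdot}$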
Variance
• Sample variance for the (i, j)th cell:
$s_{ij}^2 = \frac{\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij\cdot})^2}{n - 1}$
• Pooled estimate of $\sigma^2$, where $N = abn$:
$s^2 = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b} (n - 1)\, s_{ij}^2}{N - ab}$
Example
• Experiment to study how mechanical bonding strength of capacitors
depends on the type of substrate (factor A) and bonding material
(factor B).
• 3 substrates: Al2O3 with bracket, Al2O3 no bracket, BeO no bracket
• 4 types of bonding material: Epoxy I, Epoxy II, Solder I and Solder II
• Four capacitors were tested at each factor level combination
Bonding strength, by substrate (rows) and bonding material (columns):
Substrate | Epoxy I | Epoxy II | Solder I | Solder II
Al2O3 no bracket | 1.51, 1.96, 1.83, 1.98 | 2.62, 2.82, 2.69, 2.93 | 2.96, 2.82, 3.11, 3.11 | 3.67, 3.40, 3.25, 2.90
Al2O3 with bracket | 1.63, 1.80, 1.92, 1.71 | 3.12, 2.94, 3.23, 2.99 | 2.91, 2.93, 3.01, 2.93 | 3.48, 3.51, 3.24, 3.45
BeO | 3.04, 3.16, 3.09, 3.50 | 1.91, 2.11, 1.78, 2.25 | 3.04, 2.91, 2.48, 2.83 | 3.47, 3.42, 3.31, 3.76
Example continued
Estimates for the first substrate (Al2O3 no bracket) and the first bonding material (Epoxy I):
$\hat{\tau}_1 = \bar{y}_{1\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot} = 2.723 - 2.800 = -0.077$
$\hat{\beta}_1 = \bar{y}_{\cdot 1\cdot} - \bar{y}_{\cdot\cdot\cdot} = 2.261 - 2.800 = -0.539$
$\widehat{(\tau\beta)}_{11} = \bar{y}_{11\cdot} - \bar{y}_{1\cdot\cdot} - \bar{y}_{\cdot 1\cdot} + \bar{y}_{\cdot\cdot\cdot} = 1.820 - 2.723 - 2.261 + 2.800 = -0.364$
Cell sample standard deviations:
s11 = 0.217 s12 = 0.138 s13 = 0.139 s14 = 0.321
s21 = 0.124 s22 = 0.131 s23 = 0.044 s24 = 0.122
s31 = 0.208 s32 = 0.209 s33 = 0.240 s34 = 0.192
Pooled sample variance: $s^2 = \frac{(0.217)^2 + \cdots + (0.192)^2}{12} = 0.0349$
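The estimates above can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the variable names (`tau_hat`, `beta_hat`, `tb_hat`) are illustrative choices, not part of the original slides. Because the slides subtract means that were already rounded to three decimals, the script's unrounded results can differ from them in the third decimal place.

```python
from statistics import mean, variance

# y[i][j] holds the four strengths for substrate i (row) and bonding material j (column),
# copied from the example table.
y = [
    [[1.51, 1.96, 1.83, 1.98], [2.62, 2.82, 2.69, 2.93],
     [2.96, 2.82, 3.11, 3.11], [3.67, 3.40, 3.25, 2.90]],  # Al2O3 no bracket
    [[1.63, 1.80, 1.92, 1.71], [3.12, 2.94, 3.23, 2.99],
     [2.91, 2.93, 3.01, 2.93], [3.48, 3.51, 3.24, 3.45]],  # Al2O3 with bracket
    [[3.04, 3.16, 3.09, 3.50], [1.91, 2.11, 1.78, 2.25],
     [3.04, 2.91, 2.48, 2.83], [3.47, 3.42, 3.31, 3.76]],  # BeO
]
a, b, n = len(y), len(y[0]), len(y[0][0])
N = a * b * n

cell = [[mean(obs) for obs in row] for row in y]                    # cell means
row_mean = [mean(row) for row in cell]                              # valid because the design is balanced
col_mean = [mean(cell[i][j] for i in range(a)) for j in range(b)]
grand = mean(row_mean)

tau_hat = [r - grand for r in row_mean]                             # row main effects
beta_hat = [c - grand for c in col_mean]                            # column main effects
tb_hat = [[cell[i][j] - row_mean[i] - col_mean[j] + grand
           for j in range(b)] for i in range(a)]                    # interactions

# Pooled variance: sum of (n - 1) * s_ij^2 over all cells, divided by N - ab.
s2 = sum((n - 1) * variance(obs) for row in y for obs in row) / (N - a * b)

print(round(tau_hat[0], 3), round(beta_hat[0], 3), round(tb_hat[0][0], 3), round(s2, 4))
```

Each set of estimated effects sums to zero over its index, which is a quick sanity check on the arithmetic.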
Example continued: Sample Means
Substrate | Epoxy I | Epoxy II | Solder I | Solder II | Row Mean
Al2O3 no bracket | 1.820 | 2.765 | 3.000 | 3.305 | 2.723
Al2O3 with bracket | 1.765 | 3.070 | 2.945 | 3.420 | 2.800
BeO | 3.198 | 2.013 | 2.815 | 3.490 | 2.879
Column mean | 2.261 | 2.616 | 2.920 | 3.405 | 2.800
Example continued: Other Model
Parameters
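Two-Way Analysis of Variance
We define the following sums of squares:
$SST = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$
$SSA = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = bn\sum_{i=1}^{a} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = bn\sum_{i=1}^{a} \hat{\tau}_i^2$
$SSB = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = an\sum_{j=1}^{b} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = an\sum_{j=1}^{b} \hat{\beta}_j^2$
$SSAB = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2 = n\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2 = n\sum_{i=1}^{a}\sum_{j=1}^{b} \widehat{(\tau\beta)}_{ij}^2$
$SSE = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij\cdot})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} e_{ijk}^2$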
Analysis of Variance
• Degrees of Freedom:
  • SST: N – 1
  • SSA: a – 1
  • SSB: b – 1
  • SSAB: (a – 1)(b – 1)
  • SSE: N – ab
• SST = SSA + SSB + SSAB + SSE.
• The degrees of freedom follow the same identity, i.e.
$N - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1) + (N - ab)$
Analysis of Variance
• Mean squares = SS / d.f.
$MSA = \frac{SSA}{a - 1}$
$MSB = \frac{SSB}{b - 1}$
$MSAB = \frac{SSAB}{(a - 1)(b - 1)}$
$MSE = \frac{SSE}{N - ab} = s^2$
Hypothesis Test
We test three hypotheses:
$H_{0A}: \tau_1 = \tau_2 = \cdots = \tau_a = 0$ vs. $H_{1A}$: Not all $\tau_i = 0$
$H_{0B}: \beta_1 = \beta_2 = \cdots = \beta_b = 0$ vs. $H_{1B}$: Not all $\beta_j = 0$
$H_{0AB}: (\tau\beta)_{11} = (\tau\beta)_{12} = \cdots = (\tau\beta)_{ab} = 0$ vs. $H_{1AB}$: Not all $(\tau\beta)_{ij} = 0$
If all interaction terms are equal to zero, then the effect of one factor on the mean response does not depend on the level of the other factor.
When do we reject H0?
• Use F-statistics to test our hypotheses by taking the ratio of each mean square to the MSE.
Reject $H_{0A}$ if $F_A = \frac{MSA}{MSE} > f_{a-1,\,N-ab,\,\alpha}$
Reject $H_{0B}$ if $F_B = \frac{MSB}{MSE} > f_{b-1,\,N-ab,\,\alpha}$
Reject $H_{0AB}$ if $F_{AB} = \frac{MSAB}{MSE} > f_{(a-1)(b-1),\,N-ab,\,\alpha}$
• We test the interaction hypothesis H0AB first.
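The sums of squares, mean squares, and F-statistics defined above can be checked numerically on the capacitor data. The following is a sketch in plain Python, with A = substrate and B = bonding material as in the example. The critical values in `crit` are approximate upper 5% points with 36 error degrees of freedom read from a standard F table; they are assumed values, not computed by the script.

```python
from statistics import mean

# Capacitor data: y[i][j] = strengths for substrate i (factor A) and bonding material j (factor B).
y = [
    [[1.51, 1.96, 1.83, 1.98], [2.62, 2.82, 2.69, 2.93],
     [2.96, 2.82, 3.11, 3.11], [3.67, 3.40, 3.25, 2.90]],
    [[1.63, 1.80, 1.92, 1.71], [3.12, 2.94, 3.23, 2.99],
     [2.91, 2.93, 3.01, 2.93], [3.48, 3.51, 3.24, 3.45]],
    [[3.04, 3.16, 3.09, 3.50], [1.91, 2.11, 1.78, 2.25],
     [3.04, 2.91, 2.48, 2.83], [3.47, 3.42, 3.31, 3.76]],
]
a, b, n = len(y), len(y[0]), len(y[0][0])
N = a * b * n

cell = [[mean(obs) for obs in row] for row in y]
row_mean = [mean(row) for row in cell]
col_mean = [mean(cell[i][j] for i in range(a)) for j in range(b)]
grand = mean(row_mean)

ssa = b * n * sum((r - grand) ** 2 for r in row_mean)
ssb = a * n * sum((c - grand) ** 2 for c in col_mean)
ssab = n * sum((cell[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
               for i in range(a) for j in range(b))
sse = sum((obs - cell[i][j]) ** 2
          for i in range(a) for j in range(b) for obs in y[i][j])
sst = sum((obs - grand) ** 2 for row in y for c in row for obs in c)

mse = sse / (N - a * b)
f_a = (ssa / (a - 1)) / mse
f_b = (ssb / (b - 1)) / mse
f_ab = (ssab / ((a - 1) * (b - 1))) / mse

# Assumed table values: f(2,36,.05), f(3,36,.05), f(6,36,.05), rounded to two decimals.
crit = {"A": 3.26, "B": 2.87, "AB": 2.36}
for name, f in (("AB", f_ab), ("A", f_a), ("B", f_b)):   # interaction is tested first
    print(name, round(f, 2), "reject H0" if f > crit[name] else "fail to reject H0")
```

The script also makes the identity SST = SSA + SSB + SSAB + SSE easy to confirm for a balanced design.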
Summary (Table 13.5)
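Source of Variation (Source) | Sum of Squares (SS) | Degrees of Freedom (d.f.) | Mean Square (MS) | F
Main Effects A | $SSA = bn\sum_{i=1}^{a} \hat{\tau}_i^2$ | a – 1 | $MSA = \frac{SSA}{a-1}$ | $F_A = \frac{MSA}{MSE}$
Main Effects B | $SSB = an\sum_{j=1}^{b} \hat{\beta}_j^2$ | b – 1 | $MSB = \frac{SSB}{b-1}$ | $F_B = \frac{MSB}{MSE}$
Interaction AB | $SSAB = n\sum_{i=1}^{a}\sum_{j=1}^{b} \widehat{(\tau\beta)}_{ij}^2$ | (a – 1)(b – 1) | $MSAB = \frac{SSAB}{(a-1)(b-1)}$ | $F_{AB} = \frac{MSAB}{MSE}$
Error | $SSE = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} e_{ijk}^2$ | N – ab | $MSE = \frac{SSE}{N-ab}$ |
Total | $SST = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$ | N – 1 | |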
Example: Bonding Strength of
Capacitors
data Capacitors;
  /* the trailing @@ lets one data line hold several observations */
  input Bonding $ Substrate $ Strength @@;
  datalines;
Epoxy1 Al203 1.51 Epoxy1 Al203 1.96 Epoxy1 Al203 1.83 Epoxy1 Al203 1.98
…
;
proc glm plots=diagnostics data=Capacitors;
  title "Analysis of Bonding Strength of Capacitors";
  class Bonding Substrate;
  /* Bonding | Substrate expands to both main effects plus their interaction */
  model Strength = Bonding | Substrate;
run;
Bonding Strength of
Capacitors ANOVA Table
• At α=0.05, we can reject H0B and H0AB but fail to reject H0A.
• The main effect of bonding material and the interaction between
the bonding material and the substrate are both significant.
• The main effect of substrate is not significant at our α.
Main Effects Plot
• Definition: A main effects plot is a line plot of the row means of factor A and the column means of factor B.
[Figure: two main effects plots of mean response. Factor A panel: Al2O3, Al2O3 + bracket, BeO. Factor B panel: Epoxy I, Epoxy II, Solder I, Solder II.]
Interaction Plot
Model Diagnostics with
Residual Plots
• Why do we look at residual plots?
• Is our constant variance assumption true?
• Is our normality assumption true?
2^k Factorial Experiments
2^k Factorial Experiments
• 2^k factorial experiments are a class of multifactor experiments in which each factor is studied at 2 levels.
• If there are k factors, then we have 2^k treatment combinations.
• 2-factor and 3-factor experiments can be generalized to experiments with more than 3 factors.
2^2 experiment
• 2^2 experiment: an experiment with factors A and B, each at two levels.
ab = (A high, B high)
b = (A low, B high)
a = (A high, B low)
(1) = (A low, B low)
2^2 experiment cont’d
Assume a balanced design with n observations for each treatment combination, and denote these observations by $y_{ij}$:
$Y_{ij} \sim N(\mu_i, \sigma^2)$, $i = (1), a, b, ab$, $j = 1, 2, \ldots, n$
2^2 experiment cont’d
• Main effect of factor A: difference in the mean response between the high level of A and the low level of A, averaged over the levels of B
$\text{Main effect A} = \frac{(\mu_{ab} - \mu_{b}) + (\mu_{a} - \mu_{(1)})}{2}$
• Main effect of factor B: difference in the mean response between the high level of B and the low level of B, averaged over the levels of A
$\text{Main effect B} = \frac{(\mu_{ab} - \mu_{a}) + (\mu_{b} - \mu_{(1)})}{2}$
• Interaction effect of AB: half the difference between the effect of A at the high level of B and its effect at the low level of B
$\text{Interaction AB} = \frac{(\mu_{ab} - \mu_{b}) - (\mu_{a} - \mu_{(1)})}{2}$
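2^2 experiment cont’d
The least squares estimates of the main effects and the interaction effect are obtained by replacing the treatment means by the corresponding cell sample means.
$\text{Est. Main Effect A} = \frac{(\bar{y}_{ab} - \bar{y}_{b}) + (\bar{y}_{a} - \bar{y}_{(1)})}{2}$
$\text{Est. Main Effect B} = \frac{(\bar{y}_{ab} - \bar{y}_{a}) + (\bar{y}_{b} - \bar{y}_{(1)})}{2}$
$\text{Est. Interaction AB} = \frac{(\bar{y}_{ab} - \bar{y}_{b}) - (\bar{y}_{a} - \bar{y}_{(1)})}{2}$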
2^2 experiment cont’d
Contrast Coefficients for Effects in a 2^2 Experiment
Treatment combination | I | A | B | AB
(1) | + | - | - | +
a | + | + | - | -
b | + | - | + | -
ab | + | + | + | +
*Notice that the term-by-term product of any two of the contrast vectors A, B, and AB equals the third one.
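The product property of the 2^2 contrast vectors is easy to verify directly. A minimal sketch, writing "+" as +1 and "-" as -1 and ordering the treatment combinations as (1), a, b, ab; the helper name `product` is illustrative.

```python
# Contrast vectors over the treatment combinations (1), a, b, ab.
A = [-1, +1, -1, +1]
B = [-1, -1, +1, +1]
AB = [+1, -1, -1, +1]

def product(u, v):
    """Term-by-term product of two contrast vectors."""
    return [p * q for p, q in zip(u, v)]

print(product(A, B) == AB, product(A, AB) == B, product(B, AB) == A)

# The contrasts are also mutually orthogonal: each pairwise dot product is zero.
print(sum(product(A, B)), sum(product(A, AB)), sum(product(B, AB)))
```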
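2^3 experiment
• 2^3 Experiment: experiment with factors A, B, and C, each at two levels, with n observations per treatment combination.
$Y_{ij} \sim N(\mu_i, \sigma^2)$, $i = (1), a, b, ab, c, ac, bc, abc$, $j = 1, 2, \ldots, n$.
$\text{Est. Main Effect A} = \frac{(\bar{y}_{abc} - \bar{y}_{bc}) + (\bar{y}_{ac} - \bar{y}_{c}) + (\bar{y}_{ab} - \bar{y}_{b}) + (\bar{y}_{a} - \bar{y}_{(1)})}{4}$
$\text{Est. Main Effect B} = \frac{(\bar{y}_{abc} - \bar{y}_{ac}) + (\bar{y}_{bc} - \bar{y}_{c}) + (\bar{y}_{ab} - \bar{y}_{a}) + (\bar{y}_{b} - \bar{y}_{(1)})}{4}$
$\text{Est. Main Effect C} = \frac{(\bar{y}_{abc} - \bar{y}_{ab}) + (\bar{y}_{bc} - \bar{y}_{b}) + (\bar{y}_{ac} - \bar{y}_{a}) + (\bar{y}_{c} - \bar{y}_{(1)})}{4}$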
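2^3 experiment cont’d
$\text{Est. Interaction Effect AB} = \frac{\{(\bar{y}_{abc} - \bar{y}_{bc}) - (\bar{y}_{ac} - \bar{y}_{c})\} + \{(\bar{y}_{ab} - \bar{y}_{b}) - (\bar{y}_{a} - \bar{y}_{(1)})\}}{4}$
$\text{Est. Interaction Effect BC} = \frac{\{(\bar{y}_{abc} - \bar{y}_{ac}) - (\bar{y}_{ab} - \bar{y}_{a})\} + \{(\bar{y}_{bc} - \bar{y}_{c}) - (\bar{y}_{b} - \bar{y}_{(1)})\}}{4}$
$\text{Est. Interaction Effect AC} = \frac{\{(\bar{y}_{abc} - \bar{y}_{bc}) - (\bar{y}_{ab} - \bar{y}_{b})\} + \{(\bar{y}_{ac} - \bar{y}_{c}) - (\bar{y}_{a} - \bar{y}_{(1)})\}}{4}$
$\text{Est. Interaction Effect ABC} = \frac{\{(\bar{y}_{abc} - \bar{y}_{bc}) - (\bar{y}_{ac} - \bar{y}_{c})\} - \{(\bar{y}_{ab} - \bar{y}_{b}) - (\bar{y}_{a} - \bar{y}_{(1)})\}}{4}$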
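2^3 experiment cont’d
Contrast coefficients for Effects in a 2^3 Experiment
Treatment Combination | I | A | B | AB | C | AC | BC | ABC
(1) | + | - | - | + | - | + | + | -
a | + | + | - | - | - | - | + | +
b | + | - | + | - | - | + | - | +
ab | + | + | + | + | - | - | - | -
c | + | - | - | + | + | - | - | +
ac | + | + | - | - | + | + | - | -
bc | + | - | + | - | + | - | + | -
abc | + | + | + | + | + | + | + | +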
2^3 experiment example
Factors affecting bicycle performance:
Seat height (Factor A): 26" (-), 30" (+)
Generator (Factor B): Off (-), On (+)
Tire Pressure (Factor C): 40 psi (-), 55 psi (+)
2^3 experiment example cont’d
Travel times from Bicycle Experiment (time in secs.)
A | B | C | Run 1 | Run 2 | Mean
- | - | - | 51 | 54 | 52.5
+ | - | - | 41 | 43 | 42.0
- | + | - | 54 | 60 | 57.0
+ | + | - | 44 | 43 | 43.5
- | - | + | 50 | 48 | 49.0
+ | - | + | 39 | 39 | 39.0
- | + | + | 53 | 51 | 52.0
+ | + | + | 41 | 44 | 42.5
2^3 experiment example cont’d
Main effects (significant):
$A = \frac{(42.5 - 52.0) + (39.0 - 49.0) + (43.5 - 57.0) + (42.0 - 52.5)}{4} = -10.875$
$B = \frac{(42.5 - 39.0) + (52.0 - 49.0) + (43.5 - 42.0) + (57.0 - 52.5)}{4} = 3.125$
$C = \frac{(42.5 - 43.5) + (52.0 - 57.0) + (39.0 - 42.0) + (49.0 - 52.5)}{4} = -3.125$
Interactions:
$AB = \frac{\{(42.5 - 52.0) - (39.0 - 49.0)\} + \{(43.5 - 57.0) - (42.0 - 52.5)\}}{4} = -0.625$
$AC = \frac{\{(42.5 - 52.0) - (43.5 - 57.0)\} + \{(39.0 - 49.0) - (42.0 - 52.5)\}}{4} = 1.125$
$BC = \frac{\{(42.5 - 39.0) - (43.5 - 42.0)\} + \{(52.0 - 49.0) - (57.0 - 52.5)\}}{4} = 0.125$
$ABC = \frac{\{(42.5 - 52.0) - (39.0 - 49.0)\} - \{(43.5 - 57.0) - (42.0 - 52.5)\}}{4} = 0.875$
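All seven estimated effects of the bicycle experiment can be computed from the contrast signs rather than by writing out each difference by hand. The sketch below assumes the levels are coded -1/+1 and uses the fact that the contrast coefficient of an effect is the product of the coded levels of the factors it involves; the function name `effect` and the index convention (0 = A, 1 = B, 2 = C) are illustrative.

```python
# Mean travel times from the bicycle table, keyed by the coded levels of (A, B, C).
means = {
    (-1, -1, -1): 52.5, (+1, -1, -1): 42.0, (-1, +1, -1): 57.0, (+1, +1, -1): 43.5,
    (-1, -1, +1): 49.0, (+1, -1, +1): 39.0, (-1, +1, +1): 52.0, (+1, +1, +1): 42.5,
}
k = 3

def effect(factors):
    """Estimated effect for the given factor indices, e.g. (0, 1) for AB."""
    total = 0.0
    for levels, ybar in means.items():
        sign = 1
        for f in factors:
            sign *= levels[f]          # contrast coefficient = product of coded levels
        total += sign * ybar
    return total / 2 ** (k - 1)

names = {"A": (0,), "B": (1,), "C": (2,), "AB": (0, 1),
         "AC": (0, 2), "BC": (1, 2), "ABC": (0, 1, 2)}
effects = {name: effect(idx) for name, idx in names.items()}
for name, value in effects.items():
    print(name, value)
```

The same function works for any k if `means` holds all 2^k treatment combinations and `k` is set accordingly.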
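2^k experiment
• 2^k experiments, where k > 3.
• With n i.i.d. observations $y_{ij}$ (j = 1, 2, …, n) at the ith treatment combination and sample mean $\bar{y}_i$ (i = 1, 2, …, 2^k), each estimated effect has the form
$\text{Est. Effect} = \frac{\sum_{i=1}^{2^k} c_i \bar{y}_i}{2^{k-1}}$, where the $c_i = \pm 1$ are the contrast coefficients of the effect.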
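Statistical Inference for 2^k Experiments
Basic Notations and Derivations
• $\text{Est. Effect} = \frac{\sum_{i=1}^{2^k} c_i \bar{y}_i}{2^{k-1}}$, with $c_i = \pm 1$
• $Var(\text{Est. Effect}) = \frac{\sum_{i=1}^{2^k} c_i^2\, Var(\bar{y}_i)}{(2^{k-1})^2} = \frac{\sum_{i=1}^{2^k} (\pm 1)^2 (\sigma^2/n)}{(2^{k-1})^2} = \frac{2^k (\sigma^2/n)}{2^{2k-2}} = \frac{\sigma^2}{n 2^{k-2}}$
• $SE(\text{Est. Effect}) = \frac{s}{\sqrt{n 2^{k-2}}}$, where
$s^2 = MSE = \frac{\sum_{i=1}^{2^k}\sum_{j=1}^{n} (y_{ij} - \bar{y}_i)^2}{2^k (n - 1)}$, with d.f. $\nu = 2^k (n - 1)$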
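CI and Hypotheses Test with t Test
• Therefore a CI for any population effect is given by
$\text{Est. Effect} \pm t_{\nu,\alpha/2} \frac{s}{\sqrt{n 2^{k-2}}}$
• The t-statistic for testing the significance of any estimated effect is
$t = \frac{\text{Est. Effect}}{SE(\text{Est. Effect})} = \frac{\sqrt{n 2^{k-2}}\,(\text{Est. Effect})}{s}$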
Hypotheses Test with F Test
• Equivalently, we can use the F test:
$F = t^2 = \frac{(n 2^{k-2})(\text{Est. Effect})^2}{s^2}$
• The estimated effect is significant at level $\alpha$ if
$|t| > t_{\nu,\alpha/2} \iff F > f_{1,\nu,\alpha} = t_{\nu,\alpha/2}^2$
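Sums of Squares for Effects
• $F = \frac{(n 2^{k-2})(\text{Est. Effect})^2}{s^2} = \frac{MS_{\text{Effect}}}{MSE}$
• $\Rightarrow SS_{\text{Effect}} = MS_{\text{Effect}} = (n 2^{k-2})(\text{Est. Effect})^2$, since each effect has 1 d.f.
• $SS_{\text{treatments}} = n\sum_{i=1}^{2^k} (\bar{y}_i - \bar{y})^2 = SS_1 + SS_2 + \cdots + SS_{2^k - 1}$
The decomposition holds because the effects are mutually orthogonal contrasts.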
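As a numerical check of these formulas, the sketch below computes the pure-error MSE, the sum of squares for each effect, and the F-statistics for the bicycle data (k = 3, n = 2). It is an illustration in plain Python with illustrative names, and it verifies that the seven effect sums of squares add up to the between-treatment sum of squares.

```python
# Both runs of the bicycle experiment, keyed by the coded levels of (A, B, C).
runs = {
    (-1, -1, -1): (51, 54), (+1, -1, -1): (41, 43), (-1, +1, -1): (54, 60), (+1, +1, -1): (44, 43),
    (-1, -1, +1): (50, 48), (+1, -1, +1): (39, 39), (-1, +1, +1): (53, 51), (+1, +1, +1): (41, 44),
}
k, n = 3, 2

means = {lv: sum(obs) / n for lv, obs in runs.items()}
grand = sum(means.values()) / 2 ** k

# Pure error: within-cell variation, with 2^k (n - 1) degrees of freedom.
sse = sum((yv - means[lv]) ** 2 for lv, obs in runs.items() for yv in obs)
df_error = 2 ** k * (n - 1)
mse = sse / df_error

def effect(factors):
    """Estimated effect; the contrast sign is the product of the coded levels involved."""
    total = 0.0
    for lv, ybar in means.items():
        sign = 1
        for f in factors:
            sign *= lv[f]
        total += sign * ybar
    return total / 2 ** (k - 1)

names = {"A": (0,), "B": (1,), "C": (2,), "AB": (0, 1),
         "AC": (0, 2), "BC": (1, 2), "ABC": (0, 1, 2)}
ss = {name: n * 2 ** (k - 2) * effect(idx) ** 2 for name, idx in names.items()}
f_stat = {name: value / mse for name, value in ss.items()}   # each effect has 1 d.f.

ss_treatments = n * sum((m - grand) ** 2 for m in means.values())
for name in names:
    print(name, ss[name], round(f_stat[name], 2))
print("sum of effect SS:", sum(ss.values()), "SS treatments:", ss_treatments)
```

Each F-statistic would then be compared with the upper α point of the F distribution with 1 and 2^k(n - 1) = 8 degrees of freedom.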
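Regression Approach to 2^k Experiments
• A 2^2 experiment: code the factor levels as
$x_1 = -1$ if A is low, $+1$ if A is high
$x_2 = -1$ if B is low, $+1$ if B is high
• Multiple regression model
$E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2$
• $\hat{\beta}_0 = \bar{y}$, $\hat{\beta}_1 = \frac{\text{Est. A}}{2}$, $\hat{\beta}_2 = \frac{\text{Est. B}}{2}$ and $\hat{\beta}_{12} = \frac{\text{Est. AB}}{2}$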
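Regression Approach to 2^k Experiments
• 2^3 experiment
$E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \beta_{123} x_1 x_2 x_3$
• $\hat{y} = \bar{y} + \frac{\text{Est. A}}{2} x_1 + \frac{\text{Est. B}}{2} x_2 + \frac{\text{Est. C}}{2} x_3 + \frac{\text{Est. AB}}{2} x_1 x_2 + \frac{\text{Est. AC}}{2} x_1 x_3 + \frac{\text{Est. BC}}{2} x_2 x_3 + \frac{\text{Est. ABC}}{2} x_1 x_2 x_3$
• If all interactions are dropped from the model, the new fitted model is
$\hat{y} = \bar{y} + \frac{\text{Est. A}}{2} x_1 + \frac{\text{Est. B}}{2} x_2 + \frac{\text{Est. C}}{2} x_3$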
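Regression Approach to 2^k Experiments
• The interpolation formula: a setting A between the low and high levels of a factor is coded as
$x = \frac{A - (A_{high} + A_{low})/2}{(A_{high} - A_{low})/2}$
Bicycle example estimates:
Est. A (seat height) = -10.875
Est. B (generator) = 3.125
Est. C (tire pressure) = -3.125
$\bar{y}$ = 47.1875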
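The coding of a factor setting onto the -1 to +1 scale can be sketched as a one-line function. The function name `coded` and the intermediate settings used below (a 28" seat height, a 50 psi tire pressure) are hypothetical illustrations; only the low and high levels come from the bicycle experiment.

```python
def coded(value, low, high):
    """Map a factor setting onto the -1..+1 scale used by the regression model."""
    center = (high + low) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range

print(coded(26, 26, 30), coded(30, 26, 30))  # the two design levels of seat height
print(coded(28, 26, 30))                     # hypothetical midpoint seat height
print(coded(50, 40, 55))                     # hypothetical tire pressure of 50 psi
```

The low and high design levels map to -1 and +1 by construction, so the fitted regression model can be evaluated at any setting in between.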
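Bicycle Example:
Main Effects Model
• main effects model
$\hat{y} = \bar{y} + \frac{\text{Est. A}}{2} x_1 + \frac{\text{Est. B}}{2} x_2 + \frac{\text{Est. C}}{2} x_3 = 47.1875 - \frac{10.875}{2} x_1 + \frac{3.125}{2} x_2 - \frac{3.125}{2} x_3$
• minimum travel time, attained at $x_1 = +1$, $x_2 = -1$, $x_3 = +1$:
$\hat{y} = 47.1875 - 5.4375(+1) + 1.5625(-1) - 1.5625(+1) = 38.625$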
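The minimum can be confirmed by evaluating the main effects model at all eight corner settings. A minimal sketch, using the estimates quoted for the bicycle example; the function name `predict` is illustrative.

```python
from itertools import product

# Grand mean and estimated main effects from the bicycle example.
y_bar, est_a, est_b, est_c = 47.1875, -10.875, 3.125, -3.125

def predict(x1, x2, x3):
    """Fitted travel time from the main effects model; each coefficient is half an effect."""
    return y_bar + est_a / 2 * x1 + est_b / 2 * x2 + est_c / 2 * x3

# Enumerate the 2^3 corner settings and pick the one with the smallest fitted time.
corners = list(product((-1, +1), repeat=3))
best = min(corners, key=lambda x: predict(*x))
print(best, predict(*best))
```

Because the model is linear in each coded variable, its minimum over the design region always lies at a corner, so checking the eight corners is sufficient.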
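Bicycle Example:
Main Effects Model
Sums of squares for the omitted interaction effects, each computed as $SS = n 2^{k-2}(\text{Est. Effect})^2 = 4(\text{Est. Effect})^2$:
$SS_{AB} + SS_{AC} + SS_{BC} + SS_{ABC} = 1.5625 + 5.0625 + 0.0625 + 3.0625 = 9.75$
d.f. = 4
Pure SSE = 33.5
d.f. = 8
Pooled SSE = 33.5 + 9.75 = 43.25
d.f. = 12 (total)
$MSE = \frac{SSE}{\text{d.f.}} = \frac{43.25}{12} = 3.604$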
Bicycle Example:
Residual Diagnostics
To check the model assumptions, examine the residuals $e = y - \hat{y}$ for:
• Normality
• Equal error variance
proc glm plots=diagnostics data = biker;
class A B C;
model travel= A|B|C;
run;
Single Replicated Case
• Unreplicated case: n = 1
• Problems in statistical testing
  • Is an unusual response a real effect or noise? It may spoil the results.
  • 0 degrees of freedom for error
  • Formal tests and confidence intervals cannot be used, because there is no estimate of error with which to assess the effects
• Potential solutions
  • Pooling high-order interactions to estimate error
  • Graphical approach: normal plot of the estimated effects
• Estimated effects
  • Independent, orthogonal, normally distributed, with common variance $\frac{\sigma^2}{2^{k-2}}$
Single Replicated Case
• Effect sparsity principle
  • If the number of effects is large (e.g. k = 4, 15 effects), a majority of them are small, distributed $N(0, \sigma^2)$, and a few are large and more influential, distributed $N(\mu \neq 0, \sigma^2)$
• Reduced model
  • Retain only the significant effects and omit the non-significant ones
  • Obtain the sums of squares for the omitted effects and pool them into an error sum of squares (SSE), the error due to ignoring negligible effects
  • Error d.f. = number of pooled omitted effects
  • MSE = SSE / error d.f.
  • Perform formal statistical inferences
Other Types of Two-Factor Experiments
Section 13.3
Two-Factor Experiments with
(Crossed and) Mixed Factors
• A is fixed factor with a levels
• B is random factor with b levels
• Assume a balanced design with n ≥ 2 obs’s at each
of (a x b) treatment combinations
Example:
• Compare three testing laboratories
• Material tested comes in batches
• Several samples from each batch tested in each laboratory
• Laboratories represent a fixed factor
• Batches represent a random factor
• Two factors are crossed, since samples are tested from
each batch in each laboratory
• Model?
Mixed Effects Model
• $Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$
  • $\mu$, $\tau_i$ are fixed parameters
  • $\beta_j$, $(\tau\beta)_{ij}$ are random parameters
  • $\epsilon_{ijk}$ are i.i.d. $N(0, \sigma^2)$ random errors
The (Probability) Distribution of the Random Effects
• The random $\beta_j$ are the main effects of B, which are assumed to be i.i.d. $N(0, \sigma_\beta^2)$, where $\sigma_\beta^2$ is called the variance component of the B (random factor) main effect. The density of $\beta_j$ is therefore
$f_{\beta_j}(x) = \frac{1}{\sqrt{2\pi\sigma_\beta^2}} \exp\left(-\frac{x^2}{2\sigma_\beta^2}\right)$
Variance Components Model
• $Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$
• $Var(Y_{ijk}) = \sigma_Y^2 = \sigma_B^2 + \sigma_{AB}^2 + \sigma^2$
• SST = SSA + SSB + SSAB + SSE (same as fixed-effects model)
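Expected Mean Squares
• $E(MSA) = \sigma^2 + n\sigma_{AB}^2 + \frac{bn\sum_{i=1}^{a} \tau_i^2}{a - 1}$
• $E(MSB) = \sigma^2 + n\sigma_{AB}^2 + an\sigma_B^2$
• $E(MSAB) = \sigma^2 + n\sigma_{AB}^2$
• $E(MSE) = \sigma^2$
Unbiased estimators of variance components
• $\hat{\sigma}^2 = MSE$
• $\hat{\sigma}_{AB}^2 = (MSAB - \hat{\sigma}^2)/n$
• $\hat{\sigma}_B^2 = (MSB - \hat{\sigma}^2 - n\hat{\sigma}_{AB}^2)/(an)$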
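The three estimators translate directly into code. The sketch below is illustrative only: the function name `variance_components` and the mean squares passed to it are hypothetical values chosen to make the arithmetic easy to follow, not results from any experiment in these slides.

```python
def variance_components(msb, msab, mse, a, n):
    """Unbiased (method-of-moments) estimates for the mixed model with A fixed, B random."""
    sigma2 = mse
    sigma2_ab = (msab - sigma2) / n
    sigma2_b = (msb - sigma2 - n * sigma2_ab) / (a * n)   # algebraically (MSB - MSAB) / (an)
    return {"error": sigma2, "AB": sigma2_ab, "B": sigma2_b}

# Hypothetical mean squares for a design with a = 3 fixed levels and n = 2 replicates.
est = variance_components(msb=20.0, msab=5.0, mse=2.0, a=3, n=2)
print(est)
```

These estimators are unbiased but can come out negative when a mean square is smaller than the one subtracted from it; the sketch returns such values unchanged rather than truncating them at zero.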
Common tests
• H0A: τ1 = … =τa = 0
vs.
H1A: At least one τi ≠ 0
• H0B: σ2B = 0
vs.
H1B: σ2B > 0
• H0AB: σ2AB = 0
vs.
H1AB: σ2AB > 0
Common tests: results
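• Reject $H_{0A}$ if $F_A = MSA/MSAB > f_{a-1,\,(a-1)(b-1),\,\alpha}$
• Reject $H_{0B}$ if $F_B = MSB/MSAB > f_{b-1,\,(a-1)(b-1),\,\alpha}$
• Reject $H_{0AB}$ if $F_{AB} = MSAB/MSE > f_{(a-1)(b-1),\,\nu,\,\alpha}$, where $\nu = N - ab$ is the error d.f.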
Two-Factor Experiments w. Nested and
Mixed Factors
• Model:
• Where,
Two-Factor Experiments w. Nested and
Mixed Factors
• Orthogonal Decomposition of Sum of Squares
Two-Factor Experiments w. Nested and
Mixed Factors
• ANOVA Table
Illustrative Example
• Consider the Following Experiment:
~ A → Concentration of Reactant
~ B → Concentration of Catalyst
Analysis with SAS
• Code
Analysis with SAS
• Selected Output
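Summary
• Two factor experiments with multiple levels
• Model: $Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$
• We can decompose the Sum of Squares as: SST = SSA + SSB + SSAB + SSE
• And compute test statistics under H0, as: $F_A = \frac{MSA}{MSE}$, $F_B = \frac{MSB}{MSE}$, $F_{AB} = \frac{MSAB}{MSE}$
Summary
• 2^k Factorial Experiments
• k factors, 2 levels each
• Calculate the Sum of Squares due to an effect as $SS_{\text{Effect}} = n 2^{k-2} (\text{Est. Effect})^2$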
Acknowledgements
• Tamhane, Ajit C., and Dorothy D. Dunlop. "Analysis of
Multifactor Experiments." Statistics and Data Analysis: From
Elementary to Intermediate. Upper Saddle River, NJ: Prentice
Hall, 2000.
• Cody, Ronald P., and Jeffrey K. Smith. "Analysis of Variances:
Two Independent Variables." Applied Statistics and the SAS
Programming Language. 5th ed. Upper Saddle River, NJ:
Prentice Hall, 2006.
• Prof. Wei Zhu
• Previous Presentations