### Chapter 6: The 2^k Factorial Design

```
6.1 Introduction
• A special case of the general factorial design
(Chapter 5)
• k factors and each factor has only two levels
• Levels:
– quantitative (temperature, pressure,…), or
qualitative (machine, operator,…)
– High and low
– Each replicate has 2 × 2 × ... × 2 = 2^k observations
• Assumptions: (1) the factors are fixed, (2) the design
is completely randomized, and (3) the usual
normality assumptions are satisfied
• Widely used in factor screening experiments
6.2 The 2^2 Factorial Design
• Two factors, A and B, and each factor has two
levels, low and high.
• Example: the concentration of reactant vs. the
amount of catalyst (page 219)
• “–” and “+” denote the low
and high levels of a factor,
respectively
• Low and high are arbitrary
terms
• Geometrically, the four runs
form the corners of a square
• Factors can be quantitative or
qualitative, although their
treatment in the final model
will be different
• Average effect of a factor = the change in response
produced by a change in the level of that factor,
averaged over the levels of the other factors.
• (1), a, b, and ab denote the totals of the n replicates
taken at the corresponding treatment combinations.
• The main effects:
    A = \frac{1}{2n}\{[ab - b] + [a - (1)]\}
      = \frac{1}{2n}[ab + a - b - (1)]
      = \frac{ab + a}{2n} - \frac{b + (1)}{2n}
      = \bar{y}_{A^+} - \bar{y}_{A^-}

    B = \frac{1}{2n}\{[ab - a] + [b - (1)]\}
      = \frac{1}{2n}[ab + b - a - (1)]
      = \frac{ab + b}{2n} - \frac{a + (1)}{2n}
      = \bar{y}_{B^+} - \bar{y}_{B^-}
• The interaction effect:
    AB = \frac{1}{2n}\{[ab - b] - [a - (1)]\}
       = \frac{1}{2n}[ab + (1) - a - b]
       = \frac{ab + (1)}{2n} - \frac{b + a}{2n}
• In that example, A = 8.33, B = -5.00 and AB =
1.67
• Analysis of Variance
• The total effects (contrasts):
    Contrast_A    = ab + a - b - (1)
    Contrast_B    = ab + b - a - (1)
    Contrast_{AB} = ab + (1) - a - b
• Sum of squares:
    SS_A    = \frac{[ab + a - b - (1)]^2}{4n}
    SS_B    = \frac{[ab + b - a - (1)]^2}{4n}
    SS_{AB} = \frac{[ab + (1) - b - a]^2}{4n}
    SS_T    = \sum_{i=1}^{2}\sum_{j=1}^{2}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{4n}
    SS_E    = SS_T - SS_A - SS_B - SS_{AB}
Response: Conversion
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model                 291.67    3         97.22     24.82     0.0002
A                     208.33    1        208.33     53.19   < 0.0001
B                      75.00    1         75.00     19.15     0.0024
AB                      8.33    1          8.33      2.13     0.1828
Pure Error             31.33    8          3.92
Cor Total             323.00   11

Std. Dev.    1.98     R-Squared         0.9030
Mean        27.50     Adj R-Squared     0.8666
C.V.         7.20     Pred R-Squared    0.7817
PRESS       70.50     Adeq Precision    11.669

The F-test for the “model” source is testing the significance of the
overall model; that is, is either A, B, or AB or some combination of
these effects important?
• Table of plus and minus signs:

Treatment      Factorial Effect
Combination    I    A    B    AB
(1)            +    –    –    +
a              +    +    –    –
b              +    –    +    –
ab             +    +    +    +
• The regression model:
    y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon
– x_1 and x_2 are coded variables that represent the
two factors, i.e., x_1 and x_2 take only the values
–1 and +1.
– Use the method of least squares to estimate
the coefficients
– For that example,
    \hat{y} = 27.5 + \left(\frac{8.33}{2}\right) x_1 + \left(\frac{-5.00}{2}\right) x_2
– Model adequacy: residuals (pages 224–225)
and normal probability plot (Figure 6.2)
• Response surface plot:
    \hat{y} = 18.33 + 0.8333\,\text{Conc} - 5.00\,\text{Catalyst}
– Figure 6.3
6.3 The 2^3 Design
• Three factors, A, B and C, and each factor has two
levels. (Figure 6.4 (a))
• Design matrix (Figure 6.4 (b))
• (1), a, b, ab, c, ac, bc, abc
• 7 degrees of freedom: 1 for each main effect and
1 for each interaction
• Estimate main effect:
    A = \frac{1}{4n}[a - (1) + ab - b + ac - c + abc - bc]
      = \bar{y}_{A^+} - \bar{y}_{A^-}
      = \frac{a + ab + ac + abc}{4n} - \frac{(1) + b + c + bc}{4n}
      = \frac{1}{4n}[a + ab + ac + abc - (1) - b - c - bc]
• Estimate two-factor interaction: one-half the difference
between the average A effects at the two levels of B
    AB = \frac{1}{4n}[abc - bc + ab - b - ac + c - a + (1)]
       = \frac{abc + ab + c + (1)}{4n} - \frac{bc + b + ac + a}{4n}
• Three-factor interaction:
    ABC = \frac{1}{4n}\{[abc - bc] - [ac - c] - [ab - b] + [a - (1)]\}
        = \frac{1}{4n}[abc - bc - ac + c - ab + b + a - (1)]
• Contrast: Table 6.3
– Equal number of plus and minus signs in every column
(except column I)
– The inner product of any two columns = 0
– I is an identity element
– The product of any two columns yields another
column
– Orthogonal design
• Sum of squares: SS = (Contrast)^2 / (8n)
Table of – and + Signs for the 2^3 Factorial Design (pg. 231)

Treatment      Factorial Effect
Combination    I     A     B     AB    C     AC    BC    ABC
(1)            +     –     –     +     –     +     +     –
a              +     +     –     –     –     –     +     +
b              +     –     +     –     –     +     –     +
ab             +     +     +     +     –     –     –     –
c              +     –     –     +     +     –     –     +
ac             +     +     –     –     +     +     –     –
bc             +     –     +     –     +     –     +     –
abc            +     +     +     +     +     +     +     +
Contrast             24    18    6     14    2     4     4
Effect               3.00  2.25  0.75  1.75  0.25  0.50  0.50
• Example 6.1
A = carbonation, B = pressure, C = speed, y = fill deviation
• Estimation of Factor Effects

         Term         Effect   SumSqr   % Contribution
Model    Intercept
Error    A            3        36       46.1538
Error    B            2.25     20.25    25.9615
Error    C            1.75     12.25    15.7051
Error    AB           0.75     2.25     2.88462
Error    AC           0.25     0.25     0.320513
Error    BC           0.5      1        1.28205
Error    ABC          0.5      1        1.28205
Error    LOF                   0
Error    P Error               5        6.41026

Lenth's ME     1.25382
Lenth's SME    1.88156
• ANOVA Summary – Full Model
Response: Fill-deviation
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model                  73.00    7         10.43     16.69     0.0003
A                      36.00    1         36.00     57.60   < 0.0001
B                      20.25    1         20.25     32.40     0.0005
C                      12.25    1         12.25     19.60     0.0022
AB                      2.25    1          2.25      3.60     0.0943
AC                      0.25    1          0.25      0.40     0.5447
BC                      1.00    1          1.00      1.60     0.2415
ABC                     1.00    1          1.00      1.60     0.2415
Pure Error              5.00    8          0.63
Cor Total              78.00   15

Std. Dev.    0.79     R-Squared         0.9359
Mean         1.00     Adj R-Squared     0.8798
C.V.        79.06     Pred R-Squared    0.7436
PRESS       20.00     Adeq Precision    13.416
• The regression model and response surface:
– The regression model:
    \hat{y} = 1.00 + \left(\frac{3.00}{2}\right) x_1 + \left(\frac{2.25}{2}\right) x_2 + \left(\frac{1.75}{2}\right) x_3 + \left(\frac{0.75}{2}\right) x_1 x_2
– Response surface and contour plot (Figure 6.7)

                 Coefficient          Standard   95% CI   95% CI
Factor           Estimate       DF    Error      Low      High
Intercept        1.00           1     0.20       0.55     1.45
A-Carbonation    1.50           1     0.20       1.05     1.95
B-Pressure       1.13           1     0.20       0.68     1.57
C-Speed          0.88           1     0.20       0.43     1.32
AB               0.38           1     0.20       -0.072   0.82
• Contour & Response Surface Plots – Speed at the
High Level
[Figure: Design-Expert contour plot and response surface plot of
fill deviation; X = A: Carbonation, Y = B: Pressure;
actual factor C: Speed = 250.00]
• Refine Model – Remove Nonsignificant Factors
Response: Fill-deviation
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model                  70.75    4         17.69     26.84   < 0.0001
A                      36.00    1         36.00     54.62   < 0.0001
B                      20.25    1         20.25     30.72     0.0002
C                      12.25    1         12.25     18.59     0.0012
AB                      2.25    1          2.25      3.41     0.0917
Residual                7.25   11          0.66
  LOF                   2.25    3          0.75      1.20     0.3700
  Pure E                5.00    8          0.63
C Total                78.00   15

Std. Dev.    0.81     R-Squared         0.9071
Mean         1.00     Adj R-Squared     0.8733
C.V.        81.18     Pred R-Squared    0.8033
PRESS       15.34     Adeq Precision    15.424
6.4 The General 2^k Design
• k factors and each factor has two levels
• Interactions:
    \binom{k}{2} two-factor interactions
    \binom{k}{3} three-factor interactions
    1 k-factor interaction
• The standard order for a 2^4 design: (1), a, b, ab, c,
ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd
• The general approach for the statistical analysis:
– Estimate factor effects
– Form initial model (full model)
– Perform analysis of variance (Table 6.9)
– Refine the model
– Analyze residuals
– Interpret results
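• Contrast:
    Contrast_{ABC \cdots K} = (a \pm 1)(b \pm 1) \cdots (k \pm 1)
  (use the minus sign for each factor included in the effect and the
  plus sign otherwise; after expanding, replace 1 by (1))
• Effect estimate and sum of squares:
    ABC \cdots K = \frac{2}{n 2^k}\,Contrast_{ABC \cdots K}
    SS_{ABC \cdots K} = \frac{1}{n 2^k}\,(Contrast_{ABC \cdots K})^2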
6.5 A Single Replicate of the 2^k
Design
• These are 2^k factorial designs
with one observation at each
corner of the “cube”
• An unreplicated 2^k factorial
design is also sometimes called a
“single replicate” of the 2^k
• If the factor levels are spaced too
closely, the noise is more likely to
overwhelm the signal in the data
• Lack of replication causes potential problems in
statistical testing
– Replication admits an estimate of “pure error”
(a better phrase is an internal estimate of
error)
– With no replication, fitting the full model
results in zero degrees of freedom for error
• Potential solutions to this problem
– Pooling high-order interactions to estimate
error (sparsity of effects principle)
– Normal probability plotting of effects
(Daniel, 1959)
• Example 6.2 (A single replicate of the 2^4 design)
– A 2^4 factorial was used to investigate the effects
of four factors on the filtration rate of a resin
– The factors are A = temperature, B = pressure,
C = concentration of formaldehyde, D = stirring
rate
• Estimates of the effects

         Term         Effect      SumSqr     % Contribution
Model    Intercept
Error    A            21.625      1870.56    32.6397
Error    B            3.125       39.0625    0.681608
Error    C            9.875       390.062    6.80626
Error    D            14.625      855.563    14.9288
Error    AB           0.125       0.0625     0.00109057
Error    AC           -18.125     1314.06    22.9293
Error    AD           16.625      1105.56    19.2911
Error    BC           2.375       22.5625    0.393696
Error    BD           -0.375      0.5625     0.00981515
Error    CD           -1.125      5.0625     0.0883363
Error    ABC          1.875       14.0625    0.245379
Error    ABD          4.125       68.0625    1.18763
Error    ACD          -1.625      10.5625    0.184307
Error    BCD          -2.625      27.5625    0.480942
Error    ABCD         1.375       7.5625     0.131959

Lenth's ME     6.74778
Lenth's SME    13.699
• The normal probability plot of the effects
[Figure: Design-Expert normal plot for filtration rate
(A: Temperature, B: Pressure, C: Concentration, D: Stirring Rate);
x-axis: Effect, y-axis: Normal % probability;
labelled effects: A, AD, D, C, AC]
[Figure: Design-Expert interaction graphs for filtration rate.
Left: X = A: Temperature, Y = C: Concentration
(B: Pressure = 0.00, D: Stirring Rate = 0.00).
Right: X = A: Temperature, Y = D: Stirring Rate
(B: Pressure = 0.00, C: Concentration = 0.00)]
• B is not significant and all interactions
involving B are negligible
• Design projection: 2^4 design => 2^3 design in
A, C, and D
• ANOVA table (Table 6.13)
Response: Filtration Rate
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model                5535.81    5       1107.16     56.74   < 0.0001
A                    1870.56    1       1870.56     95.86   < 0.0001
C                     390.06    1        390.06     19.99     0.0012
D                     855.56    1        855.56     43.85   < 0.0001
AC                   1314.06    1       1314.06     67.34   < 0.0001
AD                   1105.56    1       1105.56     56.66   < 0.0001
Residual              195.12   10         19.51
Cor Total            5730.94   15

Std. Dev.     4.42     R-Squared         0.9660
Mean         70.06     Adj R-Squared     0.9489
C.V.          6.30     Pred R-Squared    0.9128
PRESS       499.52     Adeq Precision    20.841
• The regression model:
Final Equation in Terms of Coded Factors:
Filtration Rate
=
+70.06250
+10.81250 * Temperature
+4.93750 * Concentration
+7.31250 * Stirring Rate
-9.06250 * Temperature * Concentration
+8.31250 * Temperature * Stirring Rate
• Residual Analysis (P. 251)
• Response surface (P. 252)
[Figure: Design-Expert normal plot of residuals for filtration rate;
x-axis: Studentized Residuals, y-axis: Normal % probability]
• Half-normal plot: plots the absolute values of the effect
estimates against their cumulative normal
probabilities.
[Figure: Design-Expert half-normal plot for filtration rate
(A: Temperature, B: Pressure, C: Concentration, D: Stirring Rate);
x-axis: |Effect|, y-axis: Half Normal % probability;
labelled effects: A, AC, AD, D, C]
• Example 6.3 (Data transformation in a Factorial
Design)
A = drill load, B = flow, C = speed, D = type of mud,
y = advance rate of the drill
• The normal probability plot of the effect estimates
[Figure: Design-Expert half-normal plot for adv._rate
(A: load, B: flow, C: speed, D: mud);
x-axis: |Effect|, y-axis: Half Normal % probability;
labelled effects: B, C, D, BD, BC]
• Residual analysis
[Figure: Design-Expert normal plot of residuals and plot of
residuals vs. predicted values for adv._rate]
• The residual plots indicate that there are problems
with the equality of variance assumption
• The usual approach to this problem is to employ a
transformation on the response
• In this example,
    y^* = \ln y
[Figure: Design-Expert half-normal plot for Ln(adv._rate)
(A: load, B: flow, C: speed, D: mud);
x-axis: |Effect|, y-axis: Half Normal % probability;
labelled effects: B, C, D]
• Three main effects are large
• No indication of large interaction effects
• What happened to the interactions?
Response: adv._rate
Transform: Natural log    Constant: 0.000
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model                   7.11    3          2.37    164.82   < 0.0001
B                       5.35    1          5.35    371.49   < 0.0001
C                       1.34    1          1.34     93.05   < 0.0001
D                       0.43    1          0.43     29.92     0.0001
Residual                0.17   12         0.014
Cor Total               7.29   15

Std. Dev.    0.12     R-Squared         0.9763
Mean         1.60     Adj R-Squared     0.9704
C.V.         7.51     Pred R-Squared    0.9579
PRESS        0.31     Adeq Precision    34.391
• Following Log transformation
Final Equation in Terms of Coded Factors:
Ln(adv._rate) =
+1.60
+0.58 * B
+0.29 * C
+0.16 * D
[Figure: Design-Expert normal plot of residuals and plot of
residuals vs. predicted values for Ln(adv._rate)]
• Example 6.4:
– Two factors (A and D) affect the mean number
of defects
– A third factor (B) affects variability
– Residual plots were useful in identifying the
dispersion effect
– The magnitude of the dispersion effects:
    F_i^* = \ln \frac{S^2(i^+)}{S^2(i^-)}
– When the variances of the residuals at the high (+) and
low (–) levels of factor i are equal, this statistic has an
approximate normal distribution
6.6 The Addition of Center Points to
the 2^k Design
• Based on the idea of replicating some of the runs
in a factorial design
• Runs at the center provide an estimate of error and
allow the experimenter to distinguish between two
possible models:
    First-order model (interaction):
      y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i}^{k} \beta_{ij} x_i x_j + \epsilon
    Second-order model:
      y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i}^{k} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \epsilon
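    \bar{y}_F = \bar{y}_C \Rightarrow no "curvature"
  (\bar{y}_F: average of the n_F factorial runs;
   \bar{y}_C: average of the n_C center runs)
The hypotheses are:
    H_0: \sum_{i=1}^{k} \beta_{ii} = 0
    H_1: \sum_{i=1}^{k} \beta_{ii} \neq 0
    SS_{Pure Quad} = \frac{n_F n_C (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}
This sum of squares has a
single degree of freedom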
• Example 6.6
– n_C = 5
– Usually between 3 and 6 center points will work well
– Design-Expert provides the analysis, including the F-test
for pure quadratic curvature
Response: yield
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model                   2.83    3          0.94     21.92     0.0060
A                       2.40    1          2.40     55.87     0.0017
B                       0.42    1          0.42      9.83     0.0350
AB                2.500E-003    1    2.500E-003     0.058     0.8213
Curvature         2.722E-003    1    2.722E-003     0.063     0.8137
Pure Error              0.17    4         0.043
Cor Total               3.00    8

Std. Dev.    0.21     R-Squared         0.9427
Mean        40.44     Adj R-Squared     0.8996
C.V.         0.51     Pred R-Squared    N/A
PRESS        N/A      Adeq Precision    14.234
• If curvature is significant, augment the design with
axial runs to create a central composite design.
The CCD is a very effective design for fitting a
second-order response surface model
```
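
The effect and sum-of-squares formulas of Section 6.2 can be checked numerically. The following is a minimal sketch, not the textbook's procedure: the function name `effects_2x2` is hypothetical, and the treatment totals (1) = 80, a = 100, b = 60, ab = 90 with n = 3 are assumed values that do not appear in the slides; they were chosen because they reproduce the quoted effects (8.33, -5.00, 1.67) and sums of squares (208.33, 75.00, 8.33).

```python
def effects_2x2(one, a, b, ab, n):
    """Effects and sums of squares for a 2^2 design from treatment totals.

    one, a, b, ab are the totals of the n replicates at each treatment
    combination; `one` stands for the total written (1) in the slides.
    """
    contrasts = {
        "A": ab + a - b - one,
        "B": ab + b - a - one,
        "AB": ab + one - a - b,
    }
    # Effect = contrast / (2n); sum of squares = contrast^2 / (4n)
    effects = {name: c / (2 * n) for name, c in contrasts.items()}
    sums_of_squares = {name: c ** 2 / (4 * n) for name, c in contrasts.items()}
    return effects, sums_of_squares


# Assumed totals (not given in the slides), n = 3 replicates
effects, ss = effects_2x2(one=80, a=100, b=60, ab=90, n=3)
for name in ("A", "B", "AB"):
    print(f"{name}: effect = {effects[name]:.2f}, SS = {ss[name]:.2f}")
```

Working from the three contrasts keeps the effect and the sum of squares tied to the same quantity, which mirrors how the slides derive both from the table of plus and minus signs.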
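
The table of plus and minus signs in Sections 6.3 and 6.4 can be generated for any k rather than written by hand. The sketch below is one possible way to do it; `sign_table` and the factor labels are hypothetical names, and n = 2 replicates is inferred from the 15 total degrees of freedom in the Example 6.1 ANOVA table. The contrasts are the ones quoted in the 2^3 sign table.

```python
from itertools import combinations


def sign_table(k):
    """Build the table of plus and minus signs for a 2^k design.

    Returns (runs, columns): `runs` lists the treatment combinations in
    standard order, and `columns` maps each effect label (e.g. "AB") to
    its list of +1/-1 signs, one per run.
    """
    # Hypothetical labels; "I" is skipped so it cannot clash with the identity column
    letters = "ABCDEFGHJK"[:k]
    runs, levels = [], []
    for r in range(2 ** k):
        # Bit i of r is the level of factor i: 0 = low (-1), 1 = high (+1)
        row = [1 if (r >> i) & 1 else -1 for i in range(k)]
        levels.append(row)
        label = "".join(letters[i].lower() for i in range(k) if row[i] == 1)
        runs.append(label or "(1)")
    columns = {}
    for size in range(1, k + 1):
        for combo in combinations(range(k), size):
            name = "".join(letters[i] for i in combo)
            col = []
            for row in levels:
                sign = 1
                for i in combo:
                    sign *= row[i]  # interaction column = product of factor columns
                col.append(sign)
            columns[name] = col
    return runs, columns


runs, columns = sign_table(3)
print(runs)

# Orthogonality: the inner product of any two distinct columns is zero
assert all(
    sum(x * y for x, y in zip(columns[p], columns[q])) == 0
    for p, q in combinations(list(columns), 2)
)

# Contrasts quoted in the 2^3 sign table (Example 6.1); n = 2 is inferred
n, k = 2, 3
contrasts = {"A": 24, "B": 18, "AB": 6, "C": 14, "AC": 2, "BC": 4, "ABC": 4}
for name, c in contrasts.items():
    effect = c / (n * 2 ** (k - 1))  # effect = 2 * contrast / (n 2^k)
    ss = c ** 2 / (n * 2 ** k)       # SS = contrast^2 / (n 2^k)
    print(f"{name}: effect = {effect:.2f}, SS = {ss:.2f}")
```

Encoding each run as the bits of an integer yields the standard order (1), a, b, ab, c, ... directly, so no separate sorting step is needed.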
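
In coded units each regression coefficient is half the corresponding effect estimate, which is how the final equation of Example 6.2 follows from the effects table. A short sketch of that relationship, using only the effect estimates and intercept quoted in the slides; the helper name `predict` and the dictionary layout are illustrative assumptions.

```python
# Effect estimates quoted in the Example 6.2 table (retained terms only)
effects = {"A": 21.625, "C": 9.875, "D": 14.625, "AC": -18.125, "AD": 16.625}
grand_mean = 70.0625  # intercept of the final equation in coded factors

# In coded units each regression coefficient is half the effect estimate
coefficients = {term: e / 2 for term, e in effects.items()}


def predict(levels):
    """Predicted filtration rate at coded levels, e.g. {"A": 1, "C": -1, "D": 1}."""
    y_hat = grand_mean
    for term, beta in coefficients.items():
        x = 1
        for factor in term:  # "AC" -> x_A * x_C
            x *= levels[factor]
        y_hat += beta * x
    return y_hat


print(coefficients)
print(predict({"A": 1, "C": -1, "D": 1}))    # 100.625
print(predict({"A": -1, "C": -1, "D": -1}))  # 46.25
```

Because B was dropped from the model, the prediction depends only on the coded levels of A, C, and D, which is the design projection described in the slides.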
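
The dispersion statistic of Example 6.4 compares the spread of the residuals at the high and low levels of one factor. The sketch below shows the calculation only: the function name `dispersion_effect` is hypothetical, and the residuals and sign column are invented for illustration, not data from the example.

```python
import math
from statistics import variance


def dispersion_effect(residuals, signs):
    """F*_i = ln(S^2(i+) / S^2(i-)) for one column of the sign table."""
    plus = [r for r, s in zip(residuals, signs) if s > 0]
    minus = [r for r, s in zip(residuals, signs) if s < 0]
    # statistics.variance is the sample variance (divisor n - 1)
    return math.log(variance(plus) / variance(minus))


# Hypothetical residuals for eight runs, invented purely for illustration
residuals = [0.5, -2.0, -0.4, 1.9, 0.3, -2.2, -0.1, 2.0]
signs_b = [-1, 1, -1, 1, -1, 1, -1, 1]  # assumed sign column for the factor
f_star = dispersion_effect(residuals, signs_b)
print(f"F* = {f_star:.2f}")
```

A value well above zero suggests more variability at the high level of the factor, and a value well below zero suggests more at the low level.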
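
The single-degree-of-freedom curvature test of Section 6.6 is a one-line calculation. In the sketch below, `pure_quadratic_ss` is a hypothetical helper; the averages 40.425 (factorial points) and 40.46 (center points) are assumed values that are not given in the slides but are consistent with the quoted curvature sum of squares (2.722E-003) and overall mean (40.44). n_F = 4 is inferred from the 2^2 design with n_C = 5 and 8 total degrees of freedom.

```python
def pure_quadratic_ss(y_bar_f, y_bar_c, n_f, n_c):
    """Single-degree-of-freedom sum of squares for pure quadratic curvature."""
    return n_f * n_c * (y_bar_f - y_bar_c) ** 2 / (n_f + n_c)


# Assumed averages (not given in the slides)
ss_curv = pure_quadratic_ss(y_bar_f=40.425, y_bar_c=40.46, n_f=4, n_c=5)
ms_error = 0.043  # pure-error mean square from the Example 6.6 ANOVA table

print(f"SS_PureQuad = {ss_curv:.6f}")
print(f"F = {ss_curv / ms_error:.3f}")
```

Since the sum of squares has one degree of freedom, it is also its own mean square, so dividing by the pure-error mean square gives the F statistic for curvature.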