### Slide 1

```
Analysis of variance and statistical inference
Plot | Light intensity | Biomass
1 | A | 10
2 | A | 11
3 | A | 7
4 | A | 9
5 | A | 9
6 | B | 4
7 | B | 6
8 | B | 3
9 | B | 8
10 | B | 5
11 | B | 9
12 | B | 5
13 | C | 15
14 | C | 10
15 | C | 4
16 | C | 11
17 | C | 6
18 | C | 9
19 | D | 9
20 | D | 5
21 | D | 7
22 | D | 3
23 | D | 5
24 | D | 6
25 | D | 7

Sums of squares for k groups (light intensities A to D) with n_i plots each:

SS_{total} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{i,j} - \bar{x}_{total})^2

SS_{between} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x}_{total})^2

SS_{within} = \sum_{i=1}^{k} \left[ \sum_{j=1}^{n_i} (x_{i,j} - \bar{x}_i)^2 \right]


SS_{total} = SS_{between} + SS_{within}

df_{total} = df_{between} + df_{within}

F = \frac{\sigma^2_{between}}{\sigma^2_{within}} = \frac{SS_{between} / (k - 1)}{SS_{within} / (N - k)} = \frac{N - k}{k - 1} \cdot \frac{SS_{between}}{SS_{within}}

Worked example: four treatments with five observations each

Observations | A | B | C | D
1 | 0.08 | 0.19 | 0.83 | 2.80
2 | 0.71 | 1.21 | 0.71 | 2.69
3 | 0.19 | 1.97 | 1.10 | 1.93
4 | 0.51 | 0.19 | 0.11 | 2.57
5 | 0.73 | 0.19 | 0.30 | 2.58
Group mean | 0.445 | 0.750 | 0.611 | 2.515

SS within (squared deviation of each observation from its group mean)
1 | 0.131 | 0.319 | 0.046 | 0.081
2 | 0.070 | 0.216 | 0.010 | 0.032
3 | 0.065 | 1.484 | 0.244 | 0.342
4 | 0.004 | 0.314 | 0.250 | 0.004
5 | 0.082 | 0.312 | 0.096 | 0.004
Total SS within: 4.11

SS between (squared deviation of the group mean from the grand mean, entered once per observation)
1 to 5 | 0.404 | 0.109 | 0.220 | 2.059
Total SS between: 13.96

Grand mean: 1.08

Grand SS (squared deviation of each observation from the grand mean)
1 | 1.00 | 0.80 | 0.06 | 2.96
2 | 0.14 | 0.02 | 0.14 | 2.61
3 | 0.79 | 0.79 | 0.00 | 0.72
4 | 0.32 | 0.79 | 0.94 | 2.23
5 | 0.12 | 0.79 | 0.61 | 2.24
Grand SS: 18.07
SS between + SS within: 18.07
F: 18.14
F-test (p): 2.118E-05

The sums follow SS_{between} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x}_{total})^2, SS_{within} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{i,j} - \bar{x}_i)^2 and SS_{total} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{i,j} - \bar{x}_{total})^2.

Repetitive designs
In medical research we test patients before and after a medical treatment to infer the influence of the therapy.
We have to divide the total variance (SS_total) into a part that contains the variance between patients (SS_between) and a part within the patients (SS_within).
The latter can be divided into a part that comes from the treatment (SS_treat) and the error (SS_error).

[Diagram: SS_total splits into SS_between and SS_within; SS_within splits into SS_treat (medical treatment) and SS_error]

Notation: n patients, k treatments, x_{ij} the value of patient i under treatment j, \bar{P}_i the mean of patient i, \bar{T}_j the mean of treatment j, \bar{x} the grand mean.

SS_{total} = SS_{between} + SS_{within} = SS_{between} + SS_{treat} + SS_{error}

df_{total} = df_{between} + df_{within} = df_{between} + df_{treat} + df_{error}

kn - 1 = (n - 1) + n(k - 1) = (n - 1) + (k - 1) + (n - 1)(k - 1)

SS_{total} = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{x})^2

SS_{between} = k \sum_{i=1}^{n} (\bar{P}_i - \bar{x})^2

SS_{within} = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{P}_i)^2

SS_{treat} = n \sum_{j=1}^{k} (\bar{T}_j - \bar{x})^2

SS_{error} = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{P}_i - \bar{T}_j + \bar{x})^2

F = \frac{SS_{treat} \, df_{error}}{SS_{error} \, df_{treat}} = \frac{n \sum_{j=1}^{k} (\bar{T}_j - \bar{x})^2 / (k - 1)}{\sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{P}_i - \bar{T}_j + \bar{x})^2 / ((n - 1)(k - 1))}

Worked example (n = 10 persons, k = 3 measurements: morning, afternoon, night)

Person | Morning | Afternoon | Night | Mean persons | Squares (SS between)
1 | 104 | 119 | 102 | 108.2 | 90.2
2 | 104 | 96 | 95 | 98.3 | 0.1
3 | 84 | 114 | 96 | 98.0 | 0.4
4 | 97 | 114 | 92 | 101.1 | 5.9
5 | 81 | 111 | 95 | 95.5 | 10.2
6 | 108 | 116 | 88 | 104.3 | 31.6
7 | 89 | 96 | 93 | 92.6 | 36.5
8 | 82 | 96 | 85 | 87.9 | 116.6
9 | 84 | 129 | 91 | 101.2 | 6.4
10 | 95 | 120 | 84 | 99.7 | 1.0
Mean within | 92.8 | 111.0 | 92.2 | | sum 299.0
Grand mean: 98.7
SS between = k x 299.0 = 897.0

Squares for SS within, (x_{ij} - \bar{P}_i)^2
Person | Morning | Afternoon | Night
1 | 20.2 | 106.8 | 34.2
2 | 29.1 | 3.5 | 12.4
3 | 196.8 | 246.3 | 2.8
4 | 18.5 | 168.6 | 75.3
5 | 203.8 | 230.4 | 0.8
6 | 15.4 | 145.9 | 255.9
7 | 12.1 | 9.9 | 0.1
8 | 30.4 | 69.3 | 7.9
9 | 310.3 | 747.1 | 94.5
10 | 21.1 | 410.7 | 245.7
Sum | 857.6 | 2138.4 | 729.6
SS within = 3725.6, df = 20

Squares for SS treat, (\bar{T}_j - \bar{x})^2
Morning 34.8 | Afternoon 152.4 | Night 41.6 | sum 228.8
SS treat = n x 228.8 = 2288.0, df = 2

Squares for SS error, (x_{ij} - \bar{P}_i - \bar{T}_j + \bar{x})^2
Person | Morning | Afternoon | Night
1 | 2.0 | 4.0 | 0.4
2 | 127.5 | 202.3 | 8.6
3 | 66.1 | 11.2 | 22.9
4 | 2.5 | 0.4 | 5.0
5 | 70.2 | 8.0 | 30.7
6 | 96.4 | 0.1 | 91.2
7 | 5.8 | 84.6 | 46.0
8 | 0.1 | 16.2 | 13.2
9 | 137.3 | 224.6 | 10.7
10 | 1.7 | 62.7 | 85.2
SS error = 1437.6, df = 18

Grand SS: 4622.6
SS within + SS between: 4622.6
SS within: 3725.6
SS treat + SS error: 3725.6

F = (2288.0 / 2) / (1437.6 / 18) = 14.324
p(F): the slide lists 0.067, which is the upper-tail probability obtained when the two degrees of freedom are swapped (df = 18, 2). With df_treat = 2 and df_error = 18 the upper-tail probability of F = 14.324 is about 0.0002.

Ipsative data
Ipsative data express each value as the deviation from the mean of the respective person, so all person means are zero and SS between vanishes.

Person | Morning | Afternoon | Night | Mean persons | Squares (SS between)
1 | -4 | 10 | -6 | 0.0 | 0.0
2 | 5 | -2 | -4 | 0.0 | 0.0
3 | -14 | 16 | -2 | 0.0 | 0.0
4 | -4 | 13 | -9 | 0.0 | 0.0
5 | -14 | 15 | -1 | 0.0 | 0.0
6 | 4 | 12 | -16 | 0.0 | 0.0
7 | -3 | 3 | 0 | 0.0 | 0.0
8 | -6 | 8 | -3 | 0.0 | 0.0
9 | -18 | 27 | -10 | 0.0 | 0.0
10 | -5 | 20 | -16 | 0.0 | 0.0
Mean within | -5.9 | 12.3 | -6.4 | | sum 0.0
Grand mean: 0.0
SS between = 0.0

The squares for SS within, SS treat and SS error are the same as for the raw data:
SS within = 3725.6 (df = 20), SS treat = 2288.0 (df = 2), SS error = 1437.6 (df = 18), F = 14.324.

Grand SS: 3725.6
SS within + SS between: 3725.6
SS within: 3725.6
SS treat + SS error: 3725.6

Spiders from two Masurian lake ensembles
Summary statistics

Columns: Island | IslandEnsemble | Area | DistFromNearestMainland | Latitude | Longitude | Traps | Light | Temperature | SoilHumidity | SoilDispersion | SoilFertility | SoilAcidity | OrganicMatterContent | Disturbance | Species | Individuals | Individuals/trap

GórnaE | MNB | 0.7 | 200 | 53.63397 | 21.54726 | 4 | 3.04 | 3.67 | 3.20 | 3.25 | 3.78 | 3.92 | 2.11 | 1 | 33 | 149 | 37.25
Koń | MNB | 0.5 | 161 | 53.62628 | 21.51939 | 4 | 3.15 | 3.54 | 3.41 | 3.89 | 4.01 | 3.93 | 2.08 | 4 | 25 | 275 | 68.75
Kopanka | MNB | 0.69 | 131 | 53.62765 | 21.54238 | 4 | 3.37 | 3.52 | 3.24 | 3.11 | 3.50 | 3.82 | 2.08 | 1 | 34 | 349 | 87.25
KrólewskiOstrów | MNB | 6.15 | 123 | 53.63025 | 21.54168 | 12 | 3.23 | 3.66 | 3.24 | 3.68 | 3.94 | 3.99 | 2.00 | 2 | 51 | 920 | 76.67
Maleńka | MNB | 0.0003 | 40 | 53.64253 | 21.56479 | 1 | 3.84 | 3.52 | 4.55 | 3.77 | 4.11 | 4.04 | 2.49 | 3 | 6 | 12 | 12
MałaWierzba | MNB | 0.4 | 180 | 53.76169 | 21.60678 | 4 | 3.74 | 3.61 | 4.76 | 3.66 | 4.05 | 4.35 | 2.58 | 4 | 27 | 83 | 20.75
KopankaN | MNB | 0.18 | 237 | 53.63135 | 21.54744 | 3 | 3.55 | 3.67 | 3.08 | 3.27 | 3.74 | 3.82 | 2.02 | 1 | 32 | 92 | 30.67
Ośrodek | MNB | 0.09 | 5 | 53.63747 | 21.54389 | 2 | 3.47 | 3.54 | 4.11 | 3.81 | 4.04 | 3.89 | 2.38 | 4 | 22 | 128 | 64
Piaseczna | MNB | 0.63 | 290 | 53.68015 | 21.56987 | 11 | 3.49 | 3.65 | 3.33 | 3.98 | 4.16 | 3.76 | 2.06 | 1 | 38 | 616 | 56
Ruciane | MNB | 10 | 0 | 53.62833 | 21.52552 | 5 | 3.06 | 3.55 | 3.37 | 3.77 | 3.96 | 3.83 | 2.05 | 3 | 28 | 176 | 35.2
Mikołajki | MNB | 10 | 0 | 53.7854 | 21.58318 | 15 | 3.23 | 3.49 | 3.44 | 3.70 | 3.93 | 3.90 | 2.10 | 3 | 75 | 673 | 44.87
Śluza | MNB | 0.48 | 30 | 53.66271 | 21.5731 | 4 | 3.36 | 3.51 | 3.92 | 3.82 | 4.05 | 4.00 | 2.19 | 4 | 19 | 281 | 70.25
GórnaW | MNB | 0.44 | 287 | 53.6343 | 21.54433 | 5 | 3.13 | 3.56 | 3.24 | 3.48 | 3.94 | 3.79 | 2.06 | 1 | 29 | 204 | 40.8
Wierzba | MNB | 0.78 | 160 | 53.7592 | 21.6061 | 8 | 3.24 | 3.57 | 3.55 | 3.89 | 4.10 | 3.80 | 2.10 | 3 | 47 | 687 | 85.875
Wygryńska | MNB | 0.67 | 120 | 53.68694 | 21.56201 | 5 | 3.61 | 3.48 | 3.54 | 3.91 | 4.10 | 3.67 | 2.19 | 2 | 43 | 912 | 182.4
Bryzgiel | Wigry | 0.2 | 30 | 54.00219 | 23.07553 | 3 | 3.77 | 3.70 | 4.49 | 3.36 | 3.75 | 3.65 | 2.71 | 4 | 21 | 124 | 41.33
Bryzgiel | Wigry | 10 | 0 | 54.00886 | 23.09219 | 6 | 3.52 | 3.51 | 3.58 | 3.37 | 3.76 | 3.74 | 2.24 | 4 | 46 | 244 | 40.67
BrzozowaL | Wigry | 3.81 | 220 | 54.02619 | 23.10886 | 3 | 3.62 | 3.59 | 4.47 | 3.33 | 3.63 | 3.51 | 2.75 | 4 | 31 | 360 | 120
BrzozowaP | Wigry | 2.32 | 180 | 54.02658 | 23.12553 | 3 | 3.68 | 3.58 | 4.57 | 3.53 | 3.79 | 3.76 | 2.69 | 4 | 30 | 232 | 77.33
CimochowskiGrądzikC | Wigry | 0.15 | 40 | 54.05194 | 23.07553 | 3 | 3.57 | 3.55 | 4.36 | 3.51 | 3.80 | 3.73 | 2.60 | 4 | 25 | 188 | 62.67
CimochowskiGrądzikN | Wigry | 0.14 | 170 | 54.05203 | 23.07553 | 3 | 3.55 | 3.52 | 4.63 | 3.66 | 3.89 | 3.86 | 2.70 | 4 | 31 | 258 | 86
CimochowskiGrądzikS | Wigry | 0.76 | 70 | 54.04875 | 23.07553 | 3 | 3.61 | 3.65 | 4.20 | 3.51 | 3.89 | 3.79 | 2.50 | 4 | 25 | 170 | 56.67
Kamień | Wigry | 3.13 | 170 | 54.02625 | 23.12553 | 11 | 3.75 | 3.62 | 3.26 | 3.54 | 4.03 | 3.82 | 2.09 | 3 | 60 | 440 | 40
Krowa | Wigry | 4.49 | 120 | 54.01289 | 23.09219 | 6 | 3.72 | 3.57 | 4.64 | 3.36 | 3.72 | 3.56 | 2.79 | 4 | 34 | 347 | 57.83
MysiaWigry | Wigry | 1.55 | 60 | 54.07183 | 23.09219 | 6 | 3.76 | 3.63 | 3.85 | 3.70 | 4.00 | 3.82 | 2.33 | 3 | 49 | 386 | 64.33
Ordów | Wigry | 8.69 | 140 | 54.00739 | 23.05886 | 10 | 3.71 | 3.64 | 3.15 | 3.42 | 3.99 | 3.82 | 2.08 | 3 | 64 | 587 | 58.7
Ostrów | Wigry | 38.82 | 190 | 54.00636 | 23.07553 | 15 | 3.68 | 3.55 | 3.33 | 3.53 | 3.96 | 3.83 | 2.17 | 4 | 93 | 914 | 60.93
Rośków | Wigry | 0.56 | 100 | 54.00217 | 23.07553 | 3 | 3.66 | 3.65 | 4.24 | 3.24 | 3.63 | 3.72 | 2.51 | 4 | 24 | 154 | 51.33
Walędziak | Wigry | 0.76 | 30 | 54.00344 | 23.05886 | 3 | 3.70 | 3.56 | 4.41 | 3.49 | 3.81 | 3.85 | 2.59 | 4 | 28 | 88 | 29.33
WysokiWęgieł | Wigry | 10 | 0 | 54.03497 | 23.12553 | 10 | 3.45 | 3.57 | 3.59 | 3.44 | 3.86 | 3.72 | 2.27 | 4 | 57 | 307 | 30.7

Starting hypotheses
• The degree of disturbance (human impact) influences species richness.
• Species richness and abundance depend on island area and environmental factors.
• Island ensembles differ in species richness and abundance.
• Area, abundance, and species richness are non-linearly related.
• Latitude and longitude do not influence species richness.
Sorting
• Area, abundance, and species richness are non-linearly related.
• Latitude and longitude do not influence species richness.
• Species richness and abundance depend on island area and environmental factors.
• Island ensembles differ in species richness and abundance.
• The degree of disturbance (human impact) influences species richness.
The hypotheses are not independent.
Each hypothesis influences how the next one has to be treated.

• Area, abundance, and species richness are non-linearly related.
Species-area and individuals-area relationships

Latitude and longitude do not influence species richness.
Is species richness correlated with longitude and latitude?
Does the distance between islands influence species richness? Are geographically near islands also similar in species richness irrespective of island area?
R(S-Long) = 0.22 (n.s.)
R(S-Lat) = 0.28 (n.s.)
The absence of a significant correlation does not mean that latitude and longitude have no influence on the regression model with environmental variables.

Spatial autocorrelation
[Figure: map of the study sites S1 to S6]
In spatial autocorrelation the distance between study sites influences the response (dependent) variable. Spatially adjacent sites are then expected to be more similar with respect to the response variable.

Moran's I as a measure of spatial autocorrelation
Moran's I is similar to a correlation coefficient applied to all pairwise cells of a spatial matrix. It differs by weighting the covariance to account for spatial non-independence of cells with respect to distance.

I = \frac{N}{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}} \cdot \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} z_i z_j}{\sum_{i=1}^{N} z_i^2}

w_{ij} = \frac{1}{(1 + d_{ij})^2}

Here z_i is the deviation of the value at site i from the overall mean and d_{ij} is the distance between sites i and j.

[Figure: map of the study sites S1 to S6]
All combinations of sites
S1 | S2 | Distance
2 | 3 | 0.3
4 | 6 | 0.4
2 | 4 | 0.7
5 | 6 | 0.2
3 | 6 | 0.9
3 | 5 | 0.6

If cell values were randomly distributed (not spatially autocorrelated) the expected I is

E_0(I) = -\frac{1}{N - 1}

Statistical significance is calculated from a Monte Carlo simulation.

Individuals/trap is slightly spatially autocorrelated.
Latitude and longitude slightly influence species richness.
Even this weak effect might influence the outcome of a regression analysis.
[Figure: regression output for log-transformed variables, with panels "OLS result" and "Spatial autoregression result"]
Akaike information criterion: the lower the AIC, the more appropriate is the model.
Errors: too many variables!
Solution: prior factor analysis to reduce the number of dependent variables.
Stepwise variable reduction

Information criteria
[Figure: two scatter plots of Y against X with four fitted functions]
y = 16.113x^2 - 185.67x + 533.84, R² = 0.8953
y = -0.375x^4 + 14.462x^3 - 164.12x^2 + 609.02x - 356.84, R² = 0.9607
y = 45.502 e^{0.1932x}, R² = 0.5286
y = 90.901x, R² = 0.5747
What function fits best?
The more free parameters a model has, the higher R² will be.
The more parsimonious a model is, the smaller is the bias towards type I errors.
We have to find a compromise between goodness of fit and bias!
[Figure: explained variance and bias plotted against the number of model parameters (few to many); the curves mark the optimal number of model parameters]

The Akaike criterion of model choice

AIC = 2k - 2 \ln(L)

k: number of model parameters
L: maximum likelihood estimate of the model
The preferred model is the one with the lowest AIC.

If the parameter errors are normal and independent we get

AIC = 2k + n \ln(RSS / n) = 2k + n \ln(RSS) - n \ln(n)

n: number of data points
RSS: residual sum of squares, RSS = \sum_{i=1}^{n} \varepsilon_i^2

If we fit using \chi^2:

AIC_{\chi^2} = 2k + \chi^2

If we fit using R²:

AIC_{R^2} = 2k + \ln\left(\frac{1 - R^2}{n}\right)

At small sample size we should use the following correction:

AIC_C = AIC + \frac{2k(k + 1)}{n - k - 2}

[Figure: the two scatter plots of Y against X with the four fitted functions, each annotated with its corrected AIC]

AIC_{C,R^2} = 2k + \ln\left(\frac{1 - R^2}{n}\right) + \frac{2k(k + 1)}{n - k - 2}, with n = 19

y = 16.113x^2 - 185.67x + 533.84 (R² = 0.8953, k = 3):
AIC_{C,R^2} = 6 + \ln\left(\frac{1 - 0.8953}{19}\right) + \frac{6(3 + 1)}{19 - 3 - 2} = 2.51

y = 45.502 e^{0.1932x} (R² = 0.5286, k = 2):
AIC_{C,R^2} = 4 + \ln\left(\frac{1 - 0.5286}{19}\right) + \frac{4(2 + 1)}{19 - 2 - 2} = 1.10

y = -0.375x^4 + 14.462x^3 - 164.12x^2 + 609.02x - 356.84 (R² = 0.9607, k = 5):
AIC_{C,R^2} = 10 + \ln\left(\frac{1 - 0.9607}{19}\right) + \frac{10(5 + 1)}{19 - 5 - 2} = 8.82

y = 90.901x (R² = 0.5747, k = 1):
AIC_{C,R^2} = 2 + \ln\left(\frac{1 - 0.5747}{19}\right) + \frac{2(1 + 1)}{19 - 1 - 2} = -1.56

A single outlier makes the difference. The single high residual makes the exponential fit worse.
We get the surprising result that the seemingly worst fitting model appears to be the preferred one.

Significant difference in model fit

\Delta AIC = |AIC_1 - AIC_2|

Approximately, \Delta AIC is statistically significant in favor of the model with the smaller AIC at the 5% error benchmark if |\Delta AIC| > 2.

Corrected AIC of the four models (n = 19):
y = 16.113x^2 - 185.67x + 533.84 (k = 3): 2.51
y = 45.502 e^{0.1932x} (k = 2): 1.10
y = -0.375x^4 + 14.462x^3 - 164.12x^2 + 609.02x - 356.84 (k = 5): 8.82
y = 90.901x (k = 1): -1.56

\Delta AIC = |-1.56 - 1.10| = 2.66

The last model is significantly (5% level) the best.

Stepwise variable elimination
Standardized coefficients (b-values) are equivalents of correlation coefficients. They should not have absolute values above 1.
Such values point to too high correlation between the predictor variables (collinearity).
Collinearity disturbs any regression model and has to be eliminated prior to analysis.
Highly correlated variables essentially contain the same information.
Correlations of less than 0.7 can be tolerated.
Hence, first check the matrix of correlation coefficients.
Eliminate variables that do not add information.

The final model
[Table: regression output of the final model]
The probability levels shown are simple test-wise levels. We still have to correct for multiple testing.
The best model is not always the one with the lowest AIC or the highest R².

Bonferroni correction

p(I) = \alpha \Rightarrow p(\neg I) = 1 - \alpha

p(\neg I_n) = (1 - \alpha)^n

p(I_n) = 1 - (1 - \alpha)^n

p(I_n) \approx 1 - (1 - n\alpha) = n\alpha

\alpha \approx \frac{p(I_n)}{n}

To get an experiment-wise error rate of 0.05, our test-wise error rates have to be less than 0.05/n.
Species richness is positively correlated with island area and negatively with
soil humidity.
Island ensembles differ in species richness and abundance.
A simple ANOVA does not detect any difference
Species richness depends on environmental factors that may differ
between island ensembles.

Analysis of covariance (ANCOVA)
ANCOVA is the combination of multiple regression and analysis of variance.
First we perform a regression analysis and use the residuals of the full model as entries in the ANOVA.
ANCOVA is the ANOVA on regression residuals.
The metrically scaled variables serve as covariates.
[Figure: observed value plotted against predicted value; y = 0.9377x + 2.6159, R² = 0.843]
Sites with very high positive residuals are particularly species rich even after controlling for environmental factors.
These are ecological hot spots.
Regression analysis serves to identify such hot spots.
We use the regression residuals for further analysis.

ANCOVA
The predictor columns (Area, Traps, Light, Temperature, SoilHumidity, SoilDispersion, SoilFertility, SoilAcidity, OrganicMatterContent, Disturbance, Species, Individuals, Individuals/trap) repeat the summary statistics table in rounded form. The model predictions and residuals are on the scale of ln(Species).

Island | IslandEnsemble | Species | Model | Residuals
GórnaE | MNB | 33 | 3.569704 | -0.073
Koń | MNB | 25 | 3.468026 | -0.249
Kopanka | MNB | 34 | 3.500137 | 0.026
KrólewskiOstrów | MNB | 51 | 3.963326 | -0.032
Maleńka | MNB | 6 | 1.854909 | -0.063
MałaWierzba | MNB | 27 | 3.173658 | 0.122
KopankaN | MNB | 32 | 3.391454 | 0.074
Ośrodek | MNB | 22 | 2.977121 | 0.114
Piaseczna | MNB | 38 | 3.719825 | -0.082
Ruciane | MNB | 28 | 3.823174 | -0.491
Mikołajki | MNB | 75 | 4.003134 | 0.314
Śluza | MNB | 19 | 3.079715 | -0.135
GórnaW | MNB | 29 | 3.388073 | -0.021
Wierzba | MNB | 47 | 3.29098 | 0.559
Wygryńska | MNB | 43 | 3.754503 | 0.007
Bryzgiel | Wigry | 21 | 3.076486 | -0.032
Bryzgiel | Wigry | 46 | 3.90665 | -0.078
BrzozowaL | Wigry | 31 | 3.623234 | -0.189
BrzozowaP | Wigry | 30 | 3.568952 | -0.168
CimochowskiGrądzikC | Wigry | 25 | 3.060985 | 0.158
CimochowskiGrądzikN | Wigry | 31 | 3.127977 | 0.306
CimochowskiGrądzikS | Wigry | 25 | 3.322735 | -0.104
Kamień | Wigry | 60 | 4.025703 | 0.069
Krowa | Wigry | 34 | 3.593559 | -0.067
MysiaWigry | Wigry | 49 | 3.761767 | 0.130
Ordów | Wigry | 64 | 4.248705 | -0.090
Ostrów | Wigry | 93 | 4.584715 | -0.052
Rośków | Wigry | 24 | 3.041533 | 0.137
Walędziak | Wigry | 28 | 3.390937 | -0.059
WysokiWęgieł | Wigry | 57 | 3.955794 | 0.087

Species richness does not differ between island ensembles.

• The degree of disturbance (human impact) influences species richness.
[Figure: observed value plotted against predicted value; y = 0.9243x + 3.3687, R² = 0.8364]
Species richness of spiders on lake islands appears to be independent of the degree of disturbance.

How does abundance depend on environmental factors?
The full model and stepwise variable elimination
All coefficients are highly significant!
All standardized coefficients are above 1. This points to too high collinearity.
We further eliminate uninformative variables.
Abundance does not significantly depend on environmental variables.

How does abundance depend on the degree of disturbance?
Abundance of spiders on lake islands appears to be independent of the degree of disturbance.
```
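The partition SS_total = SS_between + SS_within from the first slides can be checked on the light-intensity and biomass table. The following is a minimal sketch in plain Python; the function name `one_way_anova` and the dictionary layout are illustrative choices, not part of the slides.

```python
# One-way ANOVA by hand for the plot / light intensity / biomass table.
# Names (biomass, one_way_anova) are illustrative assumptions.
biomass = {
    "A": [10, 11, 7, 9, 9],
    "B": [4, 6, 3, 8, 5, 9, 5],
    "C": [15, 10, 4, 11, 6, 9],
    "D": [9, 5, 7, 3, 5, 6, 7],
}


def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom and F for k groups."""
    values = [x for g in groups.values() for x in g]
    n_total, k = len(values), len(groups)
    grand_mean = sum(values) / n_total
    means = {name: sum(g) / len(g) for name, g in groups.items()}

    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_between = sum(len(g) * (means[name] - grand_mean) ** 2
                     for name, g in groups.items())
    ss_within = sum((x - means[name]) ** 2
                    for name, g in groups.items() for x in g)

    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_total": ss_total, "ss_between": ss_between,
            "ss_within": ss_within, "df_between": df_between,
            "df_within": df_within, "F": f_value}


result = one_way_anova(biomass)
for key, value in result.items():
    print(f"{key}: {value:.4f}")
```

The p-value is left out on purpose: it needs the F distribution, which is not in the Python standard library.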
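The repetitive design can be sketched the same way. The code below applies the formulas for SS_between, SS_within, SS_treat and SS_error to the morning, afternoon and night values as printed on the slide. The printed values are rounded to integers, so the sums differ slightly from the slide (897.0, 2288.0, 1437.6, F = 14.324). The function name is an illustrative assumption, and the closed-form p-value is valid only because the treatment has 2 degrees of freedom.

```python
# Repeated-measures ANOVA by hand (n persons, k treatments).
# Data are the rounded integers printed on the slide, so results are
# close to, but not identical with, the slide values.
morning = [104, 104, 84, 97, 81, 108, 89, 82, 84, 95]
afternoon = [119, 96, 114, 114, 111, 116, 96, 96, 129, 120]
night = [102, 95, 96, 92, 95, 88, 93, 85, 91, 84]


def repeated_measures_anova(columns):
    k, n = len(columns), len(columns[0])
    grand = sum(sum(c) for c in columns) / (n * k)
    person_means = [sum(c[i] for c in columns) / k for i in range(n)]
    treat_means = [sum(c) / n for c in columns]

    ss_total = sum((c[i] - grand) ** 2 for c in columns for i in range(n))
    ss_between = k * sum((p - grand) ** 2 for p in person_means)
    ss_within = sum((columns[j][i] - person_means[i]) ** 2
                    for j in range(k) for i in range(n))
    ss_treat = n * sum((t - grand) ** 2 for t in treat_means)
    ss_error = sum((columns[j][i] - person_means[i] - treat_means[j] + grand) ** 2
                   for j in range(k) for i in range(n))

    df_treat, df_error = k - 1, (n - 1) * (k - 1)
    f_value = (ss_treat / df_treat) / (ss_error / df_error)
    # Upper tail of F(2, df_error) has the closed form (1 + 2F/df)^(-df/2);
    # this shortcut does not hold for other numerator degrees of freedom.
    p_value = (1 + 2 * f_value / df_error) ** (-df_error / 2) if df_treat == 2 else None
    return {"ss_total": ss_total, "ss_between": ss_between,
            "ss_within": ss_within, "ss_treat": ss_treat,
            "ss_error": ss_error, "F": f_value, "p": p_value}


rm = repeated_measures_anova([morning, afternoon, night])
for key, value in rm.items():
    print(f"{key}: {value:.4f}")
```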
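Moran's I with the distance weights w_ij = 1/(1 + d_ij)^2 and a Monte Carlo significance test can be sketched as follows. The six sites on a line and their values are hypothetical illustration data, not the spider data. Two further assumptions are made: pairs with i = j are excluded (w_ii = 0), and the one-sided permutation p-value counts shuffles with an I at least as large as the observed one.

```python
import math
import random


def morans_i(values, coords):
    """Moran's I with weights w_ij = 1 / (1 + d_ij)^2, w_ii = 0 (assumption)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    weight_sum = 0.0
    cross = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 / (1.0 + math.dist(coords[i], coords[j])) ** 2
            weight_sum += w
            cross += w * z[i] * z[j]
    return (n / weight_sum) * cross / sum(zi * zi for zi in z)


def permutation_p(values, coords, n_perm=999, seed=1):
    """One-sided Monte Carlo p-value: share of shuffles with I >= observed."""
    rng = random.Random(seed)
    observed = morans_i(values, coords)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, coords) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical example: six sites on a line with a steady gradient.
coords = [(float(x), 0.0) for x in range(6)]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

i_obs = morans_i(values, coords)
i_expected = -1.0 / (len(values) - 1)
p_mc = permutation_p(values, coords)
print(f"I = {i_obs:.4f}, E0(I) = {i_expected:.4f}, p = {p_mc:.3f}")
```

A value of I above E_0(I) = -1/(N - 1) indicates that neighbouring sites are more similar than expected under random placement.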
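The corrected AIC values of the four fitted functions can be reproduced from R², k and n = 19 alone. This sketch follows the slide's formula, including its correction term with n - k - 2 in the denominator; many references instead use n - k - 1, which would change the numbers. The function and dictionary names are illustrative.

```python
import math


def aicc_from_r2(r2, k, n):
    """AIC_C = 2k + ln((1 - R^2) / n) + 2k(k + 1) / (n - k - 2), as on the slide."""
    aic = 2 * k + math.log((1.0 - r2) / n)
    return aic + 2 * k * (k + 1) / (n - k - 2)


n_points = 19
models = {
    "quadratic": (0.8953, 3),
    "exponential": (0.5286, 2),
    "quartic": (0.9607, 5),
    "linear": (0.5747, 1),
}

scores = {name: aicc_from_r2(r2, k, n_points) for name, (r2, k) in models.items()}
best = min(scores, key=scores.get)
runner_up = sorted(scores, key=scores.get)[1]
delta = abs(scores[best] - scores[runner_up])

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:.2f}")
print(f"best: {best}, delta AIC to {runner_up}: {delta:.2f}")
```

With the rounded R² values from the plots, the linear model scores about -1.55 rather than the slide's -1.56; the ranking and the conclusion |ΔAIC| > 2 are unchanged.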
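"ANCOVA is the ANOVA on regression residuals" can be illustrated with the residual column of the ANCOVA table: the first 15 residuals belong to the MNB ensemble and the last 15 to Wigry. The sketch below runs a one-way ANOVA on these two groups. The helper name `f_ratio` is an illustrative assumption, and no p-value is computed.

```python
# ANOVA on regression residuals, grouped by island ensemble.
residuals = {
    "MNB": [-0.073, -0.249, 0.026, -0.032, -0.063, 0.122, 0.074, 0.114,
            -0.082, -0.491, 0.314, -0.135, -0.021, 0.559, 0.007],
    "Wigry": [-0.032, -0.078, -0.189, -0.168, 0.158, 0.306, -0.104, 0.069,
              -0.067, 0.130, -0.090, -0.052, 0.137, -0.059, 0.087],
}


def f_ratio(groups):
    """F = (SS_between / (k - 1)) / (SS_within / (N - k))."""
    values = [x for g in groups.values() for x in g]
    n_total, k = len(values), len(groups)
    grand_mean = sum(values) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        mean = sum(g) / len(g)
        ss_between += len(g) * (mean - grand_mean) ** 2
        ss_within += sum((x - mean) ** 2 for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


f_ensembles = f_ratio(residuals)
print(f"F for ensemble effect on residuals: {f_ensembles:.5f}")
```

The mean residuals of the two ensembles are almost identical, so F is far below 1, in line with the statement that species richness does not differ between island ensembles once the covariates are accounted for.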