### MJK_ch5

```Autoregressive Integrated
Moving Average (ARIMA) models
- Forecasting techniques based on exponential smoothing
- General assumption for the above models: time series data are represented as the sum of two distinct components (deterministic & random)
- Random noise: generated through independent shocks to the process
- In practice: successive observations show serial dependence
ARIMA Models
- ARIMA models are also known as the Box-Jenkins methodology
- Very popular: suitable for almost all time series & often generate more accurate forecasts than other methods
- Limitations:
  If there is not enough data, they may not be better at forecasting than the decomposition or exponential smoothing techniques.
  Recommended number of observations: at least 30-50
- Weak stationarity is required
- Observations must be equally spaced in time
Linear Models for Time series
Linear Filter
$y_t = L(x_t) = \sum_{i=-\infty}^{\infty} \psi_i x_{t-i}, \quad t = \ldots, -1, 0, 1, \ldots$
- It is a process that converts the input $x_t$ into the output $y_t$
- The conversion involves past, current and future values of the input in the form of a summation with different weights
- Time invariant: the weights $\psi_i$ do not depend on time
- Physically realizable: the output is a linear function of the current and past values of the input
- Stable if $\sum_{i=-\infty}^{\infty} |\psi_i| < \infty$
In linear filters: stationarity of the input time series is also reflected in the output
Stationarity
A time series $y_t$ is weakly stationary if its mean $E(y_t) = \mu$ is constant and its autocovariance $Cov(y_t, y_{t+k}) = \gamma_y(k)$ is finite and depends only on the lag $k$, not on $t$.
A time series that fulfills these conditions tends to return to its mean and fluctuate around this mean with constant variance.
Note: Strict stationarity requires, in addition to the conditions of weak stationarity, that the time series fulfill further conditions about its distribution, including skewness, kurtosis etc.
Determine stationarity
- Take snapshots of the process at different time points & observe its behavior: if it is similar over time, the time series is stationary
- A strong & slowly dying ACF suggests deviations from stationarity
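Infinite Moving Average
Input $x_t$: stationary
Output $y_t$: stationary, with
$y_t = \sum_{i=-\infty}^{\infty} \psi_i x_{t-i}$
$E(y_t) = \mu_y = \sum_{i=-\infty}^{\infty} \psi_i \mu_x$
&
$Cov(y_t, y_{t+k}) = \gamma_y(k) = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} \psi_i \psi_j \gamma_x(i - j + k)$
THEN, the linear process with white noise time series $\varepsilon_t$
$y_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i}$
is stationary.
$\varepsilon_t$: independent random shocks, with $E(\varepsilon_t) = 0$ & $\gamma_\varepsilon(h) = \begin{cases} \sigma^2, & h = 0 \\ 0, & h \neq 0 \end{cases}$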
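Autocovariance function
$\gamma_y(k) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \psi_i \psi_j \gamma_\varepsilon(i - j + k) = \sigma^2 \sum_{i=0}^{\infty} \psi_i \psi_{i+k}$
Linear Process
$y_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i}$
$= \mu + \psi_0 \varepsilon_t + \psi_1 \varepsilon_{t-1} + \psi_2 \varepsilon_{t-2} + \cdots$
$= \mu + \left( \sum_{i=0}^{\infty} \psi_i B^i \right) \varepsilon_t$
$= \mu + \Psi(B) \varepsilon_t$
where $B$ is the backshift operator, $B^i \varepsilon_t = \varepsilon_{t-i}$
Infinite moving average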
The infinite moving average serves as a general class of models for any stationary time series
THEOREM (Wold 1938):
Any nondeterministic weakly stationary time series $y_t$ can be represented as
$y_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i}$, where $\sum_{i=0}^{\infty} \psi_i^2 < \infty$
INTERPRETATION
A stationary time series can be seen as the weighted sum of the present and past disturbances
Infinite moving average:
- Impractical to estimate the infinitely many weights
- Useless in practice except for special cases:
i. Finite order moving average (MA) models: weights set to 0, except for a finite number of weights
ii. Finite order autoregressive (AR) models: weights are generated using only a finite number of parameters
iii. A mixture of finite order autoregressive & moving average models (ARMA)
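Finite Order Moving Average (MA) process
Moving average process of order q, MA(q)
$\psi_0 = 1$; only $q$ further weights are not set to 0
$y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}$, $\varepsilon_t$ white noise
MA(q): always stationary regardless of the values of the weights
$y_t = \mu + (1 - \theta_1 B - \cdots - \theta_q B^q) \varepsilon_t$
$= \mu + \left( 1 - \sum_{i=1}^{q} \theta_i B^i \right) \varepsilon_t$
$= \mu + \Theta(B) \varepsilon_t$
where $\Theta(B) = 1 - \sum_{i=1}^{q} \theta_i B^i$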
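$\varepsilon_t$ white noise
Expected value of MA(q)
$E(y_t) = E(\mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}) = \mu$
Variance of MA(q)
$Var(y_t) = \gamma_y(0) = Var(\mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}) = \sigma^2 (1 + \theta_1^2 + \cdots + \theta_q^2)$
Autocovariance of MA(q)
$\gamma_y(k) = E[(\varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q})(\varepsilon_{t+k} - \theta_1 \varepsilon_{t+k-1} - \cdots - \theta_q \varepsilon_{t+k-q})]$
$= \begin{cases} \sigma^2 (-\theta_k + \theta_1 \theta_{k+1} + \cdots + \theta_{q-k} \theta_q), & k = 1, 2, \ldots, q \\ 0, & k > q \end{cases}$
Autocorrelation of MA(q)
$\rho_y(k) = \frac{\gamma_y(k)}{\gamma_y(0)} = \begin{cases} \frac{-\theta_k + \theta_1 \theta_{k+1} + \cdots + \theta_{q-k} \theta_q}{1 + \theta_1^2 + \cdots + \theta_q^2}, & k = 1, 2, \ldots, q \\ 0, & k > q \end{cases}$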
ACF function:
Helps identify the MA model & its appropriate order, as it cuts off after lag q
Real applications:
the sample ACF r(k) is not always exactly zero after lag q; it becomes very small in absolute value after lag q
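First Order Moving Average Process MA(1)
q=1
$y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1}$, $\varepsilon_t$ white noise
Autocovariance of MA(1)
$\gamma_y(0) = \sigma^2 (1 + \theta_1^2)$
$\gamma_y(1) = -\theta_1 \sigma^2$
$\gamma_y(k) = 0, \quad k > 1$
Autocorrelation of MA(1)
$\rho_y(1) = \frac{-\theta_1}{1 + \theta_1^2}$
$\rho_y(k) = 0, \quad k > 1$
$|\rho_y(1)| = \frac{|\theta_1|}{1 + \theta_1^2} \le \frac{1}{2}$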
- Mean & variance: stable
- Positive autocorrelation: short runs where successive observations tend to follow each other
- Negative autocorrelation: observations oscillate successively
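Second Order Moving Average MA(2) process
$y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2}$
$= \mu + (1 - \theta_1 B - \theta_2 B^2) \varepsilon_t$
Autocovariance of MA(2)
$\gamma_y(0) = \sigma^2 (1 + \theta_1^2 + \theta_2^2)$
$\gamma_y(1) = \sigma^2 (-\theta_1 + \theta_1 \theta_2)$
$\gamma_y(2) = \sigma^2 (-\theta_2)$
$\gamma_y(k) = 0, \quad k > 2$
Autocorrelation of MA(2)
$\rho_y(1) = \frac{-\theta_1 + \theta_1 \theta_2}{1 + \theta_1^2 + \theta_2^2}$
$\rho_y(2) = \frac{-\theta_2}{1 + \theta_1^2 + \theta_2^2}$
$\rho_y(k) = 0, \quad k > 2$
The sample ACF cuts off after lag 2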
Finite Order Autoregressive Process
- Wold's theorem: infinite number of weights, not helpful in modeling & forecasting
- Finite order MA process: estimate a finite number of weights, set the others equal to zero
  The oldest disturbances are obsolete for the next observation; only a finite number of disturbances contribute to the current value of the time series
- To take into account all the disturbances of the past: use autoregressive models, which imply infinitely many weights that follow a distinct pattern governed by a small number of parameters
First Order Autoregressive Process, AR(1)
$y_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i}$
$= \mu + \left( \sum_{i=0}^{\infty} \psi_i B^i \right) \varepsilon_t$
$= \mu + \Psi(B) \varepsilon_t$, where $\Psi(B) = \sum_{i=0}^{\infty} \psi_i B^i$
Assume: the contributions of disturbances far in the past are small compared to those of the more recent disturbances that the process has experienced
Reflect the diminishing magnitudes of the contributions of past disturbances through a set of infinitely many weights of descending magnitude, such as
$\psi_i = \phi^i, \quad |\phi| < 1$
Exponential decay pattern
The weights on the disturbances, starting from the current disturbance and going back in the past:
$1, \phi, \phi^2, \phi^3, \ldots$
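$y_t = \mu + \varepsilon_t + \phi \varepsilon_{t-1} + \phi^2 \varepsilon_{t-2} + \cdots$
$= \mu + \sum_{i=0}^{\infty} \phi^i \varepsilon_{t-i}$
$y_{t-1} = \mu + \varepsilon_{t-1} + \phi \varepsilon_{t-2} + \phi^2 \varepsilon_{t-3} + \cdots$
THEN
$y_t = \mu + \varepsilon_t + \phi \varepsilon_{t-1} + \phi^2 \varepsilon_{t-2} + \cdots$
$= \mu - \phi \mu + \phi y_{t-1} + \varepsilon_t$
$= \delta + \phi y_{t-1} + \varepsilon_t$, where $\delta = (1 - \phi) \mu$
First order autoregressive process AR(1)
WHY AUTOREGRESSIVE? $y_t$ is regressed on its own previous value $y_{t-1}$
AR(1) stationary if $|\phi| < 1 \Rightarrow \sum_{i=0}^{\infty} |\psi_i| < \infty$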
Mean AR(1)
$E(y_t) = \mu = \frac{\delta}{1 - \phi}$
Autocovariance function AR(1)
$\gamma(k) = \sigma^2 \phi^k \frac{1}{1 - \phi^2}, \quad k = 0, 1, 2, \ldots$
Autocorrelation function AR(1)
$\rho(k) = \frac{\gamma(k)}{\gamma(0)} = \phi^k, \quad k = 0, 1, 2, \ldots$
The ACF for a stationary AR(1) process has an exponential decay form
Observe:
- The observations exhibit up/down movements
Second Order Autoregressive Process, AR(2)
$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t$, where $\delta = (1 - \phi_1 - \phi_2) \mu$
This model can be represented in the infinite MA form, which provides the conditions of stationarity for $y_t$ in terms of $\phi_1$ & $\phi_2$
WHY?
1. Infinite MA
$y_t - \phi_1 y_{t-1} - \phi_2 y_{t-2} = \delta + \varepsilon_t$
$y_t - \phi_1 B y_t - \phi_2 B^2 y_t = \delta + \varepsilon_t$
$(1 - \phi_1 B - \phi_2 B^2) y_t = \delta + \varepsilon_t$
$\Phi(B) y_t = \delta + \varepsilon_t$
Apply $\Phi(B)^{-1}$
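$y_t = \Phi(B)^{-1} \delta + \Phi(B)^{-1} \varepsilon_t$
$= \mu + \Psi(B) \varepsilon_t$
$= \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i}$
$= \mu + \sum_{i=0}^{\infty} \psi_i B^i \varepsilon_t$
where $\mu = \Phi(B)^{-1} \delta$ & $\Psi(B) = \sum_{i=0}^{\infty} \psi_i B^i = \Phi(B)^{-1}$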
Calculate the weights $\psi_i$
$\Phi(B) \Psi(B) = 1$
$(1 - \phi_1 B - \phi_2 B^2)(\psi_0 + \psi_1 B + \psi_2 B^2 + \cdots) = 1$
$\psi_0 + (\psi_1 - \phi_1 \psi_0) B + (\psi_2 - \phi_1 \psi_1 - \phi_2 \psi_0) B^2 + \cdots + (\psi_j - \phi_1 \psi_{j-1} - \phi_2 \psi_{j-2}) B^j + \cdots = 1$
We need
$\psi_0 = 1$
$\psi_1 - \phi_1 \psi_0 = 0$
$\psi_j - \phi_1 \psi_{j-1} - \phi_2 \psi_{j-2} = 0$, for all $j = 2, 3, \ldots$
Solutions
The $\psi_j$ satisfy the second-order linear difference equation
The solution: in terms of the 2 roots $m_1$ and $m_2$ of
$m^2 - \phi_1 m - \phi_2 = 0$
$m_1, m_2 = \frac{\phi_1 \pm \sqrt{\phi_1^2 + 4 \phi_2}}{2}$
AR(2) stationary: if $|m_1|, |m_2| < 1$, then $\sum_{i=0}^{\infty} |\psi_i| < \infty$
Condition of stationarity for complex conjugates $a \pm ib$: $\sqrt{a^2 + b^2} < 1$
AR(2) infinite MA representation: $|m_1|, |m_2| < 1$
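Mean
$E(y_t) = \delta + \phi_1 E(y_{t-1}) + \phi_2 E(y_{t-2})$
$\mu = \delta + \phi_1 \mu + \phi_2 \mu$
$\mu = \frac{\delta}{1 - \phi_1 - \phi_2}$
For $1 - \phi_1 - \phi_2 = 0$, $m = 1$ is a root of the associated polynomial: nonstationarity
Autocovariance function
$\gamma(k) = cov(y_t, y_{t-k})$
$= cov(\delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t, y_{t-k})$
$= \phi_1 cov(y_{t-1}, y_{t-k}) + \phi_2 cov(y_{t-2}, y_{t-k}) + cov(\varepsilon_t, y_{t-k})$
$= \phi_1 \gamma(k-1) + \phi_2 \gamma(k-2) + \begin{cases} \sigma^2, & k = 0 \\ 0, & k > 0 \end{cases}$
For k=0: $\gamma(0) = \phi_1 \gamma(1) + \phi_2 \gamma(2) + \sigma^2$
For k>0: $\gamma(k) = \phi_1 \gamma(k-1) + \phi_2 \gamma(k-2), \quad k = 1, 2, \ldots$ (Yule-Walker equations)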
Autocorrelation function
$\rho(k) = \phi_1 \rho(k-1) + \phi_2 \rho(k-2), \quad k = 1, 2, \ldots$
Solutions
A. Solve the Yule-Walker equations recursively
$\rho(1) = \phi_1 \rho(0) + \phi_2 \rho(-1) = \phi_1 + \phi_2 \rho(1) \Rightarrow \rho(1) = \frac{\phi_1}{1 - \phi_2}$
$\rho(2) = \phi_1 \rho(1) + \phi_2$
$\rho(3) = \phi_1 \rho(2) + \phi_2 \rho(1)$
$\vdots$
B. General solution
Obtain it through the roots $m_1$ & $m_2$ associated with the polynomial
$m^2 - \phi_1 m - \phi_2 = 0$
Case I: $m_1, m_2$ distinct real roots
$\rho(k) = c_1 m_1^k + c_2 m_2^k, \quad k = 0, 1, 2, \ldots$
$c_1, c_2$ constants: can be obtained from $\rho(0)$, $\rho(1)$
stationarity: $|m_1|, |m_2| < 1$
ACF form: mixture of 2 exponential decay terms
e.g. AR(2) model
It can be seen as an adjusted AR(1) model for which a single exponential decay expression, as in the AR(1), is not enough to describe the pattern in the ACF; thus, an additional decay expression is added by introducing the second lag term $y_{t-2}$
Case II: $m_1, m_2$ complex conjugates in the form $a \pm ib$
$\rho(k) = R^k [c_1 \cos(\lambda k) + c_2 \sin(\lambda k)], \quad k = 0, 1, 2, \ldots$
$R = |m_i| = \sqrt{a^2 + b^2}$
$\cos(\lambda) = a / R$
$\sin(\lambda) = b / R$
$a \pm ib = R[\cos(\lambda) \pm i \sin(\lambda)]$
$c_1, c_2$: particular constants
ACF form: damped sinusoid; damping factor $R$; frequency $\lambda$; period $2\pi / \lambda$
Case III: one real root $m_0$; $m_1 = m_2 = m_0$
$\rho(k) = (c_1 + c_2 k) m_0^k, \quad k = 0, 1, 2, \ldots$
ACF form: exponential decay pattern
AR(2) process: $y_t = 4 + 0.4 y_{t-1} + 0.5 y_{t-2} + \varepsilon_t$
Roots of the polynomial: real
ACF form: mixture of 2 exponential decay terms
AR(2) process: $y_t = 4 + 0.8 y_{t-1} - 0.5 y_{t-2} + \varepsilon_t$
Roots of the polynomial: complex conjugates
ACF form: damped sinusoid behavior
General Autoregressive Process, AR(p)
Consider a pth order AR model
$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$, $\varepsilon_t$ white noise
or
$\Phi(B) y_t = \delta + \varepsilon_t$,
where $\Phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$
AR(p) stationary
If the roots of the polynomial
$m^p - \phi_1 m^{p-1} - \phi_2 m^{p-2} - \cdots - \phi_p = 0$
are less than 1 in absolute value
AR(p) absolutely summable infinite MA representation
Under the previous condition
$y_t = \mu + \Psi(B) \varepsilon_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i}$
$\Psi(B) = \Phi(B)^{-1}$ & $\sum_{i=0}^{\infty} |\psi_i| < \infty$
Weights of the random shocks
$\Phi(B) \Psi(B) = 1$
as
$\psi_j = 0, \quad j < 0$
$\psi_0 = 1$
$\psi_j - \phi_1 \psi_{j-1} - \phi_2 \psi_{j-2} - \cdots - \phi_p \psi_{j-p} = 0$, for all $j = 1, 2, \ldots$
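For stationary AR(p)
$E(y_t) = \mu = \frac{\delta}{1 - \phi_1 - \phi_2 - \cdots - \phi_p}$
$\gamma(k) = Cov(y_t, y_{t-k})$
$= Cov(\delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t, y_{t-k})$
$= \sum_{i=1}^{p} \phi_i \gamma(k-i) + \begin{cases} \sigma^2, & k = 0 \\ 0, & k > 0 \end{cases}$
$\gamma(0) = \sum_{i=1}^{p} \phi_i \gamma(i) + \sigma^2$
$\Rightarrow \gamma(0) \left[ 1 - \sum_{i=1}^{p} \phi_i \rho(i) \right] = \sigma^2$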
ACF
$\rho(k) = \sum_{i=1}^{p} \phi_i \rho(k-i), \quad k = 1, 2, \ldots$ (pth order linear difference equations)
AR(p):
- satisfies the Yule-Walker equations
- ACF can be found from the p roots of the associated polynomial
  e.g. distinct & real roots: $\rho(k) = c_1 m_1^k + c_2 m_2^k + \cdots + c_p m_p^k$
- In general the roots will not be real
  ACF: mixture of exponential decay and damped sinusoid
ACF
- MA(q) process: useful tool for identifying the order of the process; cuts off after lag q
- AR(p) process: mixture of exponential decay & damped sinusoid expressions; fails to provide information about the order of the AR process
Partial Autocorrelation Function
Consider:
- three random variables X, Y, Z &
- Simple regression of X on Z & Y on Z
$\hat{X} = a_1 + b_1 Z$, where $b_1 = \frac{Cov(Z, X)}{Var(Z)}$
$\hat{Y} = a_2 + b_2 Z$, where $b_2 = \frac{Cov(Z, Y)}{Var(Z)}$
The errors are obtained from
$X^* = X - \hat{X} = X - (a_1 + b_1 Z)$
$Y^* = Y - \hat{Y} = Y - (a_2 + b_2 Z)$
Partial correlation between X & Y after adjusting for Z:
The correlation between X* & Y*
$corr(X^*, Y^*) = corr(X - \hat{X}, Y - \hat{Y})$
Partial correlation can be seen as the correlation between two variables after being adjusted for a common factor that affects them
Partial autocorrelation function (PACF) between $y_t$ & $y_{t-k}$
The autocorrelation between $y_t$ & $y_{t-k}$ after adjusting for $y_{t-1}, y_{t-2}, \ldots, y_{t-k+1}$
AR(p) process: the PACF between $y_t$ & $y_{t-k}$ for $k > p$ should equal zero
Consider
- a stationary time series $y_t$; not necessarily an AR process
- For any fixed value k, the Yule-Walker equations for the ACF of an AR(p) process
$\rho(j) = \sum_{i=1}^{k} \phi_{ik} \rho(j - i), \quad j = 1, 2, \ldots, k$
$\rho(1) = \phi_{1k} + \phi_{2k} \rho(1) + \cdots + \phi_{kk} \rho(k-1)$
$\rho(2) = \phi_{1k} \rho(1) + \phi_{2k} + \cdots + \phi_{kk} \rho(k-2)$
$\vdots$
$\rho(k) = \phi_{1k} \rho(k-1) + \phi_{2k} \rho(k-2) + \cdots + \phi_{kk}$
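Matrix notation
$\begin{bmatrix} 1 & \rho(1) & \rho(2) & \cdots & \rho(k-1) \\ \rho(1) & 1 & \rho(1) & \cdots & \rho(k-2) \\ \rho(2) & \rho(1) & 1 & \cdots & \rho(k-3) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho(k-1) & \rho(k-2) & \rho(k-3) & \cdots & 1 \end{bmatrix} \begin{bmatrix} \phi_{1k} \\ \phi_{2k} \\ \phi_{3k} \\ \vdots \\ \phi_{kk} \end{bmatrix} = \begin{bmatrix} \rho(1) \\ \rho(2) \\ \rho(3) \\ \vdots \\ \rho(k) \end{bmatrix}$
or $P_k \phi_k = \rho_k$
Solutions
$\phi_k = P_k^{-1} \rho_k$
For any given k, k = 1, 2, …, the last coefficient $\phi_{kk}$ is called the partial autocorrelation coefficient of the process at lag k
AR(p) process:
$\phi_{kk} = 0, \quad k > p$
Identify the order of an AR process by using the PACF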
Example processes and their sample PACF (figure panels):
- MA(1), constant 40, $|\theta_1| = 0.8$: decay pattern
- MA(2), constant 40, $|\theta_1| = 0.7$, $|\theta_2| = 0.28$: decay pattern
- AR(1), constant 8, $|\phi_1| = 0.8$: cuts off after 1st lag
- AR(2), constant 8, $|\phi_1| = 0.8$, $|\phi_2| = 0.5$: cuts off after 2nd lag
Invertibility of MA models
Invertible moving average process:
The MA(q) process
$y_t = \mu + \Theta(B) \varepsilon_t$
is invertible if it has an absolutely summable infinite AR representation
It can be shown:
The infinite AR representation for MA(q)
$y_t = \delta + \sum_{i=1}^{\infty} \pi_i y_{t-i} + \varepsilon_t, \quad \sum_{i=1}^{\infty} |\pi_i| < \infty$
Obtain $\pi_i$
$(1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q)(1 - \pi_1 B - \pi_2 B^2 - \cdots) = 1$
We need
$\pi_1 + \theta_1 = 0$
$\pi_2 - \theta_1 \pi_1 + \theta_2 = 0$
$\vdots$
$\pi_j - \theta_1 \pi_{j-1} - \cdots - \theta_q \pi_{j-q} = 0$
with $\pi_0 = -1$ & $\pi_j = 0, \; j < 0$
Condition of invertibility
The roots of the associated polynomial
$m^q - \theta_1 m^{q-1} - \theta_2 m^{q-2} - \cdots - \theta_q = 0$
must be less than 1 in absolute value
An invertible MA(q) process can then be written as an infinite AR process
PACF possibly never cuts off
PACF of an MA(q) process is a mixture of exponential decay & damped sinusoid expressions
In model identification, use both sample ACF & sample PACF
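Mixed Autoregressive-Moving Average (ARMA) Process
ARMA(p,q) model
$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q}$
$= \delta + \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t - \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}$
$\Phi(B) y_t = \delta + \Theta(B) \varepsilon_t$, $\varepsilon_t$ white noise
Adjust the exponential decay pattern by adding a few terms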
Stationarity of ARMA(p,q) process
Related to the AR component
ARMA(p,q) is stationary if the roots of the polynomial
$m^p - \phi_1 m^{p-1} - \phi_2 m^{p-2} - \cdots - \phi_p = 0$
are less than one in absolute value
ARMA(p,q) has an infinite MA representation
$y_t = \mu + \sum_{i=0}^{\infty} \psi_i \varepsilon_{t-i} = \mu + \Psi(B) \varepsilon_t, \quad \Psi(B) = \Phi(B)^{-1} \Theta(B)$
Invertibility of ARMA(p,q) process
Invertibility of an ARMA process is related to the MA component
Check through the roots of the polynomial
$m^q - \theta_1 m^{q-1} - \theta_2 m^{q-2} - \cdots - \theta_q = 0$
If the roots are less than 1 in absolute value, then ARMA(p,q) is invertible & has an infinite AR representation
$\Pi(B) y_t = \alpha + \varepsilon_t$
$\alpha = \Theta(B)^{-1} \delta$ & $\Pi(B) = \Theta(B)^{-1} \Phi(B)$
Coefficients:
$\pi_i - \theta_1 \pi_{i-1} - \theta_2 \pi_{i-2} - \cdots - \theta_q \pi_{i-q} = \begin{cases} \phi_i, & i = 1, \ldots, p \\ 0, & i > p \end{cases}$
ARMA(1,1)
Sample ACF & PACF: exponential
decay behavior
Non Stationary Process
No constant level, but homogeneous behavior over time
$y_t$ is homogeneous, non stationary if
- It is not stationary
- Its first difference, $w_t = y_t - y_{t-1} = (1 - B) y_t$, or higher order differences, $w_t = (1 - B)^d y_t$, produce a stationary time series
$y_t$ is an autoregressive integrated moving average process of order p, d, q, ARIMA(p,d,q),
if its dth difference, $w_t = (1 - B)^d y_t$, produces a stationary ARMA(p,q) process
ARIMA(p,d,q)
$\Phi(B) (1 - B)^d y_t = \delta + \Theta(B) \varepsilon_t$
The random walk process ARIMA(0,1,0)
Simplest non-stationary model
$(1 - B) y_t = \delta + \varepsilon_t$
First differencing eliminates serial dependence & yields a white noise process
$y_t = 20 + y_{t-1} + \varepsilon_t$
Evidence of non-stationary process
- Sample ACF: dies out slowly
- Sample PACF: significant at the first lag
- Sample PACF value at lag 1 close to 1
First difference
- Time series plot of $w_t$: stationary
- Sample ACF & PACF: do not show any significant value
- Use ARIMA(0,1,0)
The ARIMA(0,1,1) process
$(1 - B) y_t = \delta + (1 - \theta B) \varepsilon_t$
Infinite AR representation, derived from:
$\pi_i - \theta \pi_{i-1} = \begin{cases} 1, & i = 1 \\ 0, & i > 1 \end{cases}$, with $\pi_0 = -1$
$y_t = \alpha + \sum_{i=1}^{\infty} \pi_i y_{t-i} + \varepsilon_t$
$= \alpha + (1 - \theta)(y_{t-1} + \theta y_{t-2} + \cdots) + \varepsilon_t$
ARIMA(0,1,1) (= IMA(1,1)): expressed as an exponentially weighted moving average (EWMA) of all past values
ARIMA(0,1,1)
- The mean of the process is moving upwards in time
- Sample ACF: dies out relatively slowly
- Sample PACF: 2 significant values at lags 1 & 2
  Possible model: AR(2); check the roots
- First difference looks stationary
- Sample ACF & PACF of the first difference: an MA(1) model would be appropriate, since its ACF cuts off after the first lag & its PACF shows a decay pattern
$y_t = 2 + 0.95 y_{t-1} + \varepsilon_t$
```
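
The AR(2) results in the slides (root condition for stationarity, Yule-Walker recursion for the ACF, and the PACF as the last coefficient of $P_k \phi_k = \rho_k$) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `ar2_roots`, `ar2_is_stationary`, `ar2_acf` and `pacf_from_acf` are illustrative and not part of the slides. It uses the two AR(2) examples from the slides, with coefficients (0.4, 0.5) and (0.8, -0.5).

```python
import numpy as np


def ar2_roots(phi1, phi2):
    """Roots m1, m2 of the associated polynomial m^2 - phi1*m - phi2 = 0."""
    return np.roots([1.0, -phi1, -phi2])


def ar2_is_stationary(phi1, phi2):
    """AR(2) is stationary when both roots are less than 1 in absolute value."""
    return bool(np.all(np.abs(ar2_roots(phi1, phi2)) < 1.0))


def ar2_acf(phi1, phi2, max_lag):
    """Theoretical ACF rho(0..max_lag) from the Yule-Walker recursion."""
    rho = np.empty(max_lag + 1)
    rho[0] = 1.0
    if max_lag >= 1:
        rho[1] = phi1 / (1.0 - phi2)  # rho(1) = phi1 / (1 - phi2)
    for k in range(2, max_lag + 1):
        rho[k] = phi1 * rho[k - 1] + phi2 * rho[k - 2]
    return rho


def pacf_from_acf(rho, max_lag):
    """Partial autocorrelations phi_kk for k = 1..max_lag.

    For each k, solves P_k phi_k = rho_k, where P_k[i, j] = rho(|i - j|),
    and keeps the last coefficient phi_kk.
    """
    pacf = np.empty(max_lag)
    for k in range(1, max_lag + 1):
        lags = np.abs(np.subtract.outer(np.arange(k), np.arange(k)))
        P_k = rho[lags]
        phi_k = np.linalg.solve(P_k, rho[1:k + 1])
        pacf[k - 1] = phi_k[-1]
    return pacf


if __name__ == "__main__":
    for phi1, phi2 in [(0.4, 0.5), (0.8, -0.5)]:
        rho = ar2_acf(phi1, phi2, 8)
        print("phi =", (phi1, phi2))
        print("  roots:", ar2_roots(phi1, phi2))
        print("  stationary:", ar2_is_stationary(phi1, phi2))
        print("  ACF:", np.round(rho, 3))
        print("  PACF:", np.round(pacf_from_acf(rho, 5), 3))
```

For a stationary AR(2), the computed PACF should be numerically zero beyond lag 2, while the ACF decays either as a mixture of exponentials (real roots) or as a damped sinusoid (complex roots). Solving the full system for every k is chosen here for clarity over speed.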