4.2 Linear Stationary Models for Time Series

A stochastic process $ \left\{y_{t}\right\}_{t= -\infty}^{\infty}$ is a model that describes the probability structure of a sequence of observations over time. A time series $ y_{t} $ is a sample realization of a stochastic process that is observed only for a finite number of periods, indexed by $ t= 1, \dots, T$.

Any stochastic process can be partially characterized by the first and second moments of the joint probability distribution: the set of means, $ \mu_t \, =\textrm{E}y_t$, and the set of variances and covariances $ cov(y_t, y_s) \, =\,\textrm{E}(y_t - \mu_t) (y_s - \mu_s), \, \forall t, \, s$. To obtain consistent forecasting methods, the underlying probabilistic structure must be stable over time. A stochastic process is called weakly stationary or covariance stationary when its mean, variance and covariance structure are stable over time, that is:

$\displaystyle \textrm{E}\, y_t$ $\displaystyle =$ $\displaystyle \mu < \infty$ (4.1)
$\displaystyle E (y_t - \mu)^2$ $\displaystyle =$ $\displaystyle \gamma_0 < \infty$ (4.2)
$\displaystyle E (y_t - \mu) (y_s - \mu)$ $\displaystyle =$ $\displaystyle \gamma_{\vert t-s\vert} \qquad\quad \forall t, s, \; t \neq s$ (4.3)

Given condition (4.3), the covariance between $ y_t $ and $ y_s$ depends only on the displacement $ \vert t-s\vert=j$ and it is called autocovariance at lag $ j$, $ \gamma_j$. The set of autocovariances $ \gamma_j$, $ j =0, \pm 1, \pm 2, \dots$, is called the autocovariance function of a stationary process.

The general Autoregressive Moving Average model $ ARMA(p, q)$ is a linear stochastic model where the variable $ y_t $ is modelled in terms of its own past values and a disturbance. It is defined as follows:

$\displaystyle y_t$ $\displaystyle =$ $\displaystyle \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} +
u_t$ (4.4)
$\displaystyle u_t$ $\displaystyle =$ $\displaystyle \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots +
\theta_q \varepsilon_{t-q}$  
$\displaystyle \varepsilon_t$ $\displaystyle \sim$ $\displaystyle i.i.d.(0, \sigma^2_{\varepsilon})$  

where the random variable $ \varepsilon_t $ is called the innovation because it represents the part of the observed variable $ y_t $ that is unpredictable given the past values $ y_{t-1}, y_{t-2}, \dots$.

The general $ ARMA$ model (4.4) assumes that $ y_t $ is the output of a linear filter that transforms the current and past innovations $ \varepsilon_{t-i}, i= 0, 1, \dots, \infty$, that is, $ y_t $ is a linear process. This linearity assumption is based on Wold's decomposition theorem (Wold; 1938), which states that any discrete covariance stationary process $ y_t $ can be expressed as the sum of two uncorrelated processes,

$\displaystyle y_t = d_t + u_t$ (4.5)

where $ d_t$ is purely deterministic and $ u_t$ is a purely indeterministic process that can be written as a linear sum of the innovation process $ \varepsilon_t $:

$\displaystyle u_t = \sum_{i=0}^\infty \psi_i \varepsilon_{t-i} \quad \hbox{with} \quad \psi_0 = 1, \qquad \sum_{i=0}^\infty \psi_i^2 < \infty$ (4.6)

where $ \varepsilon_t $ is a sequence of serially uncorrelated random variables with zero mean and common variance $ \sigma^2_{\varepsilon}$. Condition $ \sum_{i} \psi_i^2 < \infty$ is necessary for stationarity.

The $ ARMA(p, q)$ formulation (4.4) is a finite reparametrization of the infinite representation (4.5)-(4.6) with $ d_t$ constant. It is usually written in terms of the lag operator $ L$ defined by $ L^j y_t=y_{t-j}$, that gives a shorter expression:

$\displaystyle (1 - \phi_1 L - \dots - \phi_p L^p) y_t$ $\displaystyle =$ $\displaystyle \delta + (1 + \theta_1 L + \dots + \theta_q L^q) \varepsilon_t$  
$\displaystyle \Phi(L) y_t$ $\displaystyle =$ $\displaystyle \delta + \Theta(L) \varepsilon_t$ (4.7)

where the lag operator polynomials $ \Theta(L)$ and $ \Phi(L)$ are called the $ MA$ polynomial and the $ AR$ polynomial, respectively. In order to avoid parameter redundancy, we assume that there are no common factors between the $ AR$ and the $ MA$ components.
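The link between the finite form (4.7) and the infinite representation (4.6) can be illustrated numerically. The following Python sketch computes the first weights $ \psi_i$ by matching coefficients in $ \Phi(L)\Psi(L) = \Theta(L)$, which gives the recursion $ \psi_j = \theta_j + \sum_{i=1}^{\min(j,p)} \phi_i \psi_{j-i}$ with $ \psi_0 = 1$ and $ \theta_j = 0$ for $ j>q$. The function name `psi_weights` and the use of NumPy are assumptions of this example, not part of the original text.

```python
import numpy as np

def psi_weights(phi, theta, n):
    """First n+1 weights psi_0..psi_n of the MA(infinity) form of an ARMA(p, q)."""
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    psi = np.zeros(n + 1)
    psi[0] = 1.0
    for j in range(1, n + 1):
        # theta_j is zero beyond the MA order q
        value = theta[j - 1] if j <= theta.size else 0.0
        # add phi_i * psi_{j-i} for i = 1..min(j, p)
        for i in range(1, min(j, phi.size) + 1):
            value += phi[i - 1] * psi[j - i]
        psi[j] = value
    return psi

# Illustrative ARMA(1,1) with phi = 0.7 and theta = 0.5
psi = psi_weights([0.7], [0.5], 5)
print(psi)
```

For a stationary model the weights die out, so the sum of their squares stays finite, as condition (4.6) requires.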

Next, we will study the plot of some time series generated by stationary $ ARMA$ models with the aim of determining the main patterns of their temporal evolution. Figure 4.2 includes two series generated from the following stationary processes computed by means of the genarma quantlet:

Series 1: $ y1_t = 1.4\, y1_{t-1} - 0.8\, y1_{t-2} + \varepsilon_t,$ $ \varepsilon_t \sim N.I.D.(0,1) $
Series 2: $ y2_t = 0.9+ 0.7\, y2_{t-1} + 0.5 \varepsilon_{t-1} + \varepsilon_t,$ $ \varepsilon_t \sim N.I.D.(0,1) $

Figure 4.2: Time series generated by $ ARMA$ models
\includegraphics[width=0.7\defpicwidth]{genseries1.ps} \includegraphics[width=0.7\defpicwidth]{genseries2.ps}

As expected, both time series move around a constant level without changes in variance, as stationarity implies. Moreover, this level is close to the theoretical mean of the process, $ \mu$, and the distance of each point to this value is very rarely outside the bounds $ \pm 2 \sigma$. Furthermore, the series show local departures from the mean of the process followed by returns toward it; this mean reversion behavior characterizes stationary time series.
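The quantlet genarma is not reproduced here. As an illustrative substitute, the following Python sketch simulates the two series by direct recursion on (4.4). The function name `simulate_arma`, the burn-in length and the random seed are assumptions made for this example only.

```python
import numpy as np

def simulate_arma(phi, theta, delta, T, rng, burn=200):
    """Simulate T observations of an ARMA(p, q) with N.I.D.(0, 1) innovations."""
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    p, q = phi.size, theta.size
    n = T + burn
    eps = rng.standard_normal(n + q)   # q extra presample innovations
    y = np.zeros(n + p)                # p presample values set to zero
    for t in range(n):
        ar = sum(phi[i] * y[p + t - 1 - i] for i in range(p))
        ma = sum(theta[j] * eps[q + t - 1 - j] for j in range(q))
        y[p + t] = delta + ar + eps[q + t] + ma
    return y[p + burn:]                # drop the burn-in period

rng = np.random.default_rng(0)         # arbitrary seed
y1 = simulate_arma([1.4, -0.8], [], 0.0, 150, rng)   # Series 1
y2 = simulate_arma([0.7], [0.5], 0.9, 150, rng)      # Series 2
print(y1.mean(), y2.mean())
```

The theoretical means are $ 0$ for Series 1 and $ 0.9/(1-0.7) = 3$ for Series 2, so the sample means of long realizations should lie close to these values.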

Let us study in some detail the properties of the different $ ARMA$ processes, in particular, the autocovariance function, which captures the dynamic properties of a stationary stochastic process. This function depends on the units of measure, so the usual measure of the degree of linear association between variables is the correlation coefficient. In the case of stationary processes, the autocorrelation coefficient at lag $ j$, denoted by $ \rho_j$, is defined as the correlation between $ y_t $ and $ y_{t-j}$:

$\displaystyle \rho_{j} = \frac{cov(y_t,y_{t-j})}{\sqrt{V(y_t)}\sqrt{V(y_{t-j})}}\, =\, \frac{\gamma_{j}}{\gamma_0}, \qquad j = 0, \pm 1, \pm 2, \dots$

Thus, the autocorrelation function (ACF) is the autocovariance function standardized by the variance $ \gamma_0$. The properties of the ACF are:

$\displaystyle \rho_0$ $\displaystyle =$ $\displaystyle 1$ (4.8)
$\displaystyle \vert\rho_j\vert$ $\displaystyle \leq$ $\displaystyle 1$ (4.9)
$\displaystyle \rho_j$ $\displaystyle =$ $\displaystyle \rho_{-j}$ (4.10)

Given the symmetry property (4.10), the ACF is usually represented by means of a bar graph at the nonnegative lags that is called the simple correlogram.
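In practice the ACF is estimated from an observed series. A minimal Python sketch of the usual sample estimator follows. The function name `sample_acf` and the choice of the divisor $ T$ for every lag are assumptions of this example, not definitions taken from the text.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag of a one-dimensional series."""
    y = np.asarray(y, dtype=float)
    T = y.size
    d = y - y.mean()                      # deviations from the sample mean
    # sample autocovariances c_j = (1/T) * sum_t d_t * d_{t-j}
    c = np.array([np.dot(d[j:], d[:T - j]) / T for j in range(max_lag + 1)])
    return c / c[0]                       # standardize by the sample variance

r = sample_acf([1.0, 2.0, 3.0, 4.0], 2)
print(r)
```

By construction the first element equals one, in line with (4.8). Plotting the result as bars against the lag gives the simple correlogram.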

Another useful tool to describe the dynamics of a stationary process is the partial autocorrelation function (PACF). The partial autocorrelation coefficient at lag $ j$ measures the linear association between $ y_t $ and $ y_{t-j}$ adjusted for the effects of the intermediate values $ y_{t-1}, \dots, y_{t-j+1}$. Therefore, it is just the coefficient $ \phi_{jj}$ in the linear regression model:

$\displaystyle y_t = \alpha + \phi_{j1} y_{t-1} + \phi_{j2} y_{t-2} + \dots +\phi_{jj} y_{t-j} + e_t$ (4.11)

The properties of the PACF are equivalent to those of the ACF (4.8)-(4.10) and it is easy to prove that $ \phi_{11} = \rho_1$ (Box and Jenkins; 1976). Like the ACF, the partial autocorrelation function does not depend on the units of measure and it is represented by means of a bar graph at the nonnegative lags that is called the partial correlogram.
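The population counterpart of regression (4.11) can be computed from the autocorrelations alone, because the normal equations of that regression are the Yule-Walker system $ \sum_{k=1}^{j} \phi_{jk}\, \rho_{\vert i-k\vert} = \rho_i$, $ i = 1, \dots, j$. The Python sketch below solves this system for each lag and keeps the last coefficient. The function name `pacf_from_acf` is hypothetical, and this is only one of several possible approaches (the Durbin-Levinson recursion is a common alternative).

```python
import numpy as np

def pacf_from_acf(rho, max_lag):
    """Partial autocorrelations phi_11..phi_kk (k = max_lag) from rho_0..rho_k."""
    rho = np.asarray(rho, dtype=float)    # rho[0] must equal 1
    out = np.empty(max_lag)
    for j in range(1, max_lag + 1):
        # j x j autocorrelation matrix with entries rho_|a-b|
        R = np.array([[rho[abs(a - b)] for b in range(j)] for a in range(j)])
        coef = np.linalg.solve(R, rho[1:j + 1])
        out[j - 1] = coef[-1]             # phi_jj is the last coefficient
    return out

# Illustration with a geometrically decaying ACF, rho_j = 0.7^j
rho = 0.7 ** np.arange(6)
pacf = pacf_from_acf(rho, 5)
print(pacf)
```

For $ j=1$ the system reduces to $ \phi_{11} = \rho_1$, which agrees with the result quoted above.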

The dynamic properties of each stationary model determine a particular shape of the correlograms. Moreover, it can be shown that, for any stationary process, both functions, ACF and PACF, approach zero as the lag $ j$ tends to infinity. The $ ARMA$ models are not always stationary processes, so it is necessary first to determine the conditions for stationarity. There are subclasses of $ ARMA$ models which have special properties, so we shall study them separately. Thus, when $ p=q=0$ and $ \delta = 0$, it is a white noise process; when $ p= 0$, it is a pure moving average process of order $ q$, $ MA(q)$; and when $ q=0$, it is a pure autoregressive process of order $ p$, $ AR(p)$.

4.2.1 White Noise Process

The simplest $ ARMA$ model is a white noise process, where $ y_t $ is a sequence of uncorrelated zero mean variables with constant variance $ \sigma ^2$. It is denoted by $ y_t \sim WN(0, \sigma^2)$. This process is stationary if its variance is finite, $ \sigma^2< \infty$, since given that:

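$\displaystyle \textrm{E}\, y_t$ $\displaystyle =$ $\displaystyle 0 \qquad \forall t$
$\displaystyle V(y_t)$ $\displaystyle =$ $\displaystyle \sigma^2 \qquad \forall t$
$\displaystyle Cov(y_t, y_s)$ $\displaystyle =$ $\displaystyle 0 \qquad \forall t\neq s$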

$ y_t $ verifies conditions (4.1)-(4.3). Moreover, $ y_t $ is uncorrelated over time, so its autocovariance function is:

$\displaystyle \gamma_j$ $\displaystyle =$ $\displaystyle \left\{\begin{array}{lll}
\sigma^2 && j = 0\\
0 && j\neq 0
\end{array}\right.$

And its ACF and PACF are as follows:

$ \rho_j \, = \, \left\{\begin{array}{lll}
1 && j = 0\\
0 && j\neq 0
\end{array}\right. $          $ \phi_{jj} \, = \, \left\{\begin{array}{lll}
1 && j = 0\\
0 && j\neq 0
\end{array}\right. $

To understand the behavior of a white noise process, we generate a time series of size 150 from a Gaussian white noise process $ y_t \sim N.I.D.(0,1)$. Figure 4.3 shows the simulated series, which moves randomly around a constant level without any kind of pattern, as corresponds to the absence of correlation over time. Economic time series very rarely follow white noise patterns, but this process is the key to formulating more complex models. In fact, it is the starting point for deriving the properties of $ ARMA$ processes, given that we assume that the innovation of the model is a white noise.
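A realization like the one described here can be produced with a few lines of Python. This is only a sketch: the seed and variable names are arbitrary, and NumPy's generator stands in for whatever routine produced the original figure.

```python
import numpy as np

rng = np.random.default_rng(42)      # arbitrary seed, for reproducibility only
T = 150
y = rng.standard_normal(T)           # Gaussian white noise, N.I.D.(0, 1)

d = y - y.mean()
sample_mean = y.mean()
sample_var = np.dot(d, d) / T
# lag-1 sample autocorrelation; it should be close to zero for white noise
r1 = np.dot(d[1:], d[:-1]) / np.dot(d, d)
print(sample_mean, sample_var, r1)
```

With only 150 observations the sample moments will not match the population values exactly, but they should be near 0, 1 and 0, respectively.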

Figure 4.3: Realization from a white noise process

4.2.2 Moving Average Model

The general (finite-order) moving average model of order $ q$, $ MA(q)$ is:

$\displaystyle y_t$ $\displaystyle =$ $\displaystyle \delta + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots +
\theta_q \varepsilon_{t-q}$ (4.12)
$\displaystyle y_t$ $\displaystyle =$ $\displaystyle \delta + \Theta(L) \varepsilon_t,
\varepsilon_t \,\sim \, \hbox{WN}(0, \sigma^2_{\varepsilon})$  

It can be easily shown that $ MA$ processes are always stationary, given that the parameters of any finite $ MA$ process always verify condition (4.6). Moreover, we are interested in invertible $ MA$ processes. When a process is invertible, it is possible to express the current value of the variable $ y_t $ in terms of a current shock $ \varepsilon_t $ and its observable past values $ y_{t-1}, y_{t-2}, \dots$. Then, we say that the model has an autoregressive representation. This requirement provides a sensible way of associating present events with past happenings. An $ MA(q)$ model is invertible if the $ q$ roots of the characteristic equation $ \Theta(L) = 0$ lie outside the unit circle. When the root $ R_j$ is real, this condition means that its absolute value must be greater than unity, $ \vert R_j\vert >1$. If there is a pair of complex roots, they may be written as $ R_j = a \pm b i$, where $ a, b$ are real numbers and $ i=\sqrt{-1}$, and then the invertibility condition means that their modulus must be greater than unity, $ \sqrt{a^2 + b^2} >1$.

Let us consider the moving average process of first order, $ MA(1)$:

$\displaystyle y_t$ $\displaystyle =$ $\displaystyle \delta + \varepsilon_t + \theta \varepsilon_{t-1}, \qquad \varepsilon_t \sim \, WN(0, \sigma^2_{\varepsilon})$  
$\displaystyle y_t$ $\displaystyle =$ $\displaystyle \delta + (1 + \theta L) \varepsilon_t$  

It is invertible when the root of $ 1+\theta L = 0$ lies outside the unit circle, that is, $ \vert R\vert = \vert-1/\theta\vert>1$. This condition implies the invertibility restriction on the parameter, $ -1<\theta <1$.
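The root condition is easy to check numerically. The following Python sketch (the helper `is_invertible` is a hypothetical name) builds the polynomial $ \Theta(z) = 1 + \theta_1 z + \dots + \theta_q z^q$ and tests whether all of its roots have modulus greater than one. It relies on `numpy.roots`, which expects the coefficients ordered from the highest power to the lowest.

```python
import numpy as np

def is_invertible(theta):
    """True if all roots of 1 + theta_1 z + ... + theta_q z^q lie outside the unit circle."""
    theta = np.asarray(theta, dtype=float)
    # numpy.roots wants the highest-degree coefficient first
    coeffs = np.concatenate((theta[::-1], [1.0]))
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(is_invertible([0.8]))        # MA(1) with theta = 0.8, root at -1.25
print(is_invertible([2.0]))        # MA(1) with theta = 2, root at -0.5
print(is_invertible([0.5, 0.9]))   # MA(2) with a pair of complex roots
```

For the $ MA(1)$ case the check is equivalent to the restriction $ -1<\theta<1$.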

Let us study this simple $ MA$ process in detail. Figure 4.4 plots simulated series of length 150 from two $ MA(1)$ processes where the parameters $ (\delta, \theta)$ take the values (0, 0.8) in the first model and (4, -0.5) in the second one. It can be noted that the series show the general patterns associated with stationary, mean-reverting processes. More specifically, given that only one past innovation $ \varepsilon_{t-1}$ affects the current value of the series $ y_t $ (positively for $ \theta >0$ and negatively for $ \theta<0$), the $ MA(1)$ process is known as a very short memory process, and so there is no 'strong' dynamic pattern in the series. Nevertheless, it can be observed that the time evolution is smoother for the positive value of $ \theta$.

Figure 4.4: Realizations of $ MA(1)$ models with $ \varepsilon _{t}\sim N.I.D.(0,1)$
\includegraphics[width=0.7\defpicwidth]{ma18.ps} \includegraphics[width=0.7\defpicwidth]{ma1-5.ps}

The ACF for $ MA(1)$ models is derived from the following moments:

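$\displaystyle \textrm{E}\, y_t$ $\displaystyle =$ $\displaystyle \textrm{E}(\delta + \varepsilon_t + \theta \varepsilon_{t-1}) \, = \, \delta$
$\displaystyle \gamma_0$ $\displaystyle =$ $\displaystyle \textrm{E}(\varepsilon_t + \theta \varepsilon_{t-1})^2 \, = \, \sigma^2_{\varepsilon} (1 + \theta^2)$
$\displaystyle \gamma_1$ $\displaystyle =$ $\displaystyle \textrm{E}(\varepsilon_t + \theta \varepsilon_{t-1}) (\varepsilon_{t-1} + \theta \varepsilon_{t-2}) \, = \, \sigma^2_{\varepsilon} \theta$
$\displaystyle \gamma_j$ $\displaystyle =$ $\displaystyle \textrm{E}(\varepsilon_t + \theta \varepsilon_{t-1}) (\varepsilon_{t-j} + \theta \varepsilon_{t-j-1}) \, = \, 0 \qquad \forall j >1$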

given that, for all $ j>1$ and for all $ t$, the innovations $ \varepsilon_t,
\varepsilon_{t-1}$ are uncorrelated with $ \varepsilon_{t-j},
\varepsilon_{t-j-1}$. Then, the autocorrelation function is:

$\displaystyle \rho_j$ $\displaystyle =$ $\displaystyle \left\{\begin{array}{cll}
\displaystyle\frac{\theta}{1 + \theta^2} && j = 1\\
0 && j > 1
\end{array}\right.$

Figure 4.5: Population ACF and PACF for $ MA(1)$
\includegraphics[width=0.7\defpicwidth]{ma8s.ps} \includegraphics[width=0.7\defpicwidth]{ma8p.ps} \includegraphics[width=0.7\defpicwidth]{ma-5s.ps} \includegraphics[width=0.7\defpicwidth]{ma-5p.ps}

That is, there is a cutoff in the ACF at the first lag. Finally, the partial autocorrelation function shows an exponential decay. Figure 4.5 shows typical profiles of this ACF jointly with the PACF.
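The cutoff can be verified numerically. The Python sketch below evaluates the ACF of a finite $ MA(q)$ process from its coefficients, using the standard result $ \gamma_j = \sigma^2_{\varepsilon} \sum_{i=0}^{q-j} \theta_i \theta_{i+j}$ with $ \theta_0 = 1$, which reduces to $ \rho_1 = \theta/(1+\theta^2)$ for $ q=1$. The function name `ma_acf` is an assumption of this example.

```python
import numpy as np

def ma_acf(theta, max_lag):
    """Population ACF rho_0..rho_max_lag of an MA(q) with coefficients theta_1..theta_q."""
    psi = np.concatenate(([1.0], np.asarray(theta, dtype=float)))  # theta_0 = 1
    q = psi.size - 1
    # autocovariances in units of the innovation variance; zero beyond lag q
    gamma = np.array([np.dot(psi[:q + 1 - j], psi[j:]) if j <= q else 0.0
                      for j in range(max_lag + 1)])
    return gamma / gamma[0]

acf_pos = ma_acf([0.8], 3)     # theta = 0.8, first model of Figure 4.4
acf_neg = ma_acf([-0.5], 3)    # theta = -0.5, second model of Figure 4.4
print(acf_pos)
print(acf_neg)
```

For these two parameter values the first autocorrelations are $ 0.8/1.64 \approx 0.49$ and $ -0.5/1.25 = -0.4$. In general $ \vert\rho_1\vert \leq 0.5$ for any $ MA(1)$ process.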

Figure 4.6: Population ACF and PACF for $ MA(2)$ processes
\includegraphics[width=0.7\defpicwidth]{ma2-1s.ps} \includegraphics[width=0.7\defpicwidth]{ma2-1p.ps} \includegraphics[width=0.7\defpicwidth]{ma2-2s.ps} \includegraphics[width=0.7\defpicwidth]{ma2-2p.ps}

It can be shown that the general stationary and invertible $ MA(q)$ process has the following properties (Box and Jenkins; 1976):

Figure 4.6 shows the simple and partial correlograms for two different $ MA(2)$ processes. Both ACFs exhibit a cutoff at lag two. The roots of the $ MA$ polynomial of the first series are real, so the PACF decays exponentially, while for the second series, with complex roots, the PACF decays as a damped sine-cosine wave.

4.2.3 Autoregressive Model

The general (finite-order) autoregressive model of order $ p$, $ AR(p)$, is:

$\displaystyle y_t$ $\displaystyle =$ $\displaystyle \delta + \phi_1 y_{t-1} + \dots +
\phi_p y_{t-p} + \varepsilon_t$ (4.13)
$\displaystyle \Phi(L) y_t$ $\displaystyle =$ $\displaystyle \delta + \varepsilon_t,
\hspace*{3.5cm} \varepsilon_t \sim \hbox{WN}(0, \sigma^2_{\varepsilon})$  

Let us begin with the simplest $ AR$ process, the autoregressive process of first order, $ AR(1)$, that is defined as:

$\displaystyle y_t$ $\displaystyle =$ $\displaystyle \delta + \phi \,y_{t-1} + \varepsilon_t$ (4.14)
$\displaystyle (1 - \phi L)\, y_t$ $\displaystyle =$ $\displaystyle \delta + \varepsilon_t,
\hspace*{1.5cm} \varepsilon_t \sim \, WN(0, \sigma^2_{\varepsilon})$  

Figure 4.7: Realizations of $ AR(1)$ models with $ \varepsilon _{t}\sim N.I.D.(0,1)$
\includegraphics[width=0.7\defpicwidth]{ar17.ps} \includegraphics[width=0.7\defpicwidth]{ar1-7.ps}

Figure 4.7 shows two simulated time series generated from $ AR(1)$ processes with zero mean and parameters $ \phi=0.7$ and $ \phi=-0.7$, respectively. The autoregressive parameter measures the persistence of past events into the current values. For example, if $ \phi>0$, a positive (or negative) shock $ \varepsilon_t $ affects the series positively (or negatively) for a period of time that is longer the larger the value of $ \phi$. When $ \phi <0$, the series moves more roughly around the mean due to the alternation in the direction of the effect of $ \varepsilon_t $, that is, a shock that has a positive effect at time $ t$ has a negative effect at $ t+1$, a positive one at $ t+2$, and so on.

The $ AR(1)$ process is always invertible and it is stationary when the parameter of the model is constrained to lie in the region $ -1<\phi <1$. To prove the stationarity condition, we first write $ y_t $ in moving average form by recursive substitution of $ y_{t-i}$ in (4.14):

$\displaystyle y_t =\delta \sum_{i=0}^{\infty} \phi^i + \sum_{i=0}^{\infty} \phi^i \varepsilon_{t-i}$ (4.15)

Figure 4.8: Population correlograms for $ AR(1)$ processes
\includegraphics[width=0.7\defpicwidth]{ar7s.ps} \includegraphics[width=0.7\defpicwidth]{ar7p.ps} \includegraphics[width=0.7\defpicwidth]{ar-7s.ps} \includegraphics[width=0.7\defpicwidth]{ar-7p.ps}

That is, $ y_t $ is a weighted sum of current and past innovations. The weights depend on the value of the parameter $ \phi$: when $ \vert\phi\vert>1$ (or $ \vert\phi\vert <1$), the influence of a given innovation $ \varepsilon_t $ increases (or decreases) through time. Taking expectations in (4.15) in order to compute the mean of the process, we get:

$\displaystyle \textrm{E}\, y_t = \delta \sum_{i=0}^{\infty} \phi^i + \sum_{i=0}^{\infty} \phi^i\textrm{E}\, \varepsilon_{t-i}$

Given that $ E \varepsilon_{t-i}= 0$, the result is a sum of infinite terms that converges for all values of $ \delta$ only if $ \vert\phi\vert <1$, in which case $ E y_t = \delta (1-\phi)^{-1}$. A similar problem appears when we compute the second moment. The proof can be simplified assuming that $ \delta = 0$, that is, $ E y_t =0$. Then, the variance is:

$\displaystyle V(y_t)$ $\displaystyle =$ $\displaystyle \textrm{E}\left(\sum_{i=0}^{\infty} \phi^i \varepsilon_{t-i} \right)^2$
$\displaystyle =$ $\displaystyle \sum_{i=0}^{\infty} \phi^{2i} V(\varepsilon_{t-i}) \, =\, \sigma^2_{\varepsilon} \sum_{i=0}^{\infty}\phi^{2i}$

Again, the variance goes to infinity except for $ -1<\phi <1$, in which case $ V(y_t)= \sigma^2_{\varepsilon} (1-\phi^2)^{-1}$. It is easy to verify that both the mean and the variance explode when that condition doesn't hold.
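The role of the condition $ -1<\phi<1$ can be seen by evaluating partial sums of the two geometric series involved. The following Python sketch is purely illustrative; the function name `partial_sums` and the truncation points are arbitrary choices.

```python
import numpy as np

def partial_sums(phi, n):
    """Partial sums of sum phi^i and sum phi^(2i) for i = 0..n-1."""
    powers = phi ** np.arange(n)
    return powers.sum(), (powers ** 2).sum()

phi = 0.7
s1, s2 = partial_sums(phi, 200)
# closed forms, valid only for |phi| < 1
print(s1, 1.0 / (1.0 - phi))        # mean factor: E y_t = delta * s1
print(s2, 1.0 / (1.0 - phi ** 2))   # variance factor: V(y_t) = sigma^2 * s2

# with |phi| > 1 the partial sums keep growing with n
print(partial_sums(1.1, 50)[1], partial_sums(1.1, 100)[1])
```

For $ \phi = 0.7$ the partial sums settle at $ (1-\phi)^{-1}$ and $ (1-\phi^2)^{-1}$, whereas for $ \phi = 1.1$ they increase without bound as more terms are added.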

The autocovariance function of a stationary $ AR(1)$ process is

$\displaystyle \gamma_j = \textrm{E}\left\{(\phi y_{t-1} + \varepsilon_t)y_{t-j}\right\} = \phi \gamma_{j-1} = \sigma^2_{\varepsilon} (1-\phi^2)^{-1} \; \phi^j \qquad \forall j >0 $

Therefore, the autocorrelation function for the stationary $ AR(1)$ model is:

$\displaystyle \rho_j = \frac{\phi \gamma_{j-1}}{\gamma_0} = \phi \rho_{j-1} = \phi^j
\qquad \forall j $

That is, the correlogram shows an exponential decay, with all values positive if $ \phi$ is positive and with alternating negative and positive values if $ \phi$ is negative (see figure 4.8). Furthermore, the rate of decay decreases as $ \vert\phi\vert$ increases, so the greater the absolute value of $ \phi$ the stronger the dynamic correlation in the process. Finally, there is a cutoff in the partial autocorrelation function at the first lag.
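The two correlogram shapes can be reproduced directly from the formulas. In the Python sketch below, `ar1_acf` is a hypothetical helper that returns $ \rho_j = \phi^j$, and the lag-two partial autocorrelation is obtained from the standard expression $ \phi_{22} = (\rho_2 - \rho_1^2)/(1-\rho_1^2)$, which is the solution of regression (4.11) for $ j=2$ written in terms of autocorrelations.

```python
import numpy as np

def ar1_acf(phi, max_lag):
    """Population ACF rho_0..rho_max_lag of a stationary AR(1) process."""
    if abs(phi) >= 1:
        raise ValueError("the AR(1) process is stationary only for |phi| < 1")
    return phi ** np.arange(max_lag + 1)

for phi in (0.7, -0.7):                 # the two values used in Figure 4.7
    rho = ar1_acf(phi, 4)
    # lag-2 partial autocorrelation; it vanishes for any AR(1)
    phi_22 = (rho[2] - rho[1] ** 2) / (1.0 - rho[1] ** 2)
    print(phi, rho, phi_22)
```

Since $ \rho_2 = \rho_1^2$ for every $ AR(1)$ process, $ \phi_{22}$ is zero, which is the PACF cutoff at the first lag.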

Figure 4.9: Population correlograms for $ AR(2)$ processes
\includegraphics[width=0.7\defpicwidth]{ar2-1s.ps} \includegraphics[width=0.7\defpicwidth]{ar2-1p.ps} \includegraphics[width=0.7\defpicwidth]{ar2-2s.ps} \includegraphics[width=0.7\defpicwidth]{ar2-2p.ps} \includegraphics[width=0.7\defpicwidth]{ar2-3s.ps} \includegraphics[width=0.7\defpicwidth]{ar2-3p.ps}

It can be shown that the general $ AR(p)$ process (Box and Jenkins; 1976):

Some examples of correlograms for more complex $ AR$ models, such as the $ AR(2)$, can be seen in figure 4.9. They are very similar to the $ AR(1)$ patterns when the processes have real roots, but take a very different shape when the roots are complex (see the first pair of graphics of figure 4.9).

4.2.4 Autoregressive Moving Average Model

The general (finite-order) autoregressive moving average model of orders $ (p,q)$, $ ARMA(p, q)$, is:

$\displaystyle y_t$ $\displaystyle =$ $\displaystyle \delta + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} +
\varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$  
$\displaystyle \Phi(L) y_t$ $\displaystyle =$ $\displaystyle \delta + \Theta(L) \varepsilon_t,
\varepsilon_t \,\sim \, \hbox{WN}(0, \sigma^2_{\varepsilon})$  

It can be shown that the general $ ARMA(p, q)$ process (Box and Jenkins; 1976):

For example, the $ ARMA(1,1)$ process is defined as:

$\displaystyle y_t$ $\displaystyle =$ $\displaystyle \delta + \phi y_{t-1} + \theta \varepsilon_{t-1} + \varepsilon_t$  
$\displaystyle (1 - \phi L)\, y_t$ $\displaystyle =$ $\displaystyle \delta + (1 + \theta L) \varepsilon_t, \qquad
\qquad \varepsilon_t
\sim \, WN(0, \sigma^2_{\varepsilon})$  

This model is stationary if $ \vert\phi\vert
<1$ and is invertible if $ \vert\theta\vert<1$. The mean of the $ ARMA(1,1)$ stationary process can be derived as follows:

$\displaystyle E y_t = \delta + \phi\textrm{E}y_{t-1} + \theta\textrm{E}\varepsilon_{t-1} + \textrm{E}\varepsilon_t$

by stationarity $ E y_t =\textrm{E}y_{t-1} =\mu$, and so $ \displaystyle
\mu = \displaystyle\frac{\delta}{1-\phi} $.

Figure 4.10: Population correlograms for $ ARMA(1,1)$ processes
\includegraphics[width=0.7\defpicwidth]{arma11-1s.ps} \includegraphics[width=0.7\defpicwidth]{arma11-1p.ps} \includegraphics[width=0.7\defpicwidth]{arma11-2s.ps} \includegraphics[width=0.7\defpicwidth]{arma11-2p.ps}

The autocovariance function for an $ ARMA(1,1)$ stationary process (assuming $ \delta = 0$) is as follows:

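$\displaystyle \gamma_0$ $\displaystyle =$ $\displaystyle \textrm{E}(\phi y_{t-1} + \theta \varepsilon_{t-1} + \varepsilon_t)^2 \, = \, \sigma^2_{\varepsilon} \displaystyle\frac{1 + \theta^2 + 2\theta \phi}{1 - \phi^2}$
$\displaystyle \gamma_1$ $\displaystyle =$ $\displaystyle \textrm{E}\left\{(\phi y_{t-1} + \theta \varepsilon_{t-1} + \varepsilon_t) y_{t-1}\right\} \, = \, \phi \gamma_0 + \theta \sigma^2_{\varepsilon}$
$\displaystyle \gamma_j$ $\displaystyle =$ $\displaystyle \textrm{E}\left\{(\phi y_{t-1} + \theta \varepsilon_{t-1} + \varepsilon_t) y_{t-j}\right\} \, = \, \phi \gamma_{j-1} \qquad \forall j>1$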

The autocorrelation function for the stationary $ ARMA(1,1)$ model is:

$\displaystyle \rho_j = \left\{\begin{array}{lll}
\phi + \displaystyle\frac{\theta (1 - \phi^2)}{1 + \theta^2 + 2\theta \phi} && j=1\\
\phi \rho_{j-1} && j>1
\end{array}\right.$

Figure 4.10 shows typical profiles of the ACF and PACF for stationary and invertible $ ARMA(1,1)$ processes.
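As a numerical check, the Python sketch below evaluates the $ ARMA(1,1)$ autocorrelations in two ways: from the first autocorrelation $ \rho_1 = \phi + \theta(1-\phi^2)/(1+\theta^2+2\theta\phi)$ with the recursion $ \rho_j = \phi\rho_{j-1}$, and from the truncated moving average weights $ \psi_0 = 1$, $ \psi_i = (\phi+\theta)\phi^{i-1}$. The function names and the truncation length are assumptions of this example.

```python
import numpy as np

def arma11_acf(phi, theta, max_lag):
    """Population ACF rho_0..rho_max_lag of a stationary ARMA(1,1) process."""
    rho = np.ones(max_lag + 1)
    if max_lag >= 1:
        rho[1] = phi + theta * (1 - phi ** 2) / (1 + theta ** 2 + 2 * theta * phi)
    for j in range(2, max_lag + 1):
        rho[j] = phi * rho[j - 1]
    return rho

def acf_from_psi(phi, theta, max_lag, n_terms=2000):
    """Same ACF from the truncated MA(infinity) weights, as an independent check."""
    psi = np.empty(n_terms)
    psi[0] = 1.0
    psi[1:] = (phi + theta) * phi ** np.arange(n_terms - 1)
    gamma = np.array([np.dot(psi[:n_terms - j], psi[j:]) for j in range(max_lag + 1)])
    return gamma / gamma[0]

phi, theta = 0.7, 0.5          # illustrative stationary and invertible values
print(arma11_acf(phi, theta, 4))
print(acf_from_psi(phi, theta, 4))
```

The two results agree up to truncation error, which is negligible here because $ \vert\phi\vert<1$ makes the weights decay geometrically.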