Time Series 3—ARMA
The autoregressive moving average (ARMA) model
\[\begin{eqnarray*} y_{t}&=&a_{0}+\sum_{i=1}^{p}a_{i}y_{t-i}+\sum_{i=0}^{q}\beta_{i}\varepsilon_{t-i} \end{eqnarray*}\]
Stationarity
A stochastic process is covariance stationary or weakly stationary if for all \(t\) and \(s\)
\[\begin{eqnarray*} E(y_{t})&=&E(y_{t-s})=\mu\\ E\left[(y_{t}-\mu)^{2}\right]&=&E\left[(y_{t-s}-\mu)^{2}\right]=\sigma_{y}^{2}=\gamma_{0}\\ E\left[(y_{t}-\mu)(y_{t-s}-\mu)\right]&=&E\left[(y_{t-j}-\mu)(y_{t-j-s}-\mu)\right]=\gamma_{s} \end{eqnarray*}\]If a process is covariance stationary, the covariance between \(y_{t}\) and \(y_{t-s}\) depends only on \(s\), the length of time separating the observations. It follows that for a covariance stationary process, \(\gamma_{s}\) and \(\gamma_{-s}\) would represent the same magnitude.
For a covariance stationary series, we can define the autocorrelation between \(y_{t}\) and \(y_{t-s}\)
\[\begin{eqnarray*} \rho_{s}&=&\frac{\gamma_{s}}{\gamma_{0}} \end{eqnarray*}\]where \(\gamma_{0}\) is the variance of \(y_{t}\)
Ergodicity
Imagine a battery of \(I\) computers generating sequences \(\{y_{t}^{(1)}\}_{t=-\infty}^{\infty}\), \(\{y_{t}^{(2)}\}_{t=-\infty}^{\infty}\), \(\dots\), \(\{y_{t}^{(I)}\}_{t=-\infty}^{\infty}\), and consider selecting the observation associated with date \(t\) from each sequence: \(\{y_{t}^{(1)},y_{t}^{(2)},\dots,y_{t}^{(I)}\}\). This would be described as a sample of \(I\) realizations of the random variable \(Y_{t}\). The expectation of the \(t\)th observation of a time series refers to the mean of this probability distribution
\[\begin{eqnarray*} E(Y_{t})&=&\int_{-\infty}^{\infty}y_{t}f_{Y_{t}}(y_{t})dy_{t} \end{eqnarray*}\]We might view this as the probability limit of the ensemble average
\[\begin{eqnarray*} E(Y_{t})&=&\mathop{plim}_{I\to\infty}\frac{1}{I}\sum_{i=1}^{I}Y_{t}^{(i)} \end{eqnarray*}\]Viewing the expectation of a time series in terms of ensemble averages may seem a bit contrived, since usually all we have is a single realization of size \(T\) from the process, \(\{y_{1}^{(1)},y_{2}^{(1)},\dots,y_{T}^{(1)}\}\). From these observations we would calculate the sample mean \(\bar{y}\), which is a time average
\[\begin{eqnarray*} \bar{y}&=&\frac{1}{T}\sum_{t=1}^{T}y_{t}^{(1)} \end{eqnarray*}\]A covariance stationary process is said to be ergodic for the mean if \(\bar{y}\) converges in probability to \(E(Y_{t})\) as \(T\to\infty\).
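To make the distinction between ensemble averages and time averages concrete, here is a minimal simulation sketch in Python with NumPy (the MA(1) process, parameter values, and sample sizes are illustrative choices, not part of the original text). It compares the average of \(y_{t}\) across many realizations at a fixed date with the average over time of a single realization; for a covariance stationary, mean-ergodic process both should be close to \(\mu\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stationary process: y_t = mu + e_t + 0.5*e_{t-1} (an MA(1))
mu, theta, sigma = 2.0, 0.5, 1.0
I, T = 2_000, 2_000                   # number of realizations, length of each realization

eps = rng.normal(0.0, sigma, size=(I, T + 1))
y = mu + eps[:, 1:] + theta * eps[:, :-1]        # shape (I, T)

t = 500                                          # an arbitrary date
ensemble_avg = y[:, t].mean()                    # average across realizations at date t
time_avg = y[0, :].mean()                        # average over time of one realization

print(f"ensemble average at date t : {ensemble_avg:.3f}")
print(f"time average, realization 1: {time_avg:.3f}")
print(f"true mean mu               : {mu:.3f}")
# Because this process is covariance stationary and ergodic for the mean,
# both averages converge to mu as I and T grow.
```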
Moving Average Processes
The First-Order Moving Average Process
Let \(\{\varepsilon_{t}\}\) be a white-noise process, i.e. \(E(\varepsilon_{t})=0\), \(E(\varepsilon_{t}^{2})=\sigma^{2}\), and \(E(\varepsilon_{t}\varepsilon_{s})=0\) for \(t\neq s\), and consider the process
\[\begin{eqnarray*} y_{t}&=&\mu+\varepsilon_{t}+\theta\varepsilon_{t-1} \end{eqnarray*}\]where \(\mu\) and \(\theta\) could be any constants. This time series is called a first-order moving average process, denoted \(MA(1)\).
Expectation The expectation of \(y_{t}\) is
\[\begin{eqnarray*} E(y_{t})&=&E(\mu+\varepsilon_{t}+\theta\varepsilon_{t-1})=\mu+E(\varepsilon_{t})+\theta E(\varepsilon_{t-1})=\mu \end{eqnarray*}\]Variance The variance of \(y_{t}\) is
\[\begin{eqnarray*} E(y_{t}-\mu)^{2}&=&E(\varepsilon_{t}+\theta\varepsilon_{t-1})^{2}\\ &=&E\left(\varepsilon_{t}^{2}+2\theta\varepsilon_{t}\varepsilon_{t-1}+\theta^{2}\varepsilon_{t-1}^{2}\right)\\ &=&\sigma^{2}+0+\theta^{2}\sigma^{2}\\ &=&(1+\theta^{2})\sigma^{2} \end{eqnarray*}\]Autocovariance The first autocovariance of \(y_{t}\) is
\[\begin{eqnarray*} E(y_{t}-\mu)(y_{t-1}-\mu)&=&E(\varepsilon_{t}+\theta\varepsilon_{t-1})(\varepsilon_{t-1}+\theta\varepsilon_{t-2})\\ &=&E\left(\varepsilon_{t}\varepsilon_{t-1}+\theta\varepsilon_{t-1}^{2}+\theta\varepsilon_{t}\varepsilon_{t-2}+\theta^{2}\varepsilon_{t-1}\varepsilon_{t-2}\right)\\ &=&0+\theta\sigma^{2}+0+0\\ &=&\theta\sigma^{2} \end{eqnarray*}\]Higher autocovariances are all zero. For all \(j>1\)
\[\begin{eqnarray*} E(y_{t}-\mu)(y_{t-j}-\mu)&=&0 \end{eqnarray*}\]Autocorrelation The \(j\)th autocorrelation of a covariance stationary process is \(\rho_{j}=\frac{\gamma_{j}}{\gamma_{0}}\)
\[\begin{eqnarray*} \rho_{1}&=&\frac{\gamma_{1}}{\gamma_{0}}=\frac{\theta\sigma^{2}}{(1+\theta^{2})\sigma^{2}}=\frac{\theta}{1+\theta^{2}}\\ \rho_{j}&=&0, \quad j>1 \end{eqnarray*}\]
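As a quick numerical check of these \(MA(1)\) moments, the following Python/NumPy sketch (with illustrative values of \(\mu\), \(\theta\), \(\sigma\), and \(T\)) simulates a long \(MA(1)\) series and compares the sample mean, variance, and autocorrelations with \(\mu\), \((1+\theta^{2})\sigma^{2}\), \(\theta/(1+\theta^{2})\), and zero.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, theta, sigma, T = 0.0, 0.6, 1.0, 200_000   # illustrative parameter values

eps = rng.normal(0.0, sigma, T + 1)
y = mu + eps[1:] + theta * eps[:-1]            # MA(1): y_t = mu + e_t + theta*e_{t-1}

def sample_autocorr(x, lag):
    """Sample autocorrelation r_lag = sum (x_t - xbar)(x_{t-lag} - xbar) / sum (x_t - xbar)^2."""
    xd = x - x.mean()
    return (xd[lag:] * xd[:-lag]).sum() / (xd ** 2).sum()

print("mean    :", y.mean(), "(theory:", mu, ")")
print("variance:", y.var(), "(theory:", (1 + theta**2) * sigma**2, ")")
print("rho_1   :", sample_autocorr(y, 1), "(theory:", theta / (1 + theta**2), ")")
print("rho_2   :", sample_autocorr(y, 2), "(theory: 0)")
```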
The \(q\)th-Order Moving Average Process
A \(q\)th-order moving average process, denoted \(MA(q)\), is characterized by
\[\begin{eqnarray*} y_{t}&=&\mu+\varepsilon_{t}+\theta_{1}\varepsilon_{t-1}+\theta_{2}\varepsilon_{t-2}+\cdots+\theta_{q}\varepsilon_{t-q} \end{eqnarray*}\]Expectation The expectation of \(y_{t}\) is
\[\begin{eqnarray*} E(y_{t})&=&E(\mu)+E(\varepsilon_{t})+\theta_{1}E(\varepsilon_{t-1})+\theta_{2}E(\varepsilon_{t-2})+\cdots+\theta_{q}E(\varepsilon_{t-q})=\mu \end{eqnarray*}\]Variance The variance of \(y_{t}\) is
\[\begin{eqnarray*} E(y_{t}-\mu)^{2}&=&E(\varepsilon_{t}+\theta_{1}\varepsilon_{t-1}+\theta_{2}\varepsilon_{t-2}+\cdots+\theta_{q}\varepsilon_{t-q})^{2}\\ &=&E(\varepsilon_{t}^{2})+\theta_{1}^{2}E(\varepsilon_{t-1}^{2})+\theta_{2}^{2}E(\varepsilon_{t-2}^{2})+\cdots+\theta_{q}^{2}E(\varepsilon_{t-q}^{2})\\ &=&\sigma^{2}+\theta_{1}^{2}\sigma^{2}+\theta_{2}^{2}\sigma^{2}+\cdots+\theta_{q}^{2}\sigma^{2}\\ &=&(1+\theta_{1}^{2}+\theta_{2}^{2}+\cdots+\theta_{q}^{2})\sigma^{2} \end{eqnarray*}\]since all cross-product terms have zero expectation. Autocovariance The autocovariance of \(y_{t}\) is
\[\begin{eqnarray*} E(y_{t}-\mu)(y_{t-j}-\mu)&=&E(\varepsilon_{t}+\theta_{1}\varepsilon_{t-1}+\theta_{2}\varepsilon_{t-2}+\cdots+\theta_{q}\varepsilon_{t-q})\\ &&(\varepsilon_{t-j}+\theta_{1}\varepsilon_{t-j-1}+\theta_{2}\varepsilon_{t-j-2}+\cdots+\theta_{q}\varepsilon_{t-j-q})\\ &=&E\left(\theta_{j}\varepsilon_{t-j}^{2}+\theta_{j+1}\theta_{1}\varepsilon_{t-j-1}^{2}+\theta_{j+2}\theta_{2}\varepsilon_{t-j-2}^{2}+\cdots+\theta_{q}\theta_{q-j}\varepsilon_{t-q}^{2}\right)\\ &=&\left(\theta_{j}+\theta_{j+1}\theta_{1}+\theta_{j+2}\theta_{2}+\cdots+\theta_{q}\theta_{q-j}\right)\sigma^{2}, j=1,2,\dots,q \end{eqnarray*}\]For all \(j>q\)
\[\begin{eqnarray*} E(y_{t}-\mu)(y_{t-j}-\mu)&=&0 \end{eqnarray*}\]
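The \(MA(q)\) autocovariance formula translates directly into code. The sketch below (Python/NumPy; the \(MA(3)\) coefficients are an arbitrary example) computes \(\gamma_{j}=\left(\theta_{j}+\theta_{j+1}\theta_{1}+\cdots+\theta_{q}\theta_{q-j}\right)\sigma^{2}\) with \(\theta_{0}=1\) and checks it against the sample autocovariances of a simulated series.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.0
theta = np.array([1.0, 0.4, -0.3, 0.2])   # theta_0 = 1, then theta_1..theta_3 (illustrative MA(3))
q = len(theta) - 1

def ma_autocov(theta, sigma2, j):
    """gamma_j = sigma^2 * sum_{i=0}^{q-j} theta_{j+i} * theta_i for j <= q, and 0 otherwise."""
    if j > len(theta) - 1:
        return 0.0
    return sigma2 * np.sum(theta[j:] * theta[:len(theta) - j])

# Simulated check: y_t = sum_i theta_i * eps_{t-i}
T = 500_000
eps = rng.normal(0.0, sigma, T + q)
y = np.convolve(eps, theta, mode="valid")

def sample_autocov(x, j):
    xd = x - x.mean()
    return (xd ** 2).mean() if j == 0 else (xd[j:] * xd[:-j]).mean()

for j in range(q + 2):
    print(f"gamma_{j}: formula = {ma_autocov(theta, sigma**2, j): .4f}, "
          f"sample = {sample_autocov(y, j): .4f}")
```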
The Infinite-Order Moving Average Process
Consider the process when \(q\to\infty\)
\[\begin{eqnarray*} y_{t}&=&\mu+\sum_{j=0}^{\infty}\psi_{j}\varepsilon_{t-j}=\mu+\psi_{0}\varepsilon_{t}+\psi_{1}\varepsilon_{t-1}+\psi_{2}\varepsilon_{t-2}+\cdots \end{eqnarray*}\]This could be described as an \(MA(\infty)\) process.
The \(MA(\infty)\) process is covariance stationary if its coefficients are square summable
\[\begin{eqnarray*} \sum_{j=0}^{\infty}\psi_{j}^{2}&<&\infty \end{eqnarray*}\]It is often convenient to work with a slightly stronger condition, absolute summability
\[\begin{eqnarray*} \sum_{j=0}^{\infty}|\psi_{j}|&<&\infty \end{eqnarray*}\]Expectation The mean of an \(MA(\infty)\) process with absolutely summable coefficients is
\[\begin{eqnarray*} E(y_{t})&=&\lim_{T\to\infty}E(\mu+\psi_{0}\varepsilon_{t}+\psi_{1}\varepsilon_{t-1}+\psi_{2}\varepsilon_{t-2}+\cdots+\psi_{T}\varepsilon_{t-T})=\mu \end{eqnarray*}\]Autocovariance The autocovariances of an \(MA(\infty)\) process with absolutely summable coefficients are
\[\begin{eqnarray*} \gamma_{0}&=&E(y_{t}-\mu)^{2}\\ &=&\lim_{T\to\infty}E(\psi_{0}\varepsilon_{t}+\psi_{1}\varepsilon_{t-1}+\psi_{2}\varepsilon_{t-2}+\cdots+\psi_{T}\varepsilon_{t-T})^{2}\\ &=&\lim_{T\to\infty}(\psi_{0}^{2}+\psi_{1}^{2}+\psi_{2}^{2}+\cdots+\psi_{T}^{2})\sigma^{2}\\ \gamma_{j}&=&E(y_{t}-\mu)(y_{t-j}-\mu)\\ &=&(\psi_{j}\psi_{0}+\psi_{j+1}\psi_{1}+\psi_{j+2}\psi_{2}+\psi_{j+3}\psi_{3}+\cdots)\sigma^{2} \end{eqnarray*}\]
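For geometrically decaying weights \(\psi_{j}=\phi^{j}\) with \(|\phi|<1\) (used here only as an illustrative choice; these happen to be the weights implied by a stationary \(AR(1)\)), absolute summability holds and the limits above have closed forms: \(\sum|\psi_{j}|=1/(1-\phi)\), \(\gamma_{0}=\sigma^{2}/(1-\phi^{2})\), and \(\gamma_{j}=\phi^{j}\sigma^{2}/(1-\phi^{2})\). A small truncation sketch in Python/NumPy:

```python
import numpy as np

phi, sigma2 = 0.8, 1.0
psi = phi ** np.arange(200)            # psi_j = phi^j, truncated at j = 199 (illustrative)

print("sum |psi_j|        :", np.abs(psi).sum(), " -> limit 1/(1-phi) =", 1 / (1 - phi))
print("gamma_0 (truncated):", sigma2 * np.sum(psi ** 2),
      " -> limit sigma^2/(1-phi^2) =", sigma2 / (1 - phi ** 2))

j = 3
gamma_j = sigma2 * np.sum(psi[:-j] * psi[j:])   # sum_k psi_{j+k} * psi_k, truncated
print(f"gamma_{j} (truncated):", gamma_j,
      " -> limit phi^j*sigma^2/(1-phi^2) =", phi ** j * sigma2 / (1 - phi ** 2))
```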
Autoregressive Processes
The First-Order Autoregressive Process
A first-order autoregressive process, denoted \(AR(1)\), satisfies the following difference equation
\[\begin{eqnarray*} y_{t}&=&c+\phi y_{t-1}+\varepsilon_{t} \end{eqnarray*}\]When \(|\phi|<1\), this process is covariance stationary and can be rewritten as the \(MA(\infty)\) process
\[\begin{eqnarray*} y_{t}&=&\frac{c}{1-\phi}+\varepsilon_{t}+\phi\varepsilon_{t-1}+\phi^{2}\varepsilon_{t-2}+\cdots \end{eqnarray*}\]
We can derive the expectation and autocovariances of the \(AR(1)\) process from the corresponding \(MA(\infty)\) representation above, or directly by assuming the \(AR(1)\) process is covariance stationary.
Expectation Taking expectations on both sides,
\[\begin{eqnarray*} E(y_{t})&=&c+\phi E(y_{t-1})+E(\varepsilon_{t})\\ \mu&=&c+\phi\mu\\ \mu&=&\frac{c}{1-\phi} \end{eqnarray*}\]Autocovariance Multiplying the deviation form \(y_{t}-\mu=\phi(y_{t-1}-\mu)+\varepsilon_{t}\) by \(y_{t-j}-\mu\) and taking expectations gives
\[\begin{eqnarray*} \gamma_{0}&=&\phi\gamma_{1}+\sigma^{2}\\ \gamma_{j}&=&\phi\gamma_{j-1}, \quad j\geq1 \end{eqnarray*}\]so that
\[\begin{eqnarray*} \gamma_{0}&=&\frac{\sigma^{2}}{1-\phi^{2}}\\ \gamma_{j}&=&\phi^{j}\gamma_{0}=\frac{\phi^{j}\sigma^{2}}{1-\phi^{2}} \end{eqnarray*}\]
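These \(AR(1)\) results are easy to verify by simulation. The sketch below (Python/NumPy; \(c\), \(\phi\), \(\sigma\), and \(T\) are illustrative) generates a long \(AR(1)\) series and compares the sample mean and autocovariances with \(c/(1-\phi)\) and \(\phi^{j}\sigma^{2}/(1-\phi^{2})\).

```python
import numpy as np

rng = np.random.default_rng(3)
c, phi, sigma, T = 1.0, 0.7, 1.0, 300_000     # illustrative values, |phi| < 1

mu = c / (1 - phi)
y = np.empty(T)
y[0] = mu                                     # start at the stationary mean
eps = rng.normal(0.0, sigma, T)
for t in range(1, T):
    y[t] = c + phi * y[t - 1] + eps[t]

def sample_autocov(x, j):
    xd = x - x.mean()
    return (xd ** 2).mean() if j == 0 else (xd[j:] * xd[:-j]).mean()

print("mean   :", round(y.mean(), 4), "(theory:", round(mu, 4), ")")
for j in range(3):
    theory = phi ** j * sigma ** 2 / (1 - phi ** 2)
    print(f"gamma_{j}:", round(sample_autocov(y, j), 4), "(theory:", round(theory, 4), ")")
```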
The \(p\)th-Order Autoregressive Process
A \(p\)th-order autoregressive process, denoted \(AR(p)\), satisfies
\[\begin{eqnarray*} y_{t}&=&c+\phi_{1}y_{t-1}+\phi_{2}y_{t-2}+\cdots+\phi_{p}y_{t-p}+\varepsilon_{t} \end{eqnarray*}\]
The Autocorrelation Function
The autocorrelation function (ACF) and the partial autocorrelation function (PACF) are useful for identifying the type of process that generated a time series.
For the AR(1) model \(y_{t}=a_{0}+a_{1}y_{t-1}+\varepsilon_{t}\):
- Method 1: Assume the process is covariance stationary. Multiplying \(y_{t}-\mu\) by \(y_{t-s}-\mu\) and taking expectations (the Yule-Walker approach used below for the AR(2) case) gives \(\gamma_{s}=a_{1}\gamma_{s-1}\), so \(\gamma_{s}=a_{1}^{s}\gamma_{0}\).
So the autocorrelation function for the AR(1) process is
\[\begin{eqnarray*} \rho_{s}&=&\frac{\gamma_{s}}{\gamma_{0}}\\ &=&a_{1}^{s} \end{eqnarray*}\]
- Method 2: If the process started at time zero, repeated substitution gives \(y_{t}=a_{0}\sum_{i=0}^{t-1}a_{1}^{i}+a_{1}^{t}y_{0}+\sum_{i=0}^{t-1}a_{1}^{i}\varepsilon_{t-i}\).
Take the expectation of \(y_{t}\) and \(y_{t+s}\)
\[\begin{eqnarray*} E(y_{t})&=&a_{0}\sum_{i=0}^{t-1}a_{1}^{i}+a_{1}^{t}y_{0}\\ E(y_{t+s})&=&a_{0}\sum_{i=0}^{t+s-1}a_{1}^{i}+a_{1}^{t+s}y_{0} \end{eqnarray*}\]If \(|a_{1}|<1\), then as \(t\to\infty\) the influence of the initial condition \(y_{0}\) dies out and both expectations converge to \(\frac{a_{0}}{1-a_{1}}\). Likewise the variance and autocovariances converge to \(\gamma_{0}=\frac{\sigma^{2}}{1-a_{1}^{2}}\) and \(\gamma_{s}=\frac{a_{1}^{s}\sigma^{2}}{1-a_{1}^{2}}\), so again \(\rho_{s}=a_{1}^{s}\).
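The role of the initial condition can be seen directly from the expectation formula above. This deterministic sketch (Python; \(a_{0}\), \(a_{1}\), and \(y_{0}\) are illustrative values) evaluates \(E(y_{t})=a_{0}\sum_{i=0}^{t-1}a_{1}^{i}+a_{1}^{t}y_{0}\) for increasing \(t\) and shows it approaching \(a_{0}/(1-a_{1})\).

```python
a0, a1, y0 = 2.0, 0.9, 10.0          # illustrative values, |a1| < 1

def expected_y(t):
    """E(y_t) = a0 * sum_{i=0}^{t-1} a1^i + a1^t * y0 for a process started at time zero."""
    return a0 * (1 - a1 ** t) / (1 - a1) + a1 ** t * y0

limit = a0 / (1 - a1)
for t in [1, 5, 10, 50, 100, 200]:
    print(f"t = {t:4d}  E(y_t) = {expected_y(t):10.6f}   limit a0/(1-a1) = {limit:.6f}")
# The influence of y0 dies out geometrically (a1^t -> 0), so E(y_t) -> a0/(1-a1).
```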
The Autocorrelation Function of an AR(2) Process
We assume that \(a_{0}=0\), which implies that \(E(y_{t})=0\). Adding or subtracting any constant from a variable does not change its variance, covariance, correlation coefficient, etc.
Using the Yule-Walker equations: multiply the second-order difference equation \(y_{t}=a_{1}y_{t-1}+a_{2}y_{t-2}+\varepsilon_{t}\) by \(y_{t-s}\) for \(s=0,1,2,\dots\) and take expectations
\[\begin{eqnarray*} Ey_{t}y_{t}&=&a_{1}Ey_{t-1}y_{t}+a_{2}Ey_{t-2}y_{t}+E\varepsilon_{t}y_{t}\\ Ey_{t}y_{t-1}&=&a_{1}Ey_{t-1}y_{t-1}+a_{2}Ey_{t-2}y_{t-1}+E\varepsilon_{t}y_{t-1}\\ Ey_{t}y_{t-2}&=&a_{1}Ey_{t-1}y_{t-2}+a_{2}Ey_{t-2}y_{t-2}+E\varepsilon_{t}y_{t-2}\\ &\vdots&\\ Ey_{t}y_{t-s}&=&a_{1}Ey_{t-1}y_{t-s}+a_{2}Ey_{t-2}y_{t-s}+E\varepsilon_{t}y_{t-s}\\ \end{eqnarray*}\]By definition, the autocovariances of a stationary series are such
\[\begin{eqnarray*} Ey_{t}y_{t-s}&=&Ey_{t-s}y_{t}=Ey_{t-k}y_{t-k-s}=\gamma_{s} \end{eqnarray*}\]We also know that the coefficient on \(\varepsilon_{t}\) is unity, so that \(E\varepsilon_{t}y_{t}=\sigma^{2}\) and \(E\varepsilon_{t}y_{t-s}=0\) for \(s>0\). Therefore
\[\begin{eqnarray*} \gamma_{0}&=&a_{1}\gamma_{1}+a_{2}\gamma_{2}+\sigma^{2}\\ \gamma_{1}&=&a_{1}\gamma_{0}+a_{2}\gamma_{1}\\ \gamma_{2}&=&a_{1}\gamma_{1}+a_{2}\gamma_{0}\\ &\vdots&\\ \gamma_{s}&=&a_{1}\gamma_{s-1}+a_{2}\gamma_{s-2} \end{eqnarray*}\]Now we can get the ACF
\[\begin{eqnarray*} \rho_{1}&=&a_{1}\rho_{0}+a_{2}\rho_{1}\\ \rho_{s}&=&a_{1}\rho_{s-1}+a_{2}\rho_{s-2} \end{eqnarray*}\]We know \(\rho_{0}=1\), so
\[\begin{eqnarray*} \rho_{1}&=&a_{1}+a_{2}\rho_{1}\\ \rho_{1}&=&\frac{a_{1}}{1-a_{2}} \end{eqnarray*}\]
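The Yule-Walker recursion for the AR(2) ACF translates directly into code. The sketch below (Python/NumPy; \(a_{1}\), \(a_{2}\) are illustrative values inside the stationarity region) builds \(\rho_{s}\) from \(\rho_{0}=1\) and \(\rho_{1}=a_{1}/(1-a_{2})\) using \(\rho_{s}=a_{1}\rho_{s-1}+a_{2}\rho_{s-2}\), and compares it with the sample ACF of a simulated AR(2) series.

```python
import numpy as np

rng = np.random.default_rng(4)
a1, a2 = 0.6, 0.2                      # illustrative AR(2) coefficients (stationary region)
nlags, T = 8, 300_000

# Theoretical ACF from the Yule-Walker recursion
rho = np.empty(nlags + 1)
rho[0] = 1.0
rho[1] = a1 / (1 - a2)
for s in range(2, nlags + 1):
    rho[s] = a1 * rho[s - 1] + a2 * rho[s - 2]

# Simulated AR(2) with a_0 = 0
y = np.zeros(T)
eps = rng.normal(size=T)
for t in range(2, T):
    y[t] = a1 * y[t - 1] + a2 * y[t - 2] + eps[t]

def sample_autocorr(x, lag):
    xd = x - x.mean()
    return (xd[lag:] * xd[:-lag]).sum() / (xd ** 2).sum()

for s in range(nlags + 1):
    r_s = 1.0 if s == 0 else sample_autocorr(y, s)
    print(f"lag {s}: Yule-Walker rho = {rho[s]: .4f}, sample r = {r_s: .4f}")
```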
The Autocorrelation Function of an MA(1) Process
Consider the MA(1) process \(y_{t}=\varepsilon_{t}+\beta\varepsilon_{t-1}\)
Applying the Yule-Walker equations
\[\begin{eqnarray*} \gamma_{0}&=&E(y_{t}y_{t})=E\left[ (\varepsilon_{t}+\beta\varepsilon_{t-1})(\varepsilon_{t}+\beta\varepsilon_{t-1})\right]=(1+\beta^{2})\sigma^{2}\\ \gamma_{1}&=&E(y_{t}y_{t-1})=E\left[ (\varepsilon_{t}+\beta\varepsilon_{t-1})(\varepsilon_{t-1}+\beta\varepsilon_{t-2})\right]=\beta\sigma^{2}\\ \gamma_{2}&=&E(y_{t}y_{t-2})=E\left[ (\varepsilon_{t}+\beta\varepsilon_{t-1})(\varepsilon_{t-2}+\beta\varepsilon_{t-3})\right]=0\\ \gamma_{s}&=&E(y_{t}y_{t-s})=E\left[ (\varepsilon_{t}+\beta\varepsilon_{t-1})(\varepsilon_{t-s}+\beta\varepsilon_{t-s-1})\right]=0 \ \ \forall s>1 \end{eqnarray*}\]So the ACF of the MA(1) process is
\[\begin{eqnarray*} \rho_{0}&=&1\\ \rho_{1}&=&\frac{\gamma_{1}}{\gamma_{0}}=\frac{\beta}{1+\beta^{2}}\\ \rho_{s}&=&0 \ \ \forall s>1 \end{eqnarray*}\]
The Autocorrelation Function of an ARMA(1,1) Process
Consider the ARMA(1,1) process \(y_{t}=a_{1}y_{t-1}+\varepsilon_{t}+\beta_{1}\varepsilon_{t-1}\). Applying the Yule-Walker equations:
\[\begin{eqnarray*} Ey_{t}y_{t}&=&a_{1}Ey_{t-1}y_{t}+E\varepsilon_{t}y_{t}+\beta_{1}E\varepsilon_{t-1}y_{t} \Rightarrow \\ \gamma_{0}&=&a_{1}\gamma_{1}+\sigma^{2}+\beta_{1}(a_{1}+\beta_{1})\sigma^{2}\\ Ey_{t}y_{t-1}&=&a_{1}Ey_{t-1}y_{t-1}+E\varepsilon_{t}y_{t-1}+\beta_{1}E\varepsilon_{t-1}y_{t-1} \Rightarrow \\ \gamma_{1}&=&a_{1}\gamma_{0}+\beta_{1}\sigma^{2}\\ Ey_{t}y_{t-2}&=&a_{1}Ey_{t-1}y_{t-2}+E\varepsilon_{t}y_{t-2}+\beta_{1}E\varepsilon_{t-1}y_{t-2} \Rightarrow \\ \gamma_{2}&=&a_{1}\gamma_{1}\\ Ey_{t}y_{t-s}&=&a_{1}Ey_{t-1}y_{t-s}+E\varepsilon_{t}y_{t-s}+\beta_{1}E\varepsilon_{t-1}y_{t-s} \Rightarrow \\ \gamma_{s}&=&a_{1}\gamma_{s-1} \end{eqnarray*}\]Solve the equations and get
\[\begin{eqnarray*} \gamma_{0}&=&\frac{1+\beta_{1}^{2}+2a_{1}\beta_{1}}{1-a_{1}^{2}}\sigma^{2}\\ \gamma_{1}&=&\frac{(1+a_{1}\beta_{1})(a_{1}+\beta_{1})}{1-a_{1}^{2}}\sigma^{2} \end{eqnarray*}\]And the ACF
\[\begin{eqnarray*} \rho_{0}&=&1\\ \rho_{1}&=&\frac{\gamma_{1}}{\gamma_{0}}=\frac{(1+a_{1}\beta_{1})(a_{1}+\beta_{1})}{1+\beta_{1}^{2}+2a_{1}\beta_{1}}\\ \rho_{s}&=&a_{1}\rho_{s-1} \ \ \forall s>1 \end{eqnarray*}\]
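The ARMA(1,1) results can be checked the same way. The sketch below (Python/NumPy; \(a_{1}\), \(\beta_{1}\) are illustrative) computes the theoretical \(\rho_{1}\) and then \(\rho_{s}=a_{1}\rho_{s-1}\), and compares them with the sample ACF of a simulated ARMA(1,1) series.

```python
import numpy as np

rng = np.random.default_rng(5)
a1, b1, sigma = 0.7, 0.4, 1.0           # illustrative ARMA(1,1) parameters, |a1| < 1
nlags, T = 6, 300_000

# Theoretical ACF
rho = np.empty(nlags + 1)
rho[0] = 1.0
rho[1] = (1 + a1 * b1) * (a1 + b1) / (1 + b1 ** 2 + 2 * a1 * b1)
for s in range(2, nlags + 1):
    rho[s] = a1 * rho[s - 1]

# Simulated ARMA(1,1): y_t = a1*y_{t-1} + e_t + b1*e_{t-1}
eps = rng.normal(0.0, sigma, T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = a1 * y[t - 1] + eps[t] + b1 * eps[t - 1]

def sample_autocorr(x, lag):
    xd = x - x.mean()
    return (xd[lag:] * xd[:-lag]).sum() / (xd ** 2).sum()

for s in range(nlags + 1):
    r_s = 1.0 if s == 0 else sample_autocorr(y, s)
    print(f"lag {s}: theoretical rho = {rho[s]: .4f}, sample r = {r_s: .4f}")
```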
The Partial Autocorrelation Function
In an AR(1) process, \(y_{t}\) and \(y_{t-2}\) are correlated even though \(y_{t-2}\) does not directly appear in the model: \(\rho_{2}=corr(y_{t}, y_{t-1})\times corr(y_{t-1}, y_{t-2})=\rho_{1}^{2}\). All such “indirect” correlations are present in the ACF. In contrast, the partial autocorrelation between \(y_{t}\) and \(y_{t-s}\) eliminates the effects of the intervening values \(y_{t-1}\) through \(y_{t-s+1}\).
Method to find the PACF:
- Form the series \(\{y_{t}^{\ast}\}\), where \(y_{t}^{\ast}\equiv y_{t}-\mu\)
- Form the first-order autoregression equation \(y_{t}^{\ast}=\phi_{11}y_{t-1}^{\ast}+e_{t}\), then the second-order autoregression equation \(y_{t}^{\ast}=\phi_{21}y_{t-1}^{\ast}+\phi_{22}y_{t-2}^{\ast}+e_{t}\), and so on,
where \(\phi_{11}\) is the partial autocorrelation between \(y_{t}\) and \(y_{t-1}\), and \(\phi_{22}\) is the partial autocorrelation between \(y_{t}\) and \(y_{t-2}\). Repeating this process for all additional lags \(s\) yields the PACF: \(\phi_{ss}\) is the coefficient on \(y_{t-s}^{\ast}\) in an \(s\)th-order autoregression, as implemented in the sketch below.
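The regression construction of the PACF can be coded directly: for each lag \(s\), regress \(y_{t}^{\ast}\) on \(y_{t-1}^{\ast},\dots,y_{t-s}^{\ast}\) and keep the coefficient on the longest lag as \(\phi_{ss}\). A sketch in Python/NumPy (the AR(1) series used for the check is an illustrative choice):

```python
import numpy as np

def pacf_by_regression(y, nlags):
    """phi_ss = coefficient on y*_{t-s} in an OLS regression of y*_t on y*_{t-1}, ..., y*_{t-s}."""
    ystar = y - y.mean()
    T = len(ystar)
    pacf = []
    for s in range(1, nlags + 1):
        # Column k holds y*_{t-k} for t = s, ..., T-1
        X = np.column_stack([ystar[s - k: T - k] for k in range(1, s + 1)])
        Y = ystar[s:]
        coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
        pacf.append(coef[-1])            # coefficient on the longest lag, phi_ss
    return np.array(pacf)

# Check on a simulated AR(1): phi_11 should be near a1, phi_ss near 0 for s >= 2
rng = np.random.default_rng(6)
a1, T = 0.7, 100_000
y = np.zeros(T)
eps = rng.normal(size=T)
for t in range(1, T):
    y[t] = a1 * y[t - 1] + eps[t]

print(np.round(pacf_by_regression(y, 5), 3))
```

For an AR(1) series the estimated \(\phi_{11}\) should be close to \(a_{1}\) and \(\phi_{ss}\) close to zero for \(s\geq2\), matching the summary table below.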
\[\begin{array}{lll}\hline \text{Process} & \text{ACF} & \text{PACF}\\ \hline \text{White noise} & \rho_{s}=0 & \phi_{ss}=0\\ AR(1): a_{1}>0 & \text{Direct geometric decay: } \rho_{s}=a_{1}^{s} & \phi_{11}=\rho_{1};\ \phi_{ss}=0 \text{ for } s\geq2\\ AR(1): a_{1}<0 & \text{Oscillating decay: } \rho_{s}=a_{1}^{s} & \phi_{11}=\rho_{1};\ \phi_{ss}=0 \text{ for } s\geq2\\ AR(p) & \text{Decays toward zero; coefficients may oscillate} & \text{Spikes through lag } p;\ \phi_{ss}=0 \text{ for } s>p\\ MA(1): \beta>0 & \text{Positive spike at lag 1; } \rho_{s}=0 \text{ for } s\geq2 & \text{Oscillating decay: } \phi_{11}>0\\ MA(1): \beta<0 & \text{Negative spike at lag 1; } \rho_{s}=0 \text{ for } s\geq2 & \text{Geometric decay: } \phi_{11}<0\\ ARMA(1,1): a_{1}>0 & \text{Geometric decay beginning after lag 1} & \text{Oscillating decay beginning after lag 1}\\ ARMA(1,1): a_{1}<0 & \text{Oscillating decay beginning after lag 1} & \text{Geometric decay beginning after lag 1}\\ ARMA(p,q) & \text{Decay (direct or oscillating) beginning after lag } q & \text{Decay (direct or oscillating) beginning after lag } p\\ \hline \end{array}\]
Sample Autocorrelations
Given that a series is stationary, we can use the sample mean, variance and autocorrelations to estimate the parameters of the actual data-generating process.
The estimates of \(\mu\), \(\sigma^{2}\) and \(\rho\):
\[\begin{eqnarray*} \overline{y}&=&\frac{\sum_{t=1}^{T}y_{t}}{T}\\ \hat{\sigma^{2}}&=&\frac{\sum_{t=1}^{T}(y_{t}-\overline{y})^{2}}{T}\\ r_{s}&=&\frac{\sum_{t=s+1}^{T}(y_{t}-\overline{y})(y_{t-s}-\overline{y})}{\sum_{t=1}^{T}(y_{t}-\overline{y})^{2}} \end{eqnarray*}\]Box and Pierce used the sample autocorrelations to form the statistic
\[\begin{eqnarray*} Q&=&T\sum_{k=1}^{s}r_{k}^{2} \end{eqnarray*}\]Under the null hypothesis that all autocorrelations up to lag \(s\) are zero (i.e., the series is white noise), \(Q\) is asymptotically distributed as \(\chi^{2}(s)\).
The Ljung-Box statistic, which has better small-sample properties, is
\[\begin{eqnarray*} Q&=&T(T+2)\sum_{k=1}^{s}\frac{r_{k}^{2}}{T-k} \sim \chi^{2}(s) \end{eqnarray*}\]
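Both \(Q\) statistics follow directly from the sample autocorrelations \(r_{k}\) defined above. A sketch in Python with NumPy and SciPy (the white-noise test series and the choice of \(s\) are illustrative; the Ljung-Box test in statsmodels, acorr_ljungbox, could be used as a cross-check):

```python
import numpy as np
from scipy.stats import chi2

def sample_autocorr(x, lag):
    """r_lag = sum_{t=lag+1}^{T} (x_t - xbar)(x_{t-lag} - xbar) / sum_{t=1}^{T} (x_t - xbar)^2."""
    xd = x - x.mean()
    return (xd[lag:] * xd[:-lag]).sum() / (xd ** 2).sum()

def box_pierce(x, s):
    T = len(x)
    r = np.array([sample_autocorr(x, k) for k in range(1, s + 1)])
    return T * np.sum(r ** 2)

def ljung_box(x, s):
    T = len(x)
    r = np.array([sample_autocorr(x, k) for k in range(1, s + 1)])
    return T * (T + 2) * np.sum(r ** 2 / (T - np.arange(1, s + 1)))

rng = np.random.default_rng(7)
x = rng.normal(size=500)                 # white noise: autocorrelations should be near 0
s = 10
Q_bp, Q_lb = box_pierce(x, s), ljung_box(x, s)
crit = chi2.ppf(0.95, df=s)              # 5% critical value of chi^2(s)

print(f"Box-Pierce Q = {Q_bp:.2f},  Ljung-Box Q = {Q_lb:.2f},  chi2 critical value = {crit:.2f}")
print("p-values:", 1 - chi2.cdf(Q_bp, s), 1 - chi2.cdf(Q_lb, s))
# Q below the critical value (large p-value) is consistent with white noise.
```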