Introduction

Panel data/longitudinal data/repeated measures are repeated observations on the same cross section observed for several time periods.

A major advantage of panel data is increased precision in estimation, which is a result of an increase in the number of observations owing to combining or pooling several time periods of data for each individual. However, for valid statistical inference one needs to control for likely correlation of regression model errors over time for a given individual.

A second attraction is the possibility of consistent estimation of the fixed effects model, which allows for unobserved individual heterogeneity that may be correlated with regressors.

Most disciplines in applied statistics other than microeconometrics treat any unobserved individual heterogeneity as being distributed independently of the regressors. Then the effects are called random effects.

Models and Estimators

Even for linear regression, standard panel data analysis uses a much wider range of models and estimators than is the case with cross-section data.

Obtaining correct standard errors of estimators is also more complicated than in the cross-section case. One needs to control for correlation over time in errors for a given individual, in addition to possible heteroskedasticity.

Panel Data Models

A very general linear model for panel data permits the intercept and slope coefficients to vary over both individual and time, with

\[\begin{eqnarray*} y_{it} & = & \alpha_{it}+x_{it}^{\prime}\beta_{it}+u_{it}, i=1,\dots,N, t=1,\dots,T\\ \end{eqnarray*}\]

where

\(y_{it}\): a scalar dependent
\(x_{it}\): a \(K\times1\) vector of independent variables
\(u_{it}\): a scalar disturbance term
\(i\): index individual in a cross section
\(t\): index time

This model is too general and is not estimable as there are more parameters to estimate than observations. Further restrictions need to be placed on the extent to which \(\alpha_{it}\) and \(\beta_{it}\) vary with \(i\) and \(t\), and on the behavior of the error \(u_{it}\).

Pooled Model

The most restrictive model is a pooled model that specifies constant coefficients, the usual assumption for cross-section analysis, so that

\[\begin{eqnarray} y_{it} & = & \alpha+x_{it}^{\prime}\beta+u_{it}\label{eq:pd1} \end{eqnarray}\]

If this model is correctly specified and regressors are uncorrelated with the error then it can be consistently estimated using pooled OLS. The error term is likely to be correlated over time for a given individual, however, in which case the usual reported standard errors should not be used as they can be greatly downward biased

Individual and Time Dummies

A simple variant of the model (\ref{eq:pd1}) permits intercepts to vary across individuals and over time while slope parameters do not.

\[\begin{eqnarray} y_{it} & = & \alpha_{i}+\gamma_{t}+x_{it}^{\prime}\beta+u_{it}\nonumber \\ & = & \sum_{j=1}^{N}\alpha_{j}d_{j,it}+\sum_{s=2}^{T}\gamma_{s}d_{s,it}+x_{it}^{\prime}\beta+u_{it}\label{eq:pd2} \end{eqnarray}\]

where

\(d_{j,it}\): \(N\) individual dummies equal one if \(i=j\) and equal zero otherwise
\(d_{s,it}\): \(\left(T-1 \right)\) time dummies equal one if \(t=s\) and equal zero otherwise

notice that there is no intercept. If so, then one of the \(N\) individual dummies must be dropped.

The model has \(N+\left(T-1\right)+dimx\) parameters that can be consistently estimated if both \(N\to\infty\) and \(T\to\infty\).

Fixed Effects and Random Effects Models

The individual-specific effects model allows each cross-sectional unit to have a different intercept term though all slopes are the same, so that

\[\begin{eqnarray} y_{it} & = & \alpha_{i}+x_{it}^{\prime}\beta+\varepsilon_{it}\label{eq:pd3} \end{eqnarray}\]

The \(\alpha_{i}\) are random variables that capture unobserved heterogeneity. We make the assumption of strong exogeneity/strict exogeneity

\[\begin{eqnarray} \mathbb{E}\left[\varepsilon_{it}\mid\alpha_{i},x_{it},\dots,x_{iT}\right] & = & 0,\quad t=1,\dots,T\label{eq:pd4} \end{eqnarray}\]

so that the error term is assumed to have mean zero conditional on past, current, and future values of regressors.

One variant of the model (\ref{eq:pd3}) treats \(\alpha_{i}\) as an unobserved random variable that is potentially correlated with the observed regressors \(x_{it}\). This variant is called the fixed effects (FE) modelas early treatments modeled these effects as parameters \(\alpha_{1},\dots,\alpha_{N}\) to be estimated.

The other variant of the model (\ref{eq:pd3}) assumes that the unobservable individual effects \(\alpha_{i}\) are random variables that are distributed independently of the regressors. This model is called the random effect (RE) model/one-way individual-specific random effects model/random intercept model, which makes the additional assumptions that

\[\begin{eqnarray} \alpha_{i} & \sim & \left[\alpha,\sigma_{\alpha}^{2}\right]\label{eq:pd5}\\ \varepsilon_{it} & \sim & \left[0,\sigma_{\varepsilon}^{2}\right]\nonumber \end{eqnarray}\]

so that both the random effects and the error term in (\ref{eq:pd3}) are assumed to be iid.

Equicorrelated Model

The RE model can be viewed as a specialization of the pooled model as the \(\alpha_{i}\) can be subsumed into the error term. Now \(u_{it}=\alpha_{i}+\varepsilon_{it}\) and (\ref{eq:pd5}) implies that

\[\begin{eqnarray} Cov\left[\left(\alpha_{i}+\varepsilon_{it}\right),\left(\alpha_{i}+\varepsilon_{is}\right)\right] & = & \begin{cases} \sigma_{\alpha}^{2} & t\neq s\\ \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} & t=s \end{cases}\label{eq:pd8} \end{eqnarray}\]

The RE imposes the constraint that \(u_{it}\) is equicorrelated, since

\[\begin{eqnarray*} Corr\left[u_{it},u_{is}\right] & = & \frac{Cov\left[u_{it},u_{is}\right]}{SD\left[u_{it}\right]SD\left[u_{is}\right]}=\frac{\sigma_{\alpha}^{2}}{\sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2}} \end{eqnarray*}\]

for \(t\neq s\) does not vary with the time difference \(t-s\).

FE versus RE Models

We use notation

\[\begin{eqnarray*} y_{it} & = & c_{i}+x_{it}^{\prime}\beta+\varepsilon_{it} \end{eqnarray*}\]

The individual effect is a random variable in both fixed and random effects models. Both models assume that

\[\begin{eqnarray*} \mathbb{E}\left[y_{it}\mid c_{i},x_{it}\right] & = & c_{i}+x_{it}^{\prime}\beta \end{eqnarray*}\]

The individual-specific effect \(c_{i}\) is unknown and in short panels cannot be consistently estimated, so we cannot estimate \(\mathbb{E}\left[y_{it}\mid c_{i},x_{it}\right]\). Then we eliminate \(c_{i}\)

\[\begin{eqnarray*} \mathbb{E}\left[y_{it}x_{it}\right] & = & \mathbb{E}\left[c_{i}x_{it}\right]+x_{it}^{\prime}\beta \end{eqnarray*}\]

For the RE model we assume \(\mathbb{E}\left[c_{i}x_{it}\right]=\alpha\), so that

\[\begin{eqnarray*} \mathbb{E}\left[y_{it}x_{it}\right] & = & \alpha+x_{it}^{\prime}\beta \end{eqnarray*}\]

and hence it is possible to identify \(\mathbb{E}\left[y_{it}x_{it}\right]\).

For FE model \(\mathbb{E}\left[c_{i}x_{it}\right]\) varies with \(x_{it}\), and it is not known how it varies, so we cannot identify \(\mathbb{E}\left[y_{it}x_{it}\right]\). However, we can estimate \(\beta\) with short panels. Thus, it is possible in the FE model to identify the marginal effect

\[\begin{eqnarray*} \beta & = & \frac{\partial\mathbb{E}\left[y_{it}\mid c_{i},x_{it}\right]}{\partial x_{it}} \end{eqnarray*}\]

even the conditional mean is not identified. For not time-varying regressors, e.g. race or gender, the marginal effect is not identified.

Panel Data Estimators

A regressors \(x_{it}\) may be either time-invariant, with \(x_{it}=x_{i}\), for \(t=1,\dots,T\), or time-varying.

Pooled OLS

The pooled OLS estimator is obtained by stacking the data over \(i\) and \(t\) into one long regression with \(NT\) observations, and estimating by OLS

\[\begin{eqnarray*} y_{it} & = & \alpha+x_{it}^{\prime}\beta+u_{it},\;i=1,\dots,N,\:t=1,\dots,T \end{eqnarray*}\]

If the model (\ref{eq:pd1}) is appropriate and \(Cov\left[u_{it},x_{it}\right]=0\) then either \(N\to\infty\) or \(T\to\infty\) is sufficient for consistency.

Lest-squares Dummy Variable Estimator

If \(N\) is not too large, an alternative and simpler way to compute thw within estimator is by LSDV estimation, which directly estimates model (\ref{eq:pd2}) by OLS regression of \(y_{it}\) on \(x_{it}\) and the \(N\) individual dummy variables and yields the within estimator for \(\beta\).

Within Estimator

Begin with the individual-specific effects model (\ref{eq:pd3}). Taking the average over time yields \(\bar{y}_{i}=\alpha_{i}+\overline{x}_{i}^{\prime}\beta+\bar{\varepsilon}_{i}\). Subtracting this from \(y_{it}\) in model (\ref{eq:pd3}) yields the within model

\[\begin{eqnarray} y_{it}-\bar{y}_{i} & = & \left(x_{it}-\bar{x}_{i}\right)^{\prime}\beta+\left(\varepsilon_{it}-\bar{\varepsilon}_{i}\right),\;i=1,\dots,N,\:t=1,\dots,T\label{eq:pd6} \end{eqnarray}\]

The within estimatoris the OLS estimator in within model (\ref{eq:pd6}) and it yields consistent estimates of \(\beta\) in the fiexed effets model.

A major limitation of within estimation is that the coefficients of time-invariant regressors are not identified since if \(x_{it}=x_{i}\), then \(\bar{x}_{i}=x_{i}\), so \(\left(x_{it}-\bar{x}_{i}\right)=0\).

Between Estimator

The between estimator in short panels instead uses just the cross-sectional variation. Averageing over all years yields \(\bar{y}_{i}=\alpha_{i}+\overline{x}_{it}^{\prime}\beta+\bar{\varepsilon}_{i}\), which can be rewritten as the between model

\[\begin{eqnarray*} \bar{y}_{i} & = & \alpha+\overline{x}_{i}^{\prime}\beta+\left(\alpha_{i}-\alpha+\bar{\varepsilon}_{i}\right) \end{eqnarray*}\]

The between estimatoris the OLS estimator from regression of \(\bar{y}_{i}\) on an intercept and \(\overline{x}_{i}^{\prime}\). It uses variation between different individuals.

The between estimator is consistent if \(\overline{x}_{i}^{\prime}\) are independent of the error \(\left(\alpha_{i}-\alpha+\bar{\varepsilon}_{i}\right)\). For FE model, however, between estimator is inconsistent as \(\alpha_{i}\) is assumed to be correlated with \(x_{it}\).

First-Differences Estimator

Lagging one period of model (\ref{eq:pd3}) yields \(y_{it-1}=\alpha_{i}+x_{it-1}^{\prime}\beta+\varepsilon_{it-1}\). Subtracting this from \(y_{it}\) yields the first-differences model

\[\begin{eqnarray} y_{it}-y_{i,t-1} & = & \left(x_{it}-x_{i,t-1}\right)^{\prime}\beta+\left(\varepsilon_{it}-\varepsilon_{i,t-1}\right),\;i=1,\dots,N,\:t=1,\dots,T\label{eq:pd7} \end{eqnarray}\]

The first-differences estimatoris the OLS estimator in (\ref{eq:pd7}). This estimator yields consistent estimates in the fixed effects model, though the coefficients of time-invariant regressors are not identified.

Panel Robust Statistical Inference

The various panel models include error terms denoted \(u_{it}\), \(\varepsilon_{it}\) and \(\alpha_{i}\). It is reasonable to assume independence over \(i\). However, the errors are potentially

serially correlated(correlated over \(t\) for given \(i\))
heteroskedastic

we need to control these factors to get valid statistical inference.

Panel Robust Sandwich Standard Errors}

The panel estimators which introduced above can be obtained by OLS estimation of \(\theta\) in the pooled regression

\[\begin{eqnarray} \widetilde{y}_{it} & = & \widetilde{w}_{it}^{\prime}\theta+\widetilde{u}_{it}\label{eq:pd9} \end{eqnarray}\]

where different panel estimators correspond to different transformations \(\widetilde{y}_{it}\), \(\widetilde{w}_{it}\), \(\widetilde{u}_{it}\), \(y_{it}\), \(w_{it}^{\prime}=\left[1\,x_{it}^{\prime}\right]\) and \(u_{it}\). The key is that \(\widetilde{y}_{it}\) is a known function of only \(y_{i1},\dots,y_{iT}\), and similarly for others.

It is convenient to stack observations over time periods for a given individuals, leading to

\[\begin{eqnarray*} \widetilde{y}_{i} & = & \widetilde{w}_{i}\theta+\widetilde{u}_{i}\\ \end{eqnarray*}\]

where

\(\widetilde{y}_{i}\): \(T\times 1\) vector
\(\widetilde{w}_{i}\): \(T\times q\)

Further stacking over the \(N\) invididuals yields

\[\begin{eqnarray*} \widetilde{y} & = & \widetilde{w}\theta+\widetilde{u} \end{eqnarray*}\]

The OLS estimator are therefore

\[\begin{eqnarray*} \widehat{\theta}_{OLS} & = & \left[\widetilde{w}^{\prime}\widetilde{w}\right]^{-1}\widetilde{w}^{\prime}\widetilde{y}\\ & = & \left[\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\widetilde{w}_{i}\right]^{-1}\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\widetilde{y}_{i}\\ & = & \left[\sum_{i=1}^{N}\sum_{t=1}^{T}\widetilde{w}_{it}\widetilde{w}_{it}^{\prime}\right]^{-1}\sum_{i=1}^{N}\sum_{t=1}^{T}\widetilde{w}_{it}\widetilde{y}_{it}\\ & where\\ \widetilde{w} & : & NT\times q\\ \widetilde{w}_{i} & : & T\times q\\ \widetilde{w}_{it} & : & 1\times q \end{eqnarray*}\]

The following matrix form equation illustrates the structure of \(\widetilde{y}=\widetilde{w}\theta+\widetilde{u}\)

\[\begin{eqnarray*} y & = & w\theta+u\\ \left[\begin{array}{c} y_{1}=\begin{cases} y_{11}\\ y_{12}\\ \vdots\\ y_{1T} \end{cases}\\ y_{2}=\begin{cases} y_{21}\\ y_{22}\\ \vdots\\ y_{2T} \end{cases}\\ \vdots\\ \vdots\\ y_{N}=\begin{cases} y_{N1}\\ y_{N2}\\ \vdots\\ y_{NT} \end{cases} \end{array}\right] & =\left[\begin{array}{c} w_{1}=\begin{cases} x_{111}\;x_{112}\cdots & x_{11q}=w_{11}^{\prime}\\ x_{121}\;x_{122}\cdots & x_{12q}=w_{12}^{\prime}\\ \vdots & \vdots\\ x_{1T1}\;x_{1T2}\cdots & x_{1Tq}=w_{1T}^{\prime} \end{cases}\\ w_{2}=\begin{cases} x_{211}\;x_{212}\cdots & x_{21q}=w_{21}^{\prime}\\ x_{221}\;x_{222}\cdots & x_{22q}=w_{22}^{\prime}\\ \vdots & \vdots\\ x_{2T1}\;x_{2T2}\cdots & x_{2Tq}=w_{2T}^{\prime} \end{cases}\\ \vdots\\ \vdots\\ w_{N}=\begin{cases} x_{N11}\;x_{N12}\cdots & x_{N1q}=w_{N1}^{\prime}\\ x_{N21}\;x_{N22}\cdots & x_{N2q}=w_{N2}^{\prime}\\ \vdots & \vdots\\ x_{NT1}\;x_{NT2}\cdots & x_{NTq}=w_{NT}^{\prime} \end{cases} \end{array}\right]\left[\begin{array}{c} \theta_{1}\\ \theta_{2}\\ \vdots\\ \theta_{q} \end{array}\right]+ & \left[\begin{array}{c} u_{1}=\begin{cases} u_{11}\\ u_{12}\\ \vdots\\ u_{1T} \end{cases}\\ u_{2}=\begin{cases} u_{21}\\ u_{22}\\ \vdots\\ u_{2T} \end{cases}\\ \vdots\\ \vdots\\ u_{N}=\begin{cases} u_{N1}\\ u_{N2}\\ \vdots\\ u_{NT} \end{cases} \end{array}\right] \end{eqnarray*}\]

Consistency

If the model is correctly specified then

\[\begin{eqnarray*} \widehat{\theta}_{OLS} & = & \theta+\left[\widetilde{w}^{\prime}\widetilde{w}\right]^{-1}\widetilde{w}^{\prime}\widetilde{u}\\ & = & \theta+\left[\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\widetilde{w}_{i}\right]^{-1}\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\widetilde{u}_{i} \end{eqnarray*}\]

Given independence over \(i\) the essential condition for onsistency is \(\mathbb{E}\left[\widetilde{w}^{\prime}\widetilde{u}\right]=0\).

The asymptotic variance of \(\widehat{\theta}_{OLS}\) is then

\[\begin{eqnarray*} Var\left[\widehat{\theta}_{OLS}\right] & = & \left[\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\widetilde{w}_{i}\right]^{-1}\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\mathbb{E}\left[\widetilde{u}_{i}\widetilde{u}_{i}^{\prime}\mid\widetilde{w}_{i}\right]\widetilde{w}_{i}\left[\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\widetilde{w}_{i}\right]^{-1} \end{eqnarray*}\]

given independence of errors over \(i\).

The panel-robust estimate of the asymptotoc variance matrix of pooled OLS estimator is

\[\begin{eqnarray*} \widehat{Var\left[\widehat{\theta}_{OLS}\right]} & = & \left[\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\widetilde{w}_{i}\right]^{-1}\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\widehat{u}_{i}\widehat{u}_{i}^{\prime}\widetilde{w}_{i}\left[\sum_{i=1}^{N}\widetilde{w}_{i}^{\prime}\widetilde{w}_{i}\right]^{-1}\\ & = & \left[\sum_{i=1}^{N}\sum_{t=1}^{T}\widetilde{w}_{it}\widetilde{w}_{it}^{\prime}\right]^{-1}\sum_{i=1}^{N}\sum_{t=1}^{T}\sum_{s=1}^{T}\widetilde{w}_{it}\widetilde{w}_{is}^{\prime}\widehat{u}_{it}\widehat{u}_{is}\left[\sum_{i=1}^{N}\sum_{t=1}^{T}\widetilde{w}_{it}\widetilde{w}_{it}^{\prime}\right]^{-1}\\ & where\\ \widehat{u}_{i} & = & \widetilde{y}_{i}-\widetilde{w}_{i}\widehat{\theta}\\ \widehat{u}_{it} & = & \widetilde{y}_{it}-\widetilde{w}_{it}^{\prime}\widehat{\theta} \end{eqnarray*}\]

Pooled Models

Fixed Effects Model

Random Effects Model

\[\begin{eqnarray*} Var\left(v\right) & = & \left[\begin{array}{ccccccccccccccc} & v_{11} & v_{12} & \cdots & v_{1T} & v_{21} & v_{22} & \cdots & v_{2T} & \cdots & \cdots & v_{n1} & v_{n2} & \cdots & v_{nT}\\ v_{11} & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} & \sigma_{\alpha}^{2} & \cdots & \sigma_{\alpha}^{2}\\ v_{12} & \sigma_{\alpha}^{2} & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} & \cdots & \sigma_{\alpha}^{2}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ v_{1T} & \sigma_{\alpha}^{2} & \sigma_{\alpha}^{2} & \cdots & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2}\\ v_{21} & & & & & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} & \sigma_{\alpha}^{2} & \cdots & \sigma_{\alpha}^{2}\\ v_{22} & & & & & \sigma_{\alpha}^{2} & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} & \cdots & \sigma_{\alpha}^{2}\\ \vdots & & & & & \vdots & \vdots & \ddots & \vdots\\ v_{2T} & & & & & \sigma_{\alpha}^{2} & \sigma_{\alpha}^{2} & \cdots & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2}\\ \vdots & & & & & & & & & \ddots\\ \vdots & & & & & & & & & & \ddots\\ v_{n1} & & & & & & & & & & & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} & \sigma_{\alpha}^{2} & \cdots & \sigma_{\alpha}^{2}\\ v_{n2} & & & & & & & & & & & \sigma_{\alpha}^{2} & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} & \cdots & \sigma_{\alpha}^{2}\\ \vdots & & & & & & & & & & & \vdots & \vdots & \ddots & \vdots\\ v_{nT} & & & & & & & & & & & \sigma_{\alpha}^{2} & \sigma_{\alpha}^{2} & \cdots & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} \end{array}\right]\\ & = & I_{T}\otimes\Omega \end{eqnarray*}\]