---
output: rmarkdown::html_vignette
bibliography: ../inst/REFERENCES.bib
abstract: |
  This introduction to the `plm` package is a modified
  and extended version of @Croissant:Millo:2008, published in the
  *Journal of Statistical Software*.

  Panel data econometrics is obviously one of the main fields
  in the statistics profession, but most of the models used are difficult to
  estimate with only plain `R`. `plm` is a package for `R` which
  intends to make the estimation of linear panel models
  straightforward. `plm` provides functions to estimate a wide
  variety of models and to make (robust) inference.
title: 'Panel data econometrics in R:'
subtitle: 'the plm package'
author:
- name: Yves Croissant
- name: Giovanni Millo
vignette: >
  %\VignetteIndexEntry{Panel data econometrics in R: the plm package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}

---

# Introduction

Panel data econometrics is a continuously developing field. The
increasing availability of data observed on cross-sections of units
(like households, firms, countries etc.) *and* over time has given
rise to a number of estimation approaches exploiting this double
dimensionality to cope with some of the typical problems associated
with economic data, first of all that of unobserved heterogeneity.

Timewise observation of data from different observational units has
long been common in other fields of statistics (where such data are
often termed *longitudinal*). In the panel data field as well as in
others, the econometric approach is nevertheless peculiar with respect
to experimental contexts, as it emphasizes model specification and
testing and tackles a number of issues arising from the particular
statistical problems associated with economic data.

Thus, while a very comprehensive software framework for (among many
other features) maximum likelihood estimation of linear regression
models for longitudinal data, packages `nlme`
[@PINH:BATE:DEBR:SARK:07] and `lme4`
[@BATE:07], is available in the `R` (@R:2008)
environment and can be used, e.g., for estimation of random effects
panel models, its use is not intuitive for a practicing
econometrician, and maximum likelihood estimation is only one of the
possible approaches to panel data econometrics. Moreover, economic
panel data sets often happen to be *unbalanced* (i.e., they have a
different number of observations between groups), a case which requires
some adaptation of the methods and is not compatible with those in
`nlme`. Hence the need for a package doing panel data "from the
econometrician's viewpoint" and featuring at a minimum the basic
techniques econometricians are used to: random and fixed
effects estimation of static linear panel data models, variable
coefficients models, generalized method of moments estimation of
dynamic models; and the basic toolbox of specification and
misspecification diagnostics.

Furthermore, we felt there was a need for automation of some basic
data management tasks such as lagging, summing and, more generally,
`apply`ing (in the `R` sense) functions to the data, which,
although conceptually simple, become cumbersome and error-prone on
two-dimensional data, especially in the case of unbalanced panels.
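
As a toy illustration of the latter point (a base-`R` sketch on made-up
data; all names here are ours and purely illustrative), a plain
one-position shift used as a "lag" lets the last observation of one
individual leak into the first period of the next, while a group-aware
lag does not:

```{r }
# Toy example: naively lagging a stacked panel series versus lagging
# within individuals.
toy <- data.frame(id = rep(c("A", "B"), each = 3), year = rep(1:3, 2), x = 1:6)
naive_lag <- c(NA, head(toy$x, -1))  # ignores the individual dimension
group_lag <- ave(toy$x, toy$id, FUN = function(v) c(NA, head(v, -1)))
cbind(toy, naive_lag, group_lag)     # row 4: naive_lag leaks A's last value into B
```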

This paper is organized as follows: Section [linear panel model](#linear-panel-model)
presents a very short overview of the typical model
taxonomy^[Comprehensive treatments are to be found in many econometrics textbooks,
e.g., @BALT:05, @BALT:13, @BALT:21 or @WOOL:02, @WOOL:10: the reader is referred to
these, especially to the first 9 chapters of @BALT:05, @BALT:13, @BALT:21.].
Section [software approach](#software-approach) discusses the
software approach used in the package. The next three sections present
the functionalities of the package in more detail: data management
(Section [managing data and formulae](#dataformula)), estimation
(Section [model estimation](#modelestimation)) and testing (Section [tests](#tests)),
giving a short description and illustrating them with
examples. Section [plm vs nlme and lme4](#nlme) compares the approach in `plm` to
that of `nlme` and `lme4`, highlighting the features of the
latter two that an econometrician might find most useful.
Section [conclusion](#conclusions) concludes the paper.


# The linear panel model{#linear-panel-model}

The basic linear panel models used in econometrics can be described
through suitable restrictions of the following general model:

\begin{equation*}
y_{it}=\alpha_{it} + \beta_{it}^\top x_{it} + u_{it}
\end{equation*}

where $i=1, ..., n$ is the individual (group, country ...) index,
$t=1, ..., T$ is the time index and $u_{it}$ a random disturbance
term of mean $0$.

Of course $u_{it}$ is not estimable with $N = n \times T$ data
points. A number of assumptions are usually made about the parameters,
the errors and the exogeneity of the regressors, giving rise to a
taxonomy of feasible models for panel data.

The most common one is parameter homogeneity, which means that
$\alpha_{it}=\alpha$ for all $i,t$ and $\beta_{it}=\beta$ for all
$i,t$. The resulting model

\begin{equation*}
y_{it}=\alpha + \beta^\top x_{it} + u_{it}
\end{equation*}

is a standard linear model pooling all the data across $i$ and $t$.

To model individual heterogeneity, one often assumes that the error
term has two separate components, one of which is specific to the
individual and doesn't change over time^[For the sake of exposition we are
considering only the individual effects case here. There may also be time
effects, which is a symmetric case, or both of them, so that the error has
three components: $u_{it}=\mu_{i}+\lambda_{t}+\epsilon_{it}$.].
This is called the unobserved effects model:

\begin{equation}
(\#eq:errcomp)
y_{it}=\alpha + \beta^\top x_{it} + \mu_i + \epsilon_{it}
\end{equation}

The appropriate estimation method for this model depends on the
properties of the two error components. The idiosyncratic error
$\epsilon_{it}$ is usually assumed well-behaved and independent of
both the regressors $x_{it}$ and the individual error component
$\mu_i$. The individual component may in turn be either independent of
the regressors or correlated with them.

If it is correlated, the ordinary least squares (OLS) estimator of
$\beta$ would be inconsistent, so it is customary to treat the $\mu_i$
as a further set of $n$ parameters to be estimated, as if in the
general model $\alpha_{it}=\alpha_{i}$ for all $t$. This is called the
fixed effects (a.k.a.
*within* or *least squares dummy variables*)
model, usually estimated by OLS on transformed data, and gives
consistent estimates for $\beta$.

If the individual-specific component $\mu_i$ is uncorrelated with the
regressors, a situation which is usually termed *random effects*, the
overall error $u_{it}$ also is, so the OLS estimator is
consistent. Nevertheless, the common error component over individuals
induces correlation across the composite error terms, making OLS
estimation inefficient, so one has to resort to some form of feasible
generalized least squares (GLS) estimation. This is based on the
estimation of the variance of the two error components, for which
there are a number of different procedures available.

If the individual component is missing altogether, pooled OLS is the
most efficient estimator for $\beta$. This set of assumptions is
usually labelled the *pooling* model, although this actually refers to the
errors' properties and the appropriate estimation method rather than
the model itself. If one relaxes the usual hypotheses of
well-behaved, white noise errors and allows for the idiosyncratic
error $\epsilon_{it}$ to be arbitrarily heteroskedastic and serially
correlated over time, a more general kind of feasible GLS is needed,
called the *unrestricted* or *general* GLS. This specification can
also be augmented with individual-specific error components possibly
correlated with the regressors, in which case it is termed *fixed
effects* GLS.

Another way of estimating unobserved effects models is to remove the
time-invariant individual components by first-differencing the
data: lagging the model and subtracting, the time-invariant components
(the intercept and the individual error component) are eliminated, and
the model

\begin{equation*}
\Delta y_{it}= \beta^\top \Delta x_{it} + \Delta u_{it}
\end{equation*}

(where $\Delta y_{it}=y_{it}-y_{i,t-1}$, $\Delta
x_{it}=x_{it}-x_{i,t-1}$ and, from \@ref(eq:errcomp), $\Delta
u_{it}=u_{it}-u_{i,t-1}=\Delta \epsilon_{it}$ for $t=2,...,T$) can be
consistently estimated by pooled OLS. This is called the *first-difference*
or FD estimator. Its relative efficiency, and hence the reasons for choosing it
over other consistent alternatives, depends on the properties of the error
term. The FD estimator is usually preferred if the errors $u_{it}$ are
strongly persistent in time, because then the $\Delta u_{it}$ will
tend to be serially uncorrelated.

Lastly, the *between* model, which is computed on time (group)
averages of the data, discards all the information due to intragroup
variability but is consistent in some settings (e.g., non-stationarity)
where the others are not, and is often preferred for estimating long-run
relationships.

Variable coefficients models relax the assumption that
$\beta_{it}=\beta$ for all $i,t$. Fixed coefficients models allow the
coefficients to vary along one dimension, like $\beta_{it}=\beta_i$
for all $t$. Random coefficients models instead assume that
coefficients vary randomly around a common average, as
$\beta_{it}=\beta+\eta_{i}$ for all $t$, where $\eta_{i}$ is a group--
(time--) specific effect with mean zero.


<!--
The hypotheses on parameters and error terms (and hence the choice of
the most appropriate estimator) are usually tested by means of: *pooling*
tests to check poolability, i.e., the hypothesis that the
same coefficients apply across all individuals; and random effects
tests, comparing the null of spherical residuals with the alternative
of group (time) specific effects in the error term. If the homogeneity
assumption over the $\beta$s is established, the next step is to
establish the presence of unobserved effects. The choice between fixed
and random effects specifications is based on Hausman-type tests,
comparing the two estimators under the null of no significant
difference: if this is not rejected, the more efficient random effects
estimator is chosen. Even after this step, departures of the error
structure from sphericity can further affect inference, so that either
screening tests or robust diagnostics are needed.
-->


The hypotheses on parameters and error terms (and hence the choice of
the most appropriate estimator) are usually tested by means of:

- *pooling* tests to check poolability, i.e., the hypothesis
  that the same coefficients apply across all individuals,
- if the homogeneity assumption over the coefficients is
  established, the next step is to establish the presence of
  unobserved effects, comparing the null of spherical residuals with
  the alternative of group (time) specific effects in the error term,
- the choice between fixed and random effects specifications is
  based on Hausman-type tests, comparing the two estimators under the
  null of no significant difference: if this is not rejected, the more
  efficient random effects estimator is chosen,
- even after this step, departures of the error structure from
  sphericity can further affect inference, so that either screening
  tests or robust diagnostics are needed.


<!--
Dynamic models in a fixed or random effects setting, and in general
lack of strict exogeneity of the regressors, pose further problems to
estimation which are usually dealt with in the generalized method of
moments (GMM) framework.
-->

Dynamic models, and in general a lack of strict exogeneity of the
regressors, pose further problems to estimation, which are usually
dealt with in the generalized method of moments (GMM)
framework.

These were, in our opinion, the basic requirements of a panel data
econometrics package for the `R` language and
environment. Some, as often happens with `R`, were already
fulfilled by packages developed for other branches of computational
statistics, while others (like the fixed effects or the between
estimators) were straightforward to compute after transforming the
data. But in every case there were either language inconsistencies
w.r.t. the standard econometric toolbox or subtleties to be dealt with
(like, for example, the appropriate computation of standard errors for the
demeaned model, a common pitfall), so we felt there was need for an
"all in one" econometrics-oriented package allowing the user to perform
specification searches, estimation and inference in a natural way.

# Software approach{#software-approach}

## Data structure

Panel data have a special structure: each row of the data corresponds
to a specific individual and time period.
In `plm` the `data`
argument may be an ordinary `data.frame` but, in this case, an
argument called `index` has to be added to indicate the structure
of the data. This can be:

- `NULL` (the default value): it is then assumed
  that the first two columns contain the individual and the time
  index and that observations are ordered by individual and by time period,
- a character string, which should be the name of the individual
  index,
- a character vector of length two containing the names of the
  individual and the time index,
- an integer which is the number of individuals (only in case of a
  balanced panel with observations ordered by individual).

The `pdata.frame` function is then called internally; it returns a
`pdata.frame`, which is a `data.frame` with an attribute called
index. This attribute is a `data.frame` that contains the individual
and the time indexes.

It is also possible to use the `pdata.frame` function directly
and then to use the resulting `pdata.frame` in the estimation functions.

## Interface

### Estimation interface

Package `plm` provides various functions for panel data estimation, among them:

- `plm`: estimation of the
  basic panel models and instrumental variable panel models, *i.e.*, between and
  first-difference models and within and random effects models. Models are
  estimated internally using the `lm` function on transformed data,
- `pvcm`: estimation of models with variable coefficients,
- `pgmm`: estimation of generalized method of moments models,
- `pggls`: estimation of general feasible generalized least squares
  models,
- `pmg`: estimators for mean groups (MG), demeaned MG (DMG) and common
  correlated effects MG (CCEMG) for heterogeneous panel models,
- `pcce`: estimators for common correlated effects mean groups (CCEMG) and
  pooled (CCEP) for panel data with common factors,
- `pldv`: panel estimators for limited dependent variables.

The interface of these functions is consistent with the `lm()`
function. Namely, their first two arguments are `formula` and
`data` (which should be a `data.frame` and is
mandatory). Three additional arguments are common to these functions:

- `index`: this argument enables the estimation functions
  to identify the structure of the data, *i.e.*, the individual
  and the time period for each observation,
- `effect`: the kind of effects to include in the model,
  *i.e.*, individual effects, time effects or both^[Although in most models the
  individual and time effects cases are symmetric, there are exceptions:
  estimating the *first-difference* model on time effects is meaningless
  because cross-sections do not generally have a natural ordering, so trying
  `effect = "time"` stops with an error message as does `effect = "twoways"`
  which is not defined for first-difference models.],
- `model`: the kind of model to be estimated, most of the
  time a model with fixed effects or a model with random effects.

The results of these functions are stored in an object whose class
has the same name as the function. They all inherit from class
`panelmodel`. A `panelmodel` object contains
`coefficients`, `residuals`, `fitted.values`,
`vcov`, `df.residual` and `call`, and functions
that extract these elements are provided.
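
As a minimal illustration of these extractors (a sketch, not run here:
it anticipates the `plm()` call and the `Grunfeld` data introduced in
later sections):

```{r eval=FALSE}
# Hedged sketch: standard extractors on a panelmodel object.
data("Grunfeld", package = "plm")
gw <- plm(inv ~ value + capital, data = Grunfeld, model = "within")
coef(gw)             # estimated coefficients
head(residuals(gw))  # residuals, carrying the panel index
vcov(gw)             # conventional covariance matrix of the coefficients
df.residual(gw)      # residual degrees of freedom
```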

<!-- TODO: currently, plm objects do not contain fitted.values -->


### Testing interface

The diagnostic testing interface provides both `formula` and
`panelmodel` methods for most functions, with some exceptions.
The user may thus choose whether to employ results stored in a
previously estimated `panelmodel` object or to re-estimate it
for the sake of testing.

Although the first strategy is the most efficient one, diagnostic testing
on panel models mostly employs OLS residuals from pooling
model objects, whose estimation is computationally inexpensive.
Therefore most examples in the following are based on `formula`
methods, which are perhaps the cleanest for illustrative purposes.

<!--
In some cases where tests are to be done on model objects that may be, in
computational terms, more demanding to estimate (and that are also
likely to be already present in the workspace at that stage), an
exception is made to the rule and tests require a `panelmodel`
object argument. This is the case of Hausman tests and of
Wooldridge-type serial correlation tests.
-->

## Computational approach to estimation

The feasible GLS methods needed for efficient estimation of
unobserved effects models have a simple closed-form solution: once the
variance components, and hence the covariance matrix of errors
$\hat{V}$, have been estimated, model parameters can be estimated as

\begin{equation}
(\#eq:naive)
\hat{\beta}=(X^\top \hat{V}^{-1} X)^{-1} (X^\top \hat{V}^{-1} y)
\end{equation}

Nevertheless, in practice plain computation of $\hat{\beta}$ has long
been an intractable problem even for moderate-sized data sets because
of the need to invert the $N\times N$ matrix $\hat{V}$. With the
advances in computer power, this is no longer the case, and it is possible to
program the "naive" estimator \@ref(eq:naive) in `R` with standard
matrix algebra operators and have it working seamlessly for the
standard "guinea pigs", e.g., the Grunfeld data. Estimation with a
couple of thousand data points also becomes feasible on a modern
machine, although excruciatingly slow and definitely not suitable for
everyday econometric practice. Memory limits would also be close at hand
because of the storage needs related to the huge $\hat{V}$ matrix. An
established solution exists for the random effects model which reduces
the problem to an ordinary least squares computation.

### The (quasi--)demeaning framework

The estimation methods for the basic models in panel data
econometrics, the pooled OLS, random effects and fixed effects (or
within) models, can all be described inside the OLS estimation
framework. In fact, while pooled OLS simply pools data, the standard
way of estimating fixed effects models with, say, group (time) effects
entails transforming the data by subtracting the average over time
(group) from every variable, which is usually termed *time-demeaning*.
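
As a quick illustration of time-demeaning (a sketch using only base `R`
and the `Grunfeld` data shipped with the package; `demean` is a helper
of ours, not a `plm` function):

```{r }
# Sketch: the within (fixed effects) estimator as OLS on time-demeaned
# data. The coefficients match plm's within estimator; the naive lm()
# standard errors would not, as the degrees of freedom absorbed by the
# individual means are not accounted for (the "common pitfall" above).
data("Grunfeld", package = "plm")
demean <- function(x, i) x - ave(x, i)   # subtract individual means
lm(demean(inv, firm) ~ 0 + demean(value, firm) + demean(capital, firm),
   data = Grunfeld)
```
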
In the random effects case, the various feasible
GLS estimators which have been put forth to tackle the issue of serial
correlation induced by the group-invariant random effect have been
proven to be equivalent (as far as estimation of the $\beta$s is
concerned) to OLS on *partially demeaned* data, where partial
demeaning is defined as:

\begin{equation}
(\#eq:ldemmodel)
y_{it} - \theta \bar{y}_i = ( X_{it} - \theta \bar{X}_{i} ) \beta + ( u_{it} - \theta \bar{u}_i )
\end{equation}

where $\theta=1-[\sigma_u^2 / (\sigma_u^2 + T \sigma_e^2)]^{1/2}$,
$\bar{y}$ and $\bar{X}$ denote time means of $y$ and $X$, and the
disturbance $u_{it} - \theta \bar{u}_i$ is homoskedastic and serially
uncorrelated. Thus the feasible RE estimate for $\beta$ may be
obtained by estimating $\hat{\theta}$ and running an OLS
regression on the transformed data with `lm()`. The other
estimators can be computed as special cases: for $\theta=1$ one gets
the fixed effects estimator, for $\theta=0$ the pooled OLS
one.

Moreover, instrumental variable estimators of all these models may also
be obtained using several calls to `lm()`.

For this reason the three above estimators have been grouped inside
the same function.

On the output side, a number of diagnostics and a very general
coefficients' covariance matrix estimator also benefit from this
framework, as they can be readily calculated by applying the standard
OLS formulas to the demeaned data, which are contained inside
`plm` objects. This will be the subject of
subsection [inference in the panel model](#inference).

### The object oriented approach to general GLS computations

The covariance matrix of errors in general GLS models is too
generic to fit the quasi-demeaning framework, so this method calls for
a full-blown application of GLS as in \@ref(eq:naive). On the
other hand, this estimator relies heavily on $n$--asymptotics, making it
theoretically most suitable for situations which forbid it
computationally: e.g., "short" micropanels with thousands of
individuals observed over few time periods.

`R` has general facilities for fast matrix computation based on object
orientation: particular types of matrices (symmetric, sparse, dense
etc.) are assigned the relevant class, and the additional information
on structure is used in the computations, sometimes with dramatic
effects on performance (see @BATE:04) and packages `Matrix` (see
@BATE:MAEC:2016) and `SparseM` (see @KOEN:NG:2016). Some optimized
linear algebra routines are available in the `R` package `bdsmatrix`
(see @THER:14) which exploit the particular block-diagonal and
symmetric structure of $\hat{V}$, making it possible to implement a
fast and reliable full-matrix solution to problems of any practically
relevant size.

The $\hat{V}$ matrix is constructed as an object of class
`bdsmatrix`. The peculiar properties of this matrix class are used for
efficiently storing the object in memory and then by ad-hoc versions
of the `solve` and `crossprod` methods, dramatically reducing
computing times and memory usage. The resulting matrix is then used
"the naive way" as in \@ref(eq:naive) to compute $\hat{\beta}$,
resulting in speed comparable to that of the demeaning solution.
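
Schematically, the "naive" computation in \@ref(eq:naive) amounts to
the following sketch, where `X`, `y` and `V.hat` are hypothetical
placeholders for the design matrix, the response and the estimated
error covariance matrix (with `V.hat` a `bdsmatrix`, `solve()`
dispatches to the specialized method):

```{r eval=FALSE}
# Not run: schematic "naive" GLS as in equation (naive); X, y and V.hat
# are hypothetical objects, not part of the package's interface.
Vinv     <- solve(V.hat)            # specialized method if V.hat is a bdsmatrix
beta.hat <- solve(t(X) %*% Vinv %*% X,
                  t(X) %*% Vinv %*% y)
```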

## Inference in the panel model{#inference}

General frameworks for restrictions and linear hypotheses testing are
available in the `R` environment^[See packages `lmtest`
(@HOTH:ZEIL:FARE:CUMM:MILL:MITC:2015) and `car` (@FOX:2016).].
These are based on the Wald test, constructed as $\hat{\beta}^\top \hat{V}^{-1}
\hat{\beta}$, where $\hat{\beta}$ and $\hat{V}$ are consistent
estimates of $\beta$ and $V(\beta)$. The Wald test may be used for
zero-restriction (i.e., significance) testing and, more generally, for
linear hypotheses in the form $(R \hat{\beta} - r)^\top [R \hat{V} R^\top ]^{-1} (R
\hat{\beta} - r)$^[Moreover, `coeftest()` provides a compact way of looking at
coefficient estimates and significance diagnostics.].
To be applicable, the test functions require
extractor methods for coefficients' and covariance matrix estimates to
be defined for the model object to be tested. Model objects in
`plm` all have `coef()` and `vcov()` methods and are
therefore compatible with the above functions.

In the same framework, robust inference is accomplished by substituting
("plugging in") a robust estimate of the coefficient covariance
matrix into the Wald statistic formula. In the panel context, the
estimator of choice is the White system estimator. This called for a
flexible method for computing robust coefficient covariance matrices
*à la White* for `plm` objects.

A general White system estimator for panel data is:

\begin{equation*}
\hat{V}_R(\beta)=(X^\top X)^{-1} \sum_{i=1}^n{X_i^\top E_i X_i} (X^\top X)^{-1}
\end{equation*}

where $E_i$ is a function of the residuals $\hat{e}_{it}, \; t=1,
\dots, T$ chosen according to the relevant heteroskedasticity and
correlation structure. Moreover, it turns out that the White
covariance matrix calculated on the demeaned model's regressors and
residuals (both part of `plm` objects) is a consistent estimator
of the relevant model's parameters' covariance matrix, thus the method
is readily applicable to models estimated by random or fixed effects,
first difference or pooled OLS methods. Different
pre-weighting schemes taken from package `sandwich`
(see @ZEIL:04; @LUML:ZEIL:2015) are also implemented to improve small-sample
performance. Robust estimators with any combination of covariance
structures and weighting schemes can be passed on to the testing
functions.

# Managing data and formulae{#dataformula}

The package is now illustrated by application to some well-known
examples. It is loaded using

```{r echo=FALSE,results='hide'}
options(prompt= "R> ", useFancyQuotes = FALSE, scipen = 999)
library("knitr")
opts_chunk$set(message = FALSE, warning = FALSE)
```

```{r echo=TRUE, results='hide'}
library("plm")
```

The four data sets used are `EmplUK` which was used by
@AREL:BOND:91, the `Grunfeld` data [@KLEI:ZEIL:08]
which is used in several econometric books, the `Produc` data
used by @MUNN:90 and the `Wages` data used by @CORN:RUPE:88.

```{r }
data("EmplUK", package="plm")
data("Produc", package="plm")
data("Grunfeld", package="plm")
data("Wages", package="plm")
```

## Data structure

As observed above, the current version of `plm` is capable of
working with a regular `data.frame` without any further
transformation, provided that the individual and time indexes are in
the first two columns, as in all the example data sets but `Wages`.
If this weren't the case, an `index` optional argument would
have to be passed on to the estimating and testing functions.

```{r setdata1}
head(Grunfeld)
E <- pdata.frame(EmplUK, index=c("firm","year"), drop.index=TRUE, row.names=TRUE)
head(E)
head(attr(E, "index"))
```

Two further arguments are logical: `drop.index = TRUE` drops the
indexes from the `data.frame` and `row.names = TRUE` computes
"fancy" row names by pasting the individual and the time
indexes. When a series is extracted from a `pdata.frame`, a
`pseries` is created, which is the original series with the
index attribute. This object has specific methods, like
`summary` and `as.matrix`. The former
indicates the total variation of the variable and the shares of this
variation due to the individual and the time dimensions. The
latter gives the matrix representation of the series, with, by default,
individuals as rows and times as columns.

```{r }
summary(E$emp)
head(as.matrix(E$emp))
```

## Data transformation

Panel data estimation requires applying different transformations to
raw series. If $x$ is a series of length $nT$ (where $n$ is the number
of individuals and $T$ is the number of time periods), the transformed
series $\tilde{x}$ is obtained as $\tilde{x}=Mx$ where $M$ is a
transformation matrix. Denoting by $j$ a vector of ones of length $T$ and
by $I_n$ the identity matrix of dimension $n$, we get:

- the between transformation: $P=\frac{1}{T}I_n\otimes jj^\top$
  returns a vector containing the individual means. The
  `Between` and `between` functions perform this
  operation, the first one returning a vector of length $nT$, the
  second one a vector of length $n$,
- the within transformation: $Q=I_{nT}-P$ returns a vector
  containing the values in deviation from the individual means. The
  `Within` function performs this operation,
- the first difference transformation $D=I_n \otimes d$ where

$d=\left(
\begin{array}{ccccccc}
1 & -1 & 0 & 0 & ... & 0 & 0 \\
0 & 1 & -1 & 0 & ... & 0 & 0 \\
0 & 0 & 1 & -1 & ... & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & ... & 1 & -1 \\
\end{array}
\right)$

is of dimension $(T-1,T)$.

Note that `R`'s `diff()` and `lag()` functions
don't compute these transformations correctly for panel data because
they are unable to identify when there is a change in individual in
the data. Therefore, specific methods for `pseries` objects have
been written in order to handle panel data correctly. Note that,
compared to the `lag()` method for `ts` objects, the order
of lags is indicated by a positive integer. Moreover, 0 is a relevant
value and a vector argument may be provided:

```{r }
head(lag(E$emp, 0:2))
```

Further functions called `Between`, `between` and
`Within` are also provided to compute the between and the within
transformation.
The `between` function returns unique values, whereas
`Between` duplicates the values and returns a vector whose
length is the number of observations.

```{r }
head(diff(E$emp), 10)
head(lag(E$emp, 2), 10)
head(Within(E$emp))
head(between(E$emp), 4)
head(Between(E$emp), 10)
```

## Formulas

In some circumstances, standard `formula`s are not very
useful to describe a model, notably while using instrumental variable
like estimators: to deal with these situations, we use the
`Formula` package.

The `Formula` package provides a class which enables the construction of
multi-part formulas, each part being separated by a pipe sign (`|`).

The two formulas below are identical:

```{r results='hide'}
emp ~ wage + capital | lag(wage, 1) + capital
emp ~ wage + capital | . -wage + lag(wage, 1)
```

In the second case, the `.` means the previous part which
describes the covariates, and this part is "updated". This is
particularly interesting when there are a few external instruments.

# Model estimation{#modelestimation}

## Estimation of the basic models with plm

Several models can be estimated with `plm` by setting the
`model` argument:

- the fixed effects model (`"within"`), the default,
- the pooling model (`"pooling"`),
- the first-difference model (`"fd"`),
- the between model (`"between"`),
- the error components model (`"random"`).

The basic use of `plm` is to indicate the model formula, the
data and the model to be estimated. For example, the fixed effects
model and the random effects model are estimated using:

```{r fe_re}
grun.fe <- plm(inv~value+capital, data = Grunfeld, model = "within")
grun.re <- plm(inv~value+capital, data = Grunfeld, model = "random")
```

Methods to display a summary of the model estimation are available via `summary`.
For example, for a `random` model, the `summary` method gives information
about the variance of the components of the errors and some test statistics.
Random effects of the estimated model can be extracted via `ranef`.

```{r summary_re}
summary(grun.re)
ranef(grun.re)
```

The fixed effects of a fixed effects model may be extracted easily using `fixef`.
An argument `type` indicates how fixed effects should be computed: in levels by
`type = "level"` (the default), in deviations from the overall mean by
`type = "dmean"` or in deviations from the first individual by
`type = "dfirst"`.

```{r }
fixef(grun.fe, type = "dmean")
```

The `fixef` function returns an object of class `fixef`. A
summary method is provided, which prints the effects (in deviation
from the overall intercept), their standard
errors and the test of equality to the overall intercept.

```{r }
summary(fixef(grun.fe, type = "dmean"))
```

In case of a two-ways fixed effects model, the argument `effect` of function
`fixef` selects which fixed effects to extract, with possible values `"individual"`
for individual fixed effects (the default for two-ways fixed effects models), `"time"`
for time fixed effects, and `"twoways"` for the sum of individual and time fixed
effects.
An example of extracting the time fixed effects from a two-ways model:

```{r }
grun.twfe <- plm(inv~value+capital, data=Grunfeld, model="within", effect="twoways")
fixef(grun.twfe, effect = "time")
```

## More advanced use of plm

### Random effects estimators

As observed above, the random effects model is obtained as a linear
estimation on quasi-demeaned data. The parameter of this
transformation is obtained using preliminary estimations.

Several estimators of this parameter are available, depending on the
value of the argument `random.method`:

- `"swar"`: from @SWAM:AROR:72, the default value,
- `"walhus"`: from @WALL:HUSS:69,
- `"amemiya"`: from @AMEM:71,
- `"nerlove"`: from @NERLO:71,
- `"ht"`: for Hausman-Taylor-type instrumental variable (IV) estimation,
  discussed later, see Section [Instrumental variable estimator](#instrumental-variable-est).

For example, to use the `amemiya` estimator:

```{r }
grun.amem <- plm(inv~value+capital, data=Grunfeld,
                 model="random", random.method="amemiya")
```

The estimation of the variance of the error components is performed
by the `ercomp` function, which has a `method` and an
`effect` argument, and can be used by itself:

```{r }
ercomp(inv~value+capital, data=Grunfeld, method = "amemiya", effect = "twoways")
```

### Introducing time or two-ways effects

The default behavior of `plm` is to introduce individual
effects. Using the `effect` argument, one may also introduce:

- time effects (`effect = "time"`),
- individual and time effects (`effect = "twoways"`).

For example, to estimate a two-ways effect model for the
`Grunfeld` data:

```{r 2RE-amemiya}
grun.tways <- plm(inv~value+capital, data = Grunfeld, effect = "twoways",
                  model = "random", random.method = "amemiya")
summary(grun.tways)
```

In the "effects" section of the printed summary of the result, the variance
of the three elements of the error term and the three parameters used in the
transformation are printed.

### Unbalanced panels

Estimations by `plm` support unbalanced panel models.

The following example uses data from @HARR:RUBI:78 to
estimate a hedonic housing price function. It is reproduced in
@BALT:CHAN:94, table 2 (and in @BALT:05, pp. 172/4; @BALT:13, pp. 195/7,
tables 9.1/3).

```{r hedonic}
data("Hedonic", package = "plm")
Hed <- plm(mv~crim+zn+indus+chas+nox+rm+age+dis+rad+tax+ptratio+blacks+lstat,
           data = Hedonic, model = "random", index = "townid")
summary(Hed)
```

Measures for the unbalancedness of a panel data set, or of the data used in
estimated models, are provided by the function `punbalancedness`. It gives
the measures $\gamma$ and $\nu$ from @AHRE:PINC:81, where for both a value
of 1 represents balanced data, and the more unbalanced the data, the lower
the value.

```{r hedonic-punbal}
punbalancedness(Hed)
```

### Instrumental variable estimators{#instrumental-variable-est}

All of the models presented above may be estimated using instrumental
variables. The instruments are specified at the end of the formula
after a `|` sign (pipe).

The instrumental variables estimator used is indicated with the
`inst.method` argument:

- `"bvk"`, from @BALE:VARA:87, the default value,
- `"baltagi"`, from @BALT:81,
- `"am"`, from @AMEM:MACU:86,
- `"bms"`, from @BREU:MIZO:SCHM:89.

An illustration is in the following example from
@BALT:05, p. 120; @BALT:13, p. 137; @BALT:21, p. 165, table 7.3 ("G2SLS").

```{r G2SLS}
data("Crime", package = "plm")
cr <- plm(lcrmrte ~ lprbarr + lpolpc + lprbconv + lprbpris + lavgsen +
            ldensity + lwcon + lwtuc + lwtrd + lwfir + lwser + lwmfg + lwfed +
            lwsta + lwloc + lpctymle + lpctmin + region + smsa + factor(year)
          | . - lprbarr - lpolpc + ltaxpc + lmix,
          data = Crime, model = "random")
summary(cr)
```

The Hausman-Taylor model (see @HAUS:TAYL:81) may be estimated
with the `plm`^[Function `pht` is a deprecated way to estimate this type of model:
`ht <- pht(lwage~wks+south+smsa+married+exp+I(exp^2)+ bluecol+ind+union+sex+black+ed
| sex+black+bluecol+south+smsa+ind, data=Wages,index=595)`.]
function by setting parameters `random.method = "ht"` and
`inst.method = "baltagi"` like in
the example below. The following replicates @BALT:05, pp. 129/30;
@BALT:13, pp. 145/6, tables 7.4/5; @BALT:21, pp. 174/5, tables 7.5/6:

```{r hausman-taylor}
ht <- plm(lwage ~ wks + south + smsa + married + exp + I(exp ^ 2) +
            bluecol + ind + union + sex + black + ed |
            bluecol + south + smsa + ind + sex + black |
            wks + married + union + exp + I(exp ^ 2),
          data = Wages, index = 595,
          model = "random", random.method = "ht", inst.method = "baltagi")
summary(ht)
```

## Variable coefficients model

The `pvcm` function enables the estimation of variable
coefficients models. Time or individual effects are introduced if the argument
`effect` is set to `"time"` or `"individual"` (the default value).

Coefficients are assumed to be fixed if `model="within"` and
random if `model="random"`. In the first case, a different
model is estimated for each individual (or time period). In the second
case, the Swamy model (see @SWAM:70) is estimated. It is a generalized
least squares model which uses the results of the previous
model. Denoting by $\hat{\beta}_i$ the vectors of coefficients obtained
for each individual, we get:

\begin{equation*}
\hat{\beta}=\left(\sum_{i=1}^n \left(\hat{\Delta}+\hat{\sigma}_i^2(X_i^\top X_i)^{-1}\right)^{-1}\right)^{-1}\sum_{i=1}^n\left(\hat{\Delta}+\hat{\sigma}_i^2(X_i^\top X_i)^{-1}\right)^{-1}\hat{\beta}_i
\end{equation*}

where $\hat{\sigma}_i^2$ is the unbiased estimator of the variance of
the errors for individual $i$ obtained from the preliminary estimation
and:

\begin{equation*}
\hat{\Delta}=\frac{1}{n-1}\sum_{i=1}^n\left(\hat{\beta}_i-\frac{1}{n}\sum_{i=1}^n\hat{\beta}_i\right)
\left(\hat{\beta}_i-\frac{1}{n}\sum_{i=1}^n\hat{\beta}_i\right)^\top -\frac{1}{n}\sum_{i=1}^n\hat{\sigma}_i^2(X_i^\top X_i)^{-1}
\end{equation*}

If this matrix is not positive-definite, the second term is dropped.

With the `Grunfeld` data, we get:

```{r grunfeld.within}
grun.varw <- pvcm(inv~value+capital, data=Grunfeld, model="within")
grun.varr <- pvcm(inv~value+capital, data=Grunfeld, model="random")
summary(grun.varr)
```

## Generalized method of moments estimator

The generalized method of moments is mainly used in panel data
econometrics to estimate dynamic models [@AREL:BOND:91; @HOLT:NEWE:ROSE:88].

\begin{equation*}
y_{it}=\rho y_{it-1}+\beta^\top x_{it}+\mu_i+\epsilon_{it}
\end{equation*}

The model is first differenced to get rid of the individual effect:

\begin{equation*}
\Delta y_{it}=\rho \Delta y_{it-1}+\beta^\top \Delta x_{it}+\Delta \epsilon_{it}
\end{equation*}

Least squares is inconsistent because $\Delta \epsilon_{it}$ is
correlated with $\Delta y_{it-1}$. $y_{it-2}$ is a valid, but weak
instrument (see @ANDE:HSIA:81). The GMM estimator uses the fact that
the number of valid instruments grows with $t$:

- $t=3$: $y_1$,
- $t=4$: $y_1,y_2$,
- $t=5$: $y_1,y_2,y_3$.

For individual $i$, the matrix of instruments is then:

\begin{equation*}
W_i=\left(
\begin{array}{cccccccccccc}
y_1 & 0 & 0 & 0 & 0 & 0 & ... & 0 & 0 & 0 & 0 & x_{i3} \\
0 & y_1 & y_2 & 0 & 0 & 0 & ... & 0 & 0 & 0 & 0 & x_{i4} \\
0 & 0 & 0 & y_1 & y_2 & y_3 & ... & 0 & 0 & 0 & 0 & x_{i5} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & ... & ... & ... & y_1 & y_2 & ... & y_{T-2} & x_{iT} \\
\end{array}
\right)
\end{equation*}

The moment conditions are: $\sum_{i=1}^n W_i^\top e_i(\beta)$ where
$e_i(\beta)$ is the vector of residuals for individual $i$. The GMM
estimator minimizes:

\begin{equation*}
\left(\sum_{i=1}^n e_i(\beta)^\top W_i\right) A \left(\sum_{i=1}^n W_i^\top e_i(\beta)\right)
\end{equation*}

where $A$ is the weighting matrix of the moments.

One-step estimators are computed using a known weighting matrix. For
the model in first differences, one uses:

\begin{equation*}
A^{(1)}=\left(\sum_{i=1}^n W_i^\top H^{(1)}W_i\right)^{-1}
\end{equation*}

with:

\begin{equation*}
H^{(1)}=d^\top d=\left(
\begin{array}{ccccc}
2 & -1 & 0 & ... & 0\\
-1 & 2 & -1 & ... & 0\\
0 & -1 & 2 & ... & 0\\
\vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & -1 & 2\\
\end{array}
\right)
\end{equation*}

Two-step estimators are obtained using $H^{(2)}_i= e^{(1)}_i e^{(1)\top}_i$,
where $e_i^{(1)}$ are the residuals of the one-step estimate.

@BLUN:BOND:98 show that, under weak hypotheses on the data
generating process, supplementary moment conditions exist for the
equation in levels:

$$
y_{it} = \gamma y_{it-1}+\mu_i+\eta_{it}
$$

More precisely, they show that $\Delta y_{it-2}=y_{it-2}-y_{it-3}$ is
a valid instrument. The estimator is obtained using the residual
vector in difference and in level:

$$
e^+_i=(\Delta e_i, e_i)
$$

and the matrix of augmented moments:

$$
Z_i^+=\left(
\begin{array}{ccccc}
Z_i & 0 & 0 & ... & 0 \\
0 & \Delta y_{i2} & 0 & ... & 0 \\
0 & 0 & \Delta y_{i3} & ... & 0 \\
0 & 0 & 0 & ... & \Delta y_{iT-1}
\end{array}
\right)
$$

The moment conditions are then

\begin{eqnarray*}
\left(\sum_{i=1}^n Z_i^{+\top} \left(\begin{array}{c}\bar{e}_i(\beta)\\ e_i(\beta)\end{array}\right)\right)^\top = \left(\sum_{i=1}^n y_{i1} \bar{e}_{i3},\sum_{i=1}^n y_{i1}\bar{e}_{i4},\sum_{i=1}^n y_{i2}\bar{e}_{i4}, ..., \right.\\ \left. \sum_{i=1}^n y_{i1} \bar{e}_{iT}, \sum_{i=1}^n y_{i2} \bar{e}_{iT}, ...,\sum_{i=1}^n y_{iT-2} \bar{e}_{iT}, \sum_{i=1}^n \sum_{t=3}^T x_{it} \bar{e}_{it},\right.\\
\left.\sum_{i=1}^n e_{i3} \Delta y_{i2}, \sum_{i=1}^n e_{i4} \Delta y_{i3}, ...
, \sum_{i=1}^n e_{iT} \Delta y_{iT-1} \right)^\top
\end{eqnarray*}

The GMM estimator is provided by the `pgmm`
function. The variables of the model and the lag structure are
described using a multi-part formula.

In a GMM estimation, there are "normal instruments" and
"GMM instruments". GMM instruments are indicated
in the second part of the formula. By default, all the variables of
the model that are not used as GMM instruments are used as
normal instruments, with the same lag structure; "normal"
instruments may also be indicated in the third part of the formula.

The `effect` argument is either `NULL`, `"individual"`
(the default), or `"twoways"`. In the first case, the model is
estimated in levels. In the second case, the model is estimated in
first differences to get rid of the individual effects. In the last
case, the model is estimated in first differences and time dummies are
included.

The `model` argument specifies whether a one-step or a
two-step model is requested (`"onestep"` or
`"twosteps"`).

The following example is from @AREL:BOND:91. Employment is
explained by past values of employment (two lags), current and first
lag of wages and output, and current value of capital.

```{r gmm}
emp.gmm <- pgmm(log(emp)~lag(log(emp), 1:2)+lag(log(wage), 0:1)+log(capital)+
                  lag(log(output), 0:1) | lag(log(emp), 2:99),
                data = EmplUK, effect = "twoways", model = "twosteps")
summary(emp.gmm)
```

The following example is from @BLUN:BOND:98. The "sys"
estimator is obtained using `transformation = "ld"` for level
and difference. The `robust` argument of the `summary`
method enables the use of the robust covariance matrix proposed by
@WIND:05. For all pgmm models, `robust = TRUE` is the default (but set
in this example explicitly).

```{r gmm2}
z2 <- pgmm(log(emp) ~ lag(log(emp), 1)+ lag(log(wage), 0:1) +
             lag(log(capital), 0:1) | lag(log(emp), 2:99) +
             lag(log(wage), 2:99) + lag(log(capital), 2:99),
           data = EmplUK, effect = "twoways", model = "onestep",
           transformation = "ld")
summary(z2, robust = TRUE)
```

## General FGLS models

General FGLS estimators are based on a two-step estimation
process: first an OLS model is estimated, then its residuals
$\hat{u}_{it}$ are used to estimate an error covariance matrix
more general than the random effects one for use in a
feasible-GLS analysis. Formally, the estimated error
covariance matrix is $\hat{V}=I_n \otimes \hat{\Omega}$,
with $$\hat{\Omega}=\frac{1}{n}\sum_{i=1}^n \hat{u}_{i}
\hat{u}_{i}^\top $$ where $\hat{u}_i$ is the vector of residuals
for individual $i$ (see @WOOL:02 10.4.3 and 10.5.5).

This framework allows the error covariance structure inside every
group (if `effect = "individual"`) of observations to be fully
unrestricted and is therefore robust against any type of intragroup
heteroskedasticity and serial correlation. Conversely, this structure
is assumed identical across groups, and thus general
FGLS is inefficient under groupwise
heteroskedasticity. Cross-sectional correlation is excluded a priori.
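
In matrix form, with a hypothetical $n \times T$ matrix `U` of
first-step OLS residuals (one row per individual; `U` is not part of
the package's interface), the intra-group covariance estimate above is
simply:

```{r eval=FALSE}
# Not run: crossprod(U) sums the outer products u_i u_i' over individuals.
Omega.hat <- crossprod(U) / nrow(U)   # a T x T matrix
```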

Moreover, the number of variance parameters to be estimated with $N=n\times T$
data points is $T(T+1)/2$, which makes these estimators particularly
suited for situations where $n \gg T$, as, e.g., in labour or household
income surveys, while problematic for "long" panels, where $\hat{V}$
tends to become singular and standard errors therefore become biased
downwards.

In a pooled time series context (`effect = "time"`),
symmetrically, this estimator is able to account for arbitrary
cross-sectional correlation, provided that the latter is
time-invariant (see @GREE:03 13.9.1--2, pp. 321--2). In this case
serial correlation has to be assumed away and the estimator is
consistent with respect to the time dimension, keeping $n$ fixed.

The function `pggls` estimates general FGLS models, with either
fixed or "random" effects^[The "random effect" is better termed "general FGLS"
model, as in fact it does not have a proper random effects structure, but we
keep this terminology for general language consistency.].

The "random effect" general FGLS is estimated by:

```{r pggls}
zz <- pggls(log(emp)~log(wage)+log(capital), data=EmplUK, model="pooling")
summary(zz)
```

The fixed effects `pggls` (see @WOOL:02, p. 276) is based
on the estimation of a within model in the first step; the rest follows as
above. It is estimated by:

```{r }
zz <- pggls(log(emp)~log(wage)+log(capital), data=EmplUK, model="within")
```

The `pggls` function is similar to `plm` in many
respects. An exception is that the estimate of the group covariance
matrix of errors (`zz$sigma`, a matrix, not shown) is reported in
the model objects instead of the usual estimated variances of the two
error components.

# Tests{#tests}

As sketched in Section [linear panel model](#linear-panel-model), specification
testing in panel models involves essentially testing for poolability,
for individual or time unobserved effects and for correlation between
these latter and the regressors (Hausman-type tests). As for the other
usual diagnostic checks, we provide a suite of serial correlation
tests, while not touching on the issue of heteroskedasticity
testing. Instead, we provide heteroskedasticity-robust covariance
estimators, to be described in
subsection [robust covariance matrix estimation](#robust).

## Tests of poolability

`pooltest` tests the hypothesis that the same coefficients
apply to each individual. It is a standard F test, based on the
comparison of a model obtained for the full sample and a model based
on the estimation of an equation for each individual. The first
argument of `pooltest` is a `plm` object. The second
argument is a `pvcm` object obtained with
`model="within"`. If the first argument is a pooling model, the
test applies to all the coefficients (including the intercepts); if it
is a within model, different intercepts are assumed.

To test the hypothesis that all the coefficients in the
`Grunfeld` example, excluding the intercepts, are equal, we use:

```{r }
znp <- pvcm(inv~value+capital, data=Grunfeld, model="within")
zplm <- plm(inv~value+capital, data=Grunfeld, model="within")
pooltest(zplm, znp)
```

The same test can be computed using a formula as the first argument of the
`pooltest` function:

```{r results='hide'}
pooltest(inv~value+capital, data=Grunfeld, model="within")
```

## Tests for individual and time effects

`plmtest` implements Lagrange multiplier tests of individual
and/or time effects based on the results of the pooling model. Its
main argument is a `plm` object (the result of a pooling model)
or a formula.

Two additional arguments can be added to indicate the kind of test to
be computed. The argument `type` is one of:

- `"honda"`: @HOND:85, the default value,
- `"bp"`: @BREU:PAGA:80,
- `"kw"`: @KING:WU:97^[NB: Oneway King-Wu (`"kw"`) statistics
  (`"individual"` and `"time"`) coincide with the respective Honda statistics
  (`"honda"`); however, the twoway statistics of `"kw"` and `"honda"` differ.],
- `"ghm"`: @GOUR:HOLL:MONF:82.

The effects tested are indicated with the `effect` argument (one of
`"individual"`, `"time"`, or `"twoways"`). The test statistics
implemented are also suitable for unbalanced panels.^[The `"bp"` test
for unbalanced panels was derived in @BALT:LI:90, the `"kw"` test for
unbalanced panels in @BALT:CHAN:LI:98. The `"ghm"` test and the `"kw"`
test were extended to two--way effects in @BALT:CHAN:LI:92. For a
concise overview of all these statistics see @BALT:13 Sec. 4.2,
pp. 68--76 (for balanced panels) and Sec. 9.5, pp. 200--203 (for
unbalanced panels) or @BALT:21, Sec. 4.2, pp. 81-84 (balanced), Sec. 9.6,
pp. 243-246 (unbalanced).]

To test the presence of individual and time effects in the
`Grunfeld` example, using the @GOUR:HOLL:MONF:82 test,
we use:

```{r }
g <- plm(inv ~ value + capital, data=Grunfeld, model="pooling")
plmtest(g, effect="twoways", type="ghm")
```

or

```{r results='hide'}
plmtest(inv~value+capital, data=Grunfeld, effect="twoways", type="ghm")
```

`pFtest` computes F tests of effects based on the comparison of
the within and the pooling model. Its main arguments
are either two `plm` objects (a pooling and a within model) or a formula.

```{r }
gw <- plm(inv ~ value + capital, data=Grunfeld, effect="twoways", model="within")
gp <- plm(inv ~ value + capital, data=Grunfeld, model="pooling")
pFtest(gw, gp)
```

```{r results='hide'}
pFtest(inv~value+capital, data=Grunfeld, effect="twoways")
```

## Hausman test

`phtest` computes the Hausman test, which is based on the
comparison of two sets of estimates (see @HAUS:78). Its main
arguments are two `panelmodel` objects or a
formula.
A classical application of the Hausman test for panel data
is to compare the fixed and the random effects models:

```{r }
gw <- plm(inv~value+capital, data=Grunfeld, model="within")
gr <- plm(inv~value+capital, data=Grunfeld, model="random")
phtest(gw, gr)
```

## Tests of serial correlation{#serialcor}

A model with individual effects has composite errors that are serially
correlated by definition. The presence of the time-invariant error
component^[Here we treat fixed and random effects alike, as components
of the error term, in accordance with the modern approach in econometrics
(see @WOOL:02, @WOOL:10).] gives rise to serial correlation which does
not die out over time, thus standard tests applied on pooled data
always end up rejecting the null of spherical residuals^[Neglecting
time effects may also lead to serial correlation in residuals (as
observed in @WOOL:02 10.4.1).]. There may also be serial correlation
of the "usual" kind in the idiosyncratic error terms, e.g., as an AR(1)
process. By "testing for serial correlation" we mean testing for this
latter kind of dependence.

For these reasons, the subjects of testing for individual error
components and for serially correlated idiosyncratic errors are
closely related. In particular, simple (*marginal*) tests for one
direction of departure from the hypothesis of spherical errors usually
have power against the other one: in case it is present, they are
substantially biased towards rejection. *Joint* tests are
correctly sized and have power against both directions, but usually do
not give any information about which one actually caused
rejection. *Conditional* tests for serial correlation that take
into account the error components are correctly sized under presence
of both departures from sphericity and have power only against the
alternative of interest. While most powerful if correctly specified,
the latter, based on the likelihood framework, are crucially dependent
on normality and homoskedasticity of the errors.

In `plm` we provide a number of joint, marginal and conditional
ML-based tests, plus some semiparametric alternatives which are
robust vs. heteroskedasticity and free from distributional
assumptions.

### Unobserved effects test

The unobserved effects test *à la Wooldridge*
(see @WOOL:02 10.4.4) is a semiparametric test for the null
hypothesis that $\sigma^2_{\mu}=0$, i.e., that there are no unobserved
effects in the residuals. Given that under the null the covariance
matrix of the residuals for each individual is diagonal, the test
statistic is based on the average of elements in the upper (or lower)
triangle of its estimate, diagonal excluded: $n^{-1/2} \sum_{i=1}^n
\sum_{t=1}^{T-1} \sum_{s=t+1}^T \hat{u}_{it} \hat{u}_{is}$ (where
$\hat{u}$ are the pooled OLS residuals), which must be
"statistically close" to zero under the null, scaled by its standard
deviation:

$$W=\frac{ \sum_{i=1}^n \sum_{t=1}^{T-1} \sum_{s=t+1}^T
\hat{u}_{it} \hat{u}_{is} }{ \left[ \sum_{i=1}^n \left( \sum_{t=1}^{T-1}
\sum_{s=t+1}^T \hat{u}_{it} \hat{u}_{is} \right)^2 \right]^{1/2} }$$

This test is ($n$-) asymptotically distributed as a standard normal
regardless of the distribution of the errors. Nor does it rely on
homoskedasticity.

It has power both against the standard random effects specification,
where the unobserved effects are constant within every group, as well
as against any kind of serial correlation. As such, it "nests" both
random effects and serial correlation tests, trading some power
against more specific alternatives in exchange for robustness.

While not rejecting the null favours the use of pooled OLS, rejection
may follow from serial correlation of different kinds, and in
particular, quoting @WOOL:02, "should not be interpreted as
implying that the random effects error structure *must* be true".

Below, the test is applied to the data and model in @MUNN:90:

```{r wtest}
pwtest(log(gsp)~log(pcap)+log(pc)+log(emp)+unemp, data=Produc)
```

### Locally robust tests for serial correlation or random effects

The presence of random effects may affect tests for residual serial
correlation, and vice versa. One solution is to use a joint test,
which has power against both alternatives. A joint LM test for random
effects *and* serial correlation under normality and
homoskedasticity of the idiosyncratic errors has been derived by
@BALT:LI:91 and @BALT:LI:95 and is implemented as an
option in `pbsytest`:

```{r pbsytestJoint}
pbsytest(log(gsp)~log(pcap)+log(pc)+log(emp)+unemp, data=Produc, test="j")
```

Rejection of the joint test, though, gives no information on the
direction of the departure from the null hypothesis, i.e.: is
rejection due to the presence of serial correlation, of random effects
or of both?

@BERA:SOSA:YOON:01 (hereafter BSY) derive locally robust tests both
for individual random effects and for first-order serial correlation
in residuals as "corrected" versions of the standard LM test (see
`plmtest`). While still dependent on normality and
homoskedasticity, these are robust to *local* departures from the
hypotheses of, respectively, no serial correlation or no random
effects. The authors observe that, although suboptimal, these tests
may help detect the right direction of the departure from the null,
thus complementing the use of joint tests. Moreover, being based on
pooled OLS residuals, the BSY tests are computationally far less
demanding than likelihood-based conditional tests.

On the other hand, the statistical properties of these "locally
corrected" tests are inferior to those of the non-corrected
counterparts when the latter are correctly specified. If there is no
serial correlation, then the optimal test for random effects is the
likelihood-based LM test of Breusch and Godfrey (with refinements by
Honda, see `plmtest`), while if there are no random effects the
optimal test for serial correlation is, again, Breusch-Godfrey's
test^[$LM_3$ in @BALT:LI:95.]. If the presence of a
random effect is taken for granted, then the optimal test for serial
correlation is the likelihood-based conditional LM test of
@BALT:LI:95 (see `pbltest`).

The serial correlation version of the BSY test is the default:

```{r pbsytestAR}
pbsytest(log(gsp)~log(pcap)+log(pc)+log(emp)+unemp, data=Produc)
```

The BSY test for random effects is implemented in the one-sided
version^[Corresponding to $RSO^*_{\mu}$ in the original paper.],
which takes into account that the variance of the random effect
must be non-negative:

```{r pbsytestRE}
pbsytest(log(gsp)~log(pcap)+log(pc)+log(emp)+unemp, data=Produc, test="re")
```

### Conditional LM test for AR(1) or MA(1) errors under random effects

@BALT:LI:91 and @BALT:LI:95 derive a Lagrange multiplier test for
serial correlation in the idiosyncratic component of the errors under
(normal, heteroskedastic) random effects. Under the null of
serially uncorrelated errors, the test turns out to be identical for
both the AR(1) and the MA(1) alternative. One- and two-sided
versions are provided, the one-sided having power against positive
serial correlation only. The two-sided version is the default, while for
the one-sided version one must set the `alternative` argument to
`"onesided"`:

```{r pbltest}
pbltest(log(gsp)~log(pcap)+log(pc)+log(emp)+unemp,
        data=Produc, alternative="onesided")
```

As usual, the LM test statistic is based on residuals from the maximum
likelihood estimate of the restricted model (random effects with
serially uncorrelated errors). In this case, though, the restricted
model cannot be estimated by OLS anymore, therefore the testing
function depends on `lme()` in the `nlme` package for
estimation of a random effects model by maximum likelihood. For this
reason, the test is applicable only to balanced panels.

No test has been implemented to date for the symmetric hypothesis of
no random effects in a model with errors following an AR(1) process,
but an asymptotically equivalent likelihood ratio test is available in
the `nlme` package (see Section [plm versus nlme and lme4](#nlme)).

### General serial correlation tests

A general testing procedure for serial correlation in fixed effects
(FE), random effects (RE) and pooled-OLS
panel models alike can be based on considerations in @WOOL:02, 10.7.2.

Recall that `plm` model objects are the result of OLS
estimation performed on "demeaned" data, where, in the case of
individual effects (the time effects case is symmetric), this means
time-demeaning for the FE (`within`) model, quasi-time-demeaning for the RE
(`random`) model and no demeaning at all, i.e., the original data, for
the pooled OLS (`pooling`) model (see Section [software approach](#software-approach)).

For the random effects model, @WOOL:02 observes that
under the null of homoskedasticity and no serial correlation in the
idiosyncratic errors, the residuals from the quasi-demeaned regression
must be spherical as well. Otherwise, as the individual effects are wiped
out in the demeaning, any remaining serial correlation must be due to
the idiosyncratic component. Hence, a simple way of testing for serial
correlation is to apply a standard serial correlation test to the
quasi-demeaned model. The same applies in a pooled model, w.r.t. the
original data.

The FE case needs some qualification.
It is well-known that if the
original model's errors are uncorrelated then FE residuals are
negatively serially correlated, with $cor(\hat{u}_{it},
\hat{u}_{is})=-1/(T-1)$ for each $t,s$ (see @WOOL:02 10.5.4). This
correlation clearly dies out as $T$ increases, so this kind of AR test
is applicable to `within` model objects only for $T$ "sufficiently
large"^[Baltagi and Li derive a basically analogous $T$-asymptotic test
for first-order serial correlation in a FE panel model as a
Breusch-Godfrey LM test on within residuals (see @BALT:LI:95 par. 2.3
and formula 12). They also observe that the test on within residuals
can be used for testing on the RE model, as "the within transformation
[time-demeaning, in our terminology] wipes out the individual effects,
whether fixed or random". Generalizing the Durbin-Watson test to FE
models by applying it to fixed effects residuals is documented in
@BHAR:FRAN:NARE:82; a (modified) version for unbalanced and/or
non-consecutive panels is implemented in `pbnftest`, as is Baltagi-Wu's
LBI statistic (for both see @BALT:WU:99).]. Conversely, in short
panels the test is severely biased towards rejection (or, as the
induced correlation is negative, towards acceptance in the case of the
one-sided DW test with `alternative="greater"`). See below for a
serial correlation test applicable to "short" FE panel models.


`plm` objects retain the "demeaned" data, so the procedure is
straightforward for them. The wrapper functions `pbgtest` and
`pdwtest` re-estimate the relevant quasi-demeaned model by
OLS and apply, respectively, standard
Breusch-Godfrey and Durbin-Watson tests from package `lmtest`:

```{r generalAR}
pbgtest(grun.fe, order = 2)
```

The tests share the features of their OLS counterparts; in particular,
`pbgtest` allows testing for higher-order serial correlation,
which might prove useful, e.g., on quarterly data. Analogously, from
the point of view of software, as the functions are simple wrappers
around `bgtest` and `dwtest`, all arguments of the latter two apply
and may be passed on through the ellipsis (the `...` argument).


### Wooldridge's test for serial correlation in "short" FE panels

For the reasons reported above, under the null of no serial
correlation in the errors, the residuals of a FE model must be
negatively serially correlated, with $cor(\hat{\epsilon}_{it},
\hat{\epsilon}_{is})=-1/(T-1)$ for each $t,s$. Wooldridge suggests basing a
test for this null hypothesis on a pooled regression of FE residuals
on themselves, lagged one period:

$$\hat{\epsilon}_{i,t}=\alpha + \delta \hat{\epsilon}_{i,t-1} + \eta_{i,t}$$

Rejecting the restriction $\delta = -1/(T-1)$ makes us conclude against
the original null of no serial correlation.

The building blocks available in `plm` make it easy to
construct a function carrying out this procedure: first the
FE model is estimated and the residuals retrieved, then they
are lagged and a `pooling` AR(1) model is estimated. The test
statistic is obtained by applying the above restriction on
$\delta$ and supplying a heteroskedasticity- and autocorrelation-consistent
covariance matrix (`vcovHC` with the appropriate options, in particular
`method="arellano"`)^[See subsection [robust covariance matrix estimation](#robust).].
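
As an illustration of this construction, the following is a rough sketch on
the (balanced) Grunfeld data; the variable names are ours, and the packaged
`pwartest` shown next implements the procedure properly, including for
unbalanced panels:

```{r pwartest-manual}
# sketch: regress FE residuals on their lag, then test delta = -1/(T-1)
fe   <- plm(inv~value+capital, data=Grunfeld, model="within")
uhat <- residuals(fe)                 # a pseries: lag() respects firms
dd   <- cbind(as.data.frame(attr(uhat, "index")),
              u = as.numeric(uhat), ul = as.numeric(lag(uhat)))
aux  <- plm(u ~ ul, data = dd, index = c("firm", "year"), model = "pooling")
T.   <- pdim(fe)$nT$T                 # number of time periods (20 here)
z    <- (coef(aux)["ul"] + 1/(T. - 1)) /
          sqrt(vcovHC(aux, method = "arellano")["ul", "ul"])
unname(z^2)                           # ~ chi^2(1) under the null
```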

```{r pwartest}
pwartest(log(emp) ~ log(wage) + log(capital), data=EmplUK)
```

The test is applicable to any FE panel model, and in particular to
"short" panels with small $T$ and large $n$.



### Wooldridge's first-difference-based test

In the context of the first difference model,
@WOOL:02, 10.6.3 proposes a serial correlation test that can
also be seen as a specification test to choose the most efficient
estimator between fixed effects (`within`) and first difference
(`fd`).

The starting point is the observation that if the idiosyncratic errors
of the original model $u_{it}$ are uncorrelated, the errors of the
(first) differenced model^[Here, $e_{it}$ for notational simplicity (and as in
Wooldridge): equivalent to $\Delta \epsilon_{it}$ in the general notation of
the paper.]
$e_{it} \equiv u_{it}-u_{i,t-1}$ will be correlated, with $cor(e_{it},
e_{i,t-1})=-0.5$, while any time-invariant effect, "fixed" or
"random", is wiped out in the differencing. So a serial correlation
test for models with individual effects of any kind can be based on
estimating the model

$$\hat{e}_{i,t}= \delta \hat{e}_{i,t-1} + \eta_{i,t}$$

and testing the restriction $\delta = -0.5$,
corresponding to the null of no serial correlation. @DRUK:03
provides Monte Carlo evidence of the good empirical properties of the
test.

At the other extreme (see @WOOL:02 10.6.1), if the differenced
errors $e_{it}$ are uncorrelated then, as by definition $u_{it} = u_{i,t-1}
+ e_{it}$, $u_{it}$ is a random walk. In this latter case, the
most efficient estimator is the first difference (`fd`) one; in
the former case, it is the fixed effects one (`within`).

The function `pwfdtest` allows testing either hypothesis: the
default behaviour `h0="fd"` is to test for serial correlation in
*first-differenced* errors:

```{r pwfdtest1}
pwfdtest(log(emp) ~ log(wage) + log(capital), data=EmplUK)
```

while with `h0="fe"` the null hypothesis becomes no serial
correlation in the *original* errors, which is similar in spirit to `pwartest`:

```{r pwfdtest2}
pwfdtest(log(emp) ~ log(wage) + log(capital), data=EmplUK, h0="fe")
```

Not rejecting one of the two is evidence in favour of using the
estimator corresponding to `h0`. Should the truth lie in the
middle (both rejected), whichever estimator is chosen will have
serially correlated errors: it is therefore advisable to use the
autocorrelation-robust covariance estimators from
subsection [robust covariance matrix estimation](#robust) in inference.


## Tests for cross-sectional dependence

Next to the more familiar issue of serial correlation, in recent
years a growing body of literature has been dealing with
cross-sectional dependence (henceforth: XSD) in panels, which
can arise, e.g., if individuals respond to common shocks (as in the
literature on *factor models*) or if spatial diffusion processes
are present, relating individuals in a way depending on a measure of
distance (*spatial models*).

The subject is huge, and here we touch only some general aspects of
misspecification testing and valid inference.
If XSD is
present, the consequence is, at a minimum, inefficiency of the usual
estimators and invalid inference when using the standard covariance
matrix^[This is the case, e.g., in an unobserved effects model when XSD is
due to an unobservable factor structure, with factors that are uncorrelated
with the regressors. In this case the within or random estimators are still
consistent, although inefficient (see @DEHO:SARA:06).].
The plan is to have in `plm` both misspecification tests to detect
XSD and robust covariance matrices to perform valid inference
in its presence, like in the serial dependence case. For now, though,
only misspecification tests are included.


### CD and LM-type tests for global cross-sectional dependence

The function `pcdtest` implements a family of XSD
tests which can be applied in different settings, ranging from those
where $T$ grows large with $n$ fixed to "short" panels with a big
$n$ dimension and a few time periods. All are based on
(transformations of) the product-moment correlation coefficient of a
model's residuals, defined as

$$ \hat{\rho}_{ij}=\frac{\sum_{t=1}^T \hat{u}_{it} \hat{u}_{jt}}{(\sum_{t=1}^T \hat{u}^2_{it})^{1/2} (\sum_{t=1}^T \hat{u}^2_{jt})^{1/2} } $$

i.e., on the pairwise correlation coefficients of the residuals,
computed over the time dimension for each pair of cross-sectional units.

The Breusch-Pagan [@BREU:PAGA:80] LM test, based on the squares of
$\hat{\rho}_{ij}$, is valid for $T \rightarrow \infty$ with $n$ fixed;
it is defined as

$$LM=\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} T_{ij} \hat{\rho}_{ij}^2$$

where, in the case of an unbalanced panel, only pairwise complete
observations are considered, and $T_{ij}=\min(T_i,T_j)$ with $T_i$
being the number of observations for individual $i$; else, if the
panel is balanced, $T_{ij}=T$ for each $i,j$. The test is distributed
as $\chi^2_{n(n-1)/2}$. It is inappropriate whenever the $n$ dimension
is "large". A scaled version, applicable also if $T \rightarrow
\infty$ and *then* $n \rightarrow \infty$ (as in some pooled time
series contexts), is defined as

$$SCLM=\sqrt{\frac{1}{n(n-1)}} ( \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} T_{ij} \hat{\rho}_{ij}^2 -1 )$$

and distributed as a standard normal (see @PESA:04).

A bias-corrected scaled version, $BCSCLM$, for the *fixed effects model
with individual effects* only is also available; it is simply the $SCLM$
with a term correcting for the bias (@BALT:FENG:KAO:12)^[The unbalanced
version of this statistic uses $\max(T_{ij})$ for $T$ in the
bias-correction term.]. This statistic is also asymptotically
distributed as standard normal:

$$BCSCLM=\sqrt{\frac{1}{n(n-1)}} ( \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} T_{ij} \hat{\rho}_{ij}^2 -1)-\frac{n}{2(T-1)}$$


Pesaran's (@PESA:04, @PESA:15) $CD$ test

$$CD=\sqrt{\frac{2}{n(n-1)}} ( \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \sqrt{T_{ij}} \hat{\rho}_{ij} )$$

based on $\hat{\rho}_{ij}$ without squaring (also distributed as a standard
normal) is appropriate both in $n$-- and in $T$--asymptotic
settings. It has remarkable properties in samples of any practically
relevant size and is robust to a variety of settings.
The only big
drawback is that the test loses power against the alternative of
cross-sectional dependence if the latter is due to a factor structure
with factor loadings averaging zero, that is, some units react
positively to common shocks, others negatively.

The default version of the test is `"cd"`, yielding Pesaran's $CD$ test.
These tests are originally meant to use the residuals of separate
estimation of one time-series regression for each cross-sectional
unit, so this is the default behaviour of `pcdtest`:

```{r pcdtest1}
pcdtest(inv~value+capital, data=Grunfeld)
```

If a different model specification (`within`, `random`, ...)
is assumed consistent, one can resort to its residuals for
testing^[This is also the only solution when the time dimension's length is
insufficient for estimating the heterogeneous model.]
by specifying the relevant `model` type. The main
argument of this function may be either a model of class
`panelmodel` or a `formula` and a `data.frame`; in the
second case, unless `model` is set to `NULL`, all the usual
parameters relative to the estimation of a `plm` model may be
passed on. The test is compatible with any consistent
`panelmodel` for the data at hand, with any specification of
`effect`. E.g., specifying `effect = "time"` or
`effect = "twoways"` allows testing for residual cross-sectional
dependence after the introduction of time fixed effects to account for
common shocks:

```{r pcdtest2}
pcdtest(inv~value+capital, data=Grunfeld, model="within")
```



If the time dimension is insufficient and `model=NULL`, the
function defaults to estimation of a `within` model and issues a
warning.


### CD(p) test for local cross-sectional dependence

A *local* variant of the $CD$ test, called $CD(p)$ test
[@PESA:04], takes into account an appropriate subset of *neighbouring*
cross-sectional units to check the null of no XSD against the alternative
of *local* XSD, i.e. dependence between neighbours only. To do so, the pairs
of neighbouring units are selected by means of a binary proximity matrix
like those used in spatial models. In the original paper, a regular
ordering of observations is assumed, so that the $m$-th
cross-sectional observation is a neighbour to the $(m-1)$-th and to
the $(m+1)$-th. Extending the $CD(p)$ test to irregular lattices, we
employ the binary proximity matrix as a selector for discarding the
correlation coefficients relative to pairs of observations that are
not neighbours in computing the $CD$ statistic. The test is then
defined as

$$CD=\sqrt{\frac{1}{\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} [w(p)]_{ij}}} ( \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} [w(p)]_{ij} \sqrt{T_{ij}}\hat{\rho}_{ij} )$$

where $[w(p)]_{ij}$ is the $(i,j)$-th element of the $p$-th order
proximity matrix, so that if $h,k$ are not neighbours, $[w(p)]_{hk}=0$
and $\hat{\rho}_{hk}$ gets "killed"; this is easily seen to reduce
to formula (14) in Pesaran [@PESA:04] for the special case
considered in that paper. The same selection can be applied to the $LM$,
$SCLM$, and $BCSCLM$ tests.

Therefore, the *local* version of either test can be computed by
supplying an $n \times n$ matrix (of any kind coercible to
`logical`), providing information on whether any pair of
observations are neighbours or not, to the `w` argument.
If
`w` is supplied, only neighbouring pairs will be used in
computing the test; else, `w` will default to `NULL` and all
observations will be used. The matrix need not actually be binary, so
commonly used "row-standardized" matrices can be employed as well:
it is enough that neighbouring pairs correspond to nonzero elements in
`w`^[The very comprehensive package `spdep` for spatial dependence analysis
(see @BIVA:08) contains features for creating, lagging and manipulating
*neighbour list* objects of class `nb`, that can be readily converted to and
from proximity matrices by means of the `nb2mat` function. Higher orders of
the $CD(p)$ test can be obtained by lagging the corresponding `nb`s
through `nblag`.].

## Panel unit root tests

### Overview of functions for panel unit root testing


Below, an overview is provided of which tests are implemented in which
function; a theoretical treatment is given for a few of those tests
later on. The package `plm` offers several panel unit root tests
contained in three functions:

- `purtest` (Levin-Lin-Chu test, IPS test, several Fisher-type tests, Hadri's test),
- `cipstest` (cross-sectionally augmented IPS test), and
- `phansi` (Simes' test).

While `purtest` implements various tests which can be selected via its `test`
argument, `cipstest` and `phansi` are functions for one specific test each.

Function `purtest` offers the following tests by setting argument `test` to:

- `"levinlin"` (default), for the Levin-Lin-Chu test
  (@LEVIN:LIN:CHU:02), see below for
  a theoretical exposition ([Levin-Lin-Chu test](#levinlin)),
- `"ips"`, for the Im-Pesaran-Shin (IPS) test by @IM:PESAR:SHIN:03, see below for
  a theoretical exposition ([Im-Pesaran-Shin test](#ips)),
- `"madwu"`, for the inverse $\chi^2$ test by @MADDA:WU:99, also called
  P test by @CHOI:01,
- `"Pm"`, for the modified P test proposed by @CHOI:01 for large N,
- `"invnormal"`, for the inverse normal test (@CHOI:01),
- `"logit"`, for the logit test (@CHOI:01),
- `"hadri"`, for Hadri's test (@HADR:00).

The tests in `purtest` are often called first generation panel unit root tests
as they assume absence of cross-sectional correlation; all of them, except
Hadri's test, are based on the estimation of
augmented Dickey-Fuller (ADF) regressions for each time series. A
statistic is then computed using the t-statistics associated with
the lagged variable. In a different manner, the Hadri residual-based LM statistic
is the cross-sectional average of individual KPSS statistics
(@KWIA:PHIL:SCHM:SHIN:92), standardized
by their asymptotic mean and standard deviation.
Among the tests in `purtest`, `"madwu"`, `"Pm"`, `"invnormal"`, and `"logit"` are
Fisher-type tests.^[The individual p-values for the Fisher-type tests
are approximated as described in @MACK:96 if the package
`urca` (@PFAFF:08) is available, otherwise as described in @MACK:94.]

`purtest` returns an object of class `"purtest"` which contains details
about the test performed, among them details about the individual
regressions/statistics for the test. The associated `summary` and `print.summary`
methods can be used to extract/display this additional information.
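
For instance, a minimal sketch on the Grunfeld investment series (the object
names are ours; the same data setup reappears in the Simes example further
below):

```{r purtest-summary}
# a Fisher-type (Maddala-Wu) test, then a look at the individual
# ADF results through the summary method
y  <- data.frame(split(Grunfeld$inv, Grunfeld$firm))
mw <- purtest(y, pmax = 4, exo = "intercept", test = "madwu")
summary(mw)
```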

Function `cipstest` implements Pesaran's (@pes07) cross-sectionally
augmented version of the Im-Pesaran-Shin panel unit root test and is a
so-called second-generation panel unit root test.

Function `phansi` implements the idea of @HANCK:13 to apply Simes' testing
approach for the intersection of individual hypothesis tests to panel unit
root testing; see below for a more thorough treatment
of [Simes' approach for intersecting hypotheses](#phansi).


### Preliminary results

We consider the following model:

$$
y_{it} = \delta y_{it-1} + \sum_{L=1}^{p_i} \theta_{iL} \Delta
y_{it-L}+\alpha_{mi} d_{mt}+\epsilon_{it}
$$

The unit root hypothesis is $\delta = 1$. The model can be rewritten in
differences:

$$
\Delta y_{it} = \rho y_{it-1} + \sum_{L=1}^{p_i} \theta_{iL} \Delta
y_{it-L}+\alpha_{mi} d_{mt}+\epsilon_{it}
$$

with $\rho = \delta - 1$, so that the unit root hypothesis is now $\rho = 0$.

Some of the unit root tests for panel data are based on preliminary
results obtained by running the above Augmented Dickey-Fuller (ADF)
regression.

First, we have to determine the optimal number of lags $p_i$ for each
time series. Several possibilities are available. They all have in
common that the maximum number of lags has to be chosen first. Then,
$p_i$ can be chosen by using:

- the Schwarz information criterion (SIC) (also known as Bayesian information criterion (BIC)),
- the Akaike information criterion (AIC),
- Hall's method, which consists in removing the highest lags as long as
  they are not significant.

The ADF regression is run on $T-p_i-1$ observations for
each individual, so that the total number of observations is $n\times
\tilde{T}$ where $\tilde{T}=T-\bar{p}-1$ and
$\bar{p}$ is the average number of lags. Call $e_{i}$ the
vector of residuals.

Estimate the variance of the $\epsilon_i$ as:

$$
\hat{\sigma}_{\epsilon_i}^2 = \frac{\sum_{t=p_i+1}^{T} e_{it}^2}{df_i}
$$



### Levin-Lin-Chu model{#levinlin}

Then, as per @LEVIN:LIN:CHU:02, compute artificial regressions of $\Delta y_{it}$
and $y_{it-1}$ on $\Delta y_{it-L}$ and $d_{mt}$ and get the two vectors of
residuals $z_{it}$ and $v_{it}$.

Standardize these two residuals and run the pooled regression of
$z_{it}/\hat{\sigma}_i$ on $v_{it}/\hat{\sigma}_i$ to get
$\hat{\rho}$, its standard deviation $\hat{\sigma}({\hat{\rho}})$ and
the t-statistic $t_{\hat{\rho}}=\hat{\rho}/\hat{\sigma}({\hat{\rho}})$.

Compute the long run variance of $y_i$:

$$
\hat{\sigma}_{yi}^2 = \frac{1}{T-1}\sum_{t=2}^T \Delta y_{it}^2 + 2
\sum_{L=1}^{\bar{K}}w_{\bar{K}L}\left[\frac{1}{T-1}\sum_{t=2+L}^T
  \Delta y_{it} \Delta y_{it-L}\right]
$$

Define $s_i$ as the ratio of the long-run and short-run standard
deviations and $\bar{s}$ as its mean over the individuals of the sample:

$$
s_i = \frac{\hat{\sigma}_{yi}}{\hat{\sigma}_{\epsilon_i}}
$$

$$
\bar{s} = \frac{\sum_{i=1}^n s_i}{n}
$$

The adjusted t-statistic

$$
t^*_{\rho}=\frac{t_{\rho}- n \tilde{T} \bar{s} \hat{\sigma}_{\tilde{\epsilon}}^{-2}
\hat{\sigma}({\hat{\rho}}) \mu^*_{m\tilde{T}}}{\sigma^*_{m\tilde{T}}}
$$

follows a standard normal distribution under the null hypothesis of a
unit root.
$\mu^*_{m\tilde{T}}$ and $\sigma^*_{m\tilde{T}}$ are
given in table 2 of the original paper and are also available in the
package.

An example of how the Levin-Lin-Chu test is performed with `purtest`, using a
lag of 2 and an intercept and a time trend as exogenous variables in the ADF
regressions, is:

```{r levinlin}
data("HousePricesUS", package = "pder")
lprice <- log(pdata.frame(HousePricesUS)$price)
(lev <- purtest(lprice, test = "levinlin", lags = 2, exo = "trend"))
summary(lev) ### gives details
```


### Im-Pesaran-Shin (IPS) test{#ips}

This test by @IM:PESAR:SHIN:03 does not require that $\rho$ is the same for all
the individuals. The null hypothesis is still that all the series have a
unit root, but the alternative is that some may have a unit root while
others have (possibly heterogeneous) values of $\rho_i < 0$.

The test is based on the average of the Student t-statistics of the
$\rho_i$ obtained for each individual:

$$
\bar{t}=\frac{1}{n}\sum_{i=1}^n t_{\rho i}
$$

The statistic is then:

$$
z = \frac{\sqrt{n}\left(\bar{t}- E(\bar{t})\right)}{\sqrt{V(\bar{t})}}
$$

The moments $E(\bar{t})$ and $V(\bar{t})$ are given in the original
paper and are also available in the package.


An example of the IPS test with `purtest` with the same settings as in the
previously performed Levin-Lin-Chu test is:

```{r ips}
purtest(lprice, test = "ips", lags = 2, exo = "trend")
```



### Simes' approach: intersecting hypotheses{#phansi}

A different approach to panel unit root testing can be drawn from the general
Simes' test for the intersection of individual hypothesis tests [@SIMES:86].
@HANCK:13 suggests applying this approach to panel unit root testing: the test
works by combining p-values from single hypothesis tests (individual
unit root tests) into a global (intersected) hypothesis, controlling for the
multiplicity in testing. Thus, it works "on top" of any panel unit root test
which yields a p-value for each individual series. Unlike most other panel unit
root tests, this approach makes it possible to discriminate between individuals
for which the individual H0 (unit root present for the individual series) is
rejected and those for which it is not; it requires a pre-specified
significance level. Further, the test is robust to general patterns of
cross-sectional dependence.

The function `phansi` for this test takes as its main input either a numeric
vector containing the p-values of the individual tests or a `"purtest"` object
as produced by function `purtest` which holds a suitable pre-computed panel
unit root test (one that produces p-values per individual series). The
significance level is set by argument `alpha` (default: 5 %). The function's
return value is a list with a detailed evaluation of the applied Simes test.
The associated print method gives a verbal evaluation.

The following examples show both accepted ways of input. The first example
replicates @HANCK:13, table 11 (left side), who applied a panel unit root test
for a Purchasing Power Parity analysis per country (individual H0 hypotheses per
series) to get the individual p-values and then used Simes' approach for testing
the global (intersecting) hypothesis for the whole panel.
```{r phansi1}
### input is numeric (p-values), replicates Hanck (2013), Table 11 (left side)
pvals <- c(0.0001,0.0001,0.0001,0.0001,0.0001,0.0001,0.0050,0.0050,0.0050,
           0.0050,0.0175,0.0175,0.0200,0.0250,0.0400,0.0500,0.0575,0.2375,0.2475)
countries <- c("Argentina","Sweden","Norway","Mexico","Italy","Finland","France",
               "Germany","Belgium","U.K.","Brazil","Australia","Netherlands",
               "Portugal","Canada", "Spain","Denmark","Switzerland","Japan")
names(pvals) <- countries
h <- phansi(pvals)
print(h)
h$rejected # logical indicating the individuals with rejected individual H0
```

```{r phansi2, results='hide'}
### input is a (suitable) purtest object / different example
y <- data.frame(split(Grunfeld$inv, Grunfeld$firm))
obj <- purtest(y, pmax = 4, exo = "intercept", test = "madwu")
phansi(obj, alpha = 0.06) # test with significance level set to 6 %
```


## Robust covariance matrix estimation{#robust}

Robust estimators of the covariance matrix of coefficients are
provided, mostly for use in Wald-type tests, and this section provides
some basics and examples. A more comprehensive exposition of the theory
and of the capabilities that come with the `plm` package is given in @mil17b.

`vcovHC` estimates three "flavours" of White's heteroskedasticity-consistent
covariance matrix^[See @WHIT:80 and @WHIT:84b.] (known as the
*sandwich* estimator). Interestingly, in the
context of panel data the most general version also proves consistent
vs. serial correlation.

All types assume no correlation between errors of different groups
while allowing for heteroskedasticity across groups, so that the full
covariance matrix of errors is $V=I_n \otimes \Omega_i; i=1, \dots, n$. As
for the *intragroup* error covariance matrix of every single
group of observations, `"white1"` allows for general
heteroskedasticity but no serial correlation, *i.e.*

\begin{equation}
 (\#eq:omegaW1)
 \Omega_i=
 \left[ \begin{array}{c c c c}
 \sigma_{i1}^2 & \dots & \dots & 0 \\
 0 & \sigma_{i2}^2 & & \vdots \\
 \vdots & & \ddots & 0 \\
 0 & ... & ... & \sigma_{iT}^2 \\
 \end{array} \right]
\end{equation}

while `"white2"` is `"white1"` restricted to a common
variance inside every group, estimated as
$\sigma_i^2=\sum_{t=1}^T{\hat{u}_{it}^2}/T$, so that $\Omega_i=I_T \otimes
\sigma_i^2$ (see @GREE:03, 13.7.1--2 and
@WOOL:02, 10.7.2); `"arellano"` (see ibid. and the original reference,
@AREL:87) allows
a fully general structure w.r.t. heteroskedasticity and serial correlation:

\begin{equation}
 (\#eq:omegaArellano)
 \Omega_i=
 \left[ \begin{array}{c c c c c}
 \sigma_{i1}^2 & \sigma_{i1,i2} & \dots & \dots & \sigma_{i1,iT} \\
 \sigma_{i2,i1} & \sigma_{i2}^2 & & & \vdots \\
 \vdots & & \ddots & & \vdots \\
 \vdots & & & \sigma_{iT-1}^2 & \sigma_{iT-1,iT} \\
 \sigma_{iT,i1} & \dots & \dots & \sigma_{iT,iT-1} & \sigma_{iT}^2 \\
 \end{array} \right]
\end{equation}


The latter is, as already observed, consistent w.r.t. timewise
correlation of the errors but, unlike the White 1 and
2 methods, it relies on large-$n$ asymptotics with small $T$.
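
To see the practical import of these choices, one can compare the standard
errors implied by each flavour on a given model; a minimal sketch, reusing
the `gw` within model estimated earlier (the object name `se` is ours):

```{r vcovHC-flavours}
# conventional vs. robust standard errors under the three intragroup schemes
se <- sapply(c("white1", "white2", "arellano"),
             function(m) sqrt(diag(vcovHC(gw, method = m))))
cbind(conventional = sqrt(diag(vcov(gw))), se)
```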

The fixed effects case, as already observed in Section
[tests of serial correlation](#serialcor) on serial correlation, is
complicated by the fact that the demeaning induces serial correlation
in the errors. The original White estimator (`"white1"`) turns out to be
inconsistent for fixed $T$ as $n$ grows, so in this case it is
advisable to use the `"arellano"` version (see @STOC:WATS:08).

The errors may be weighted according to the schemes proposed by
@MACK:WHIT:85 and @CRIB:04 to improve small-sample performance^[The
HC3 and HC4 weighting schemes are computationally expensive and may
hit memory limits for $nT$ in the thousands, where on the other hand
it makes little sense to apply small sample corrections.].

The main use of `vcovHC` (and of the other variance-covariance estimators
provided in the package: `vcovBK`, `vcovNW`, `vcovDC`, `vcovSCC`) is to pass
it to `plm`'s own functions like `summary`, `pwaldtest`, and `phtest`, or to
use it together with testing functions from the `lmtest` and `car` packages.
All of these typically allow
passing the `vcov` or `vcov.` parameter either as a matrix or as a function
(see also @ZEIL:04). If one
is happy with the defaults, it is easiest to pass the function
itself^[For `coeftest` set `df = Inf` to have the coefficients' tests
be performed with standard normal distribution instead of t
distribution as we deal with a random effects model here. For these
types of models, the precise distribution of the coefficient
estimates is unknown.]:

```{r vcovHC1}
re <- plm(inv~value+capital, data = Grunfeld, model = "random")
summary(re, vcov = vcovHC) # gives usual summary output but with robust test statistics

library("lmtest")
coeftest(re, vcovHC, df = Inf)
```


else one may do the covariance computation inside the call, thus passing on a matrix:

```{r vcovHC2, results='hide'}
summary(re, vcov = vcovHC(re, method="white2", type="HC3"), df = Inf)
coeftest(re, vcovHC(re, method="white2", type="HC3"), df = Inf)
```

For some tests, e.g., for multiple model comparisons by `waldtest`, one
should always provide a function^[Joint zero-restriction testing still
allows providing the `vcov` of the unrestricted model as a matrix, see
the documentation of package `lmtest`.]. In this case, optional
parameters are provided as shown below (see also @ZEIL:04, p. 12):

```{r waldtest-vcovHC}
waldtest(re, update(re, . ~ . -capital),
         vcov=function(x) vcovHC(x, method="white2", type="HC3"))
```

Moreover, `linearHypothesis` from package `car` may be used to test
for linear restrictions:

```{r car-vcovHC}
library("car")
linearHypothesis(re, "2*value=capital", vcov. = vcovHC)
```

Specific methods are also provided for `pcce` and `pgmm` objects; for the
latter, `vcovHC` provides the robust covariance matrix proposed by
@WIND:05 for generalized method of moments estimators.


# plm versus nlme and lme4{#nlme}

The models termed *panel* by econometricians have counterparts in
the statistics literature on *mixed* models (or *hierarchical models*,
or *models for longitudinal data*), although there are both
differences in jargon and more substantial distinctions.
This language
inconsistency between the two communities, together with the more
complicated general structure of statistical models for longitudinal
data and the associated notation in the software, is likely to scare
some practicing econometricians away from some potentially useful
features of the `R` environment, so it may be useful to provide here a
brief reconciliation between the typical panel data specifications
used in econometrics and the general framework used in statistics for
mixed models^[This discussion does not consider GMM models. One of the
basic reasons for econometricians not to choose maximum likelihood
methods in estimation is that the strict exogeneity of regressors
assumption required for consistency of the ML models reported in the
following is often inappropriate in economic settings.].

`R` is particularly strong on mixed models' estimation, thanks to the
long-standing `nlme` package (see @PINH:BATE:DEBR:SARK:07) and the
more recent `lme4` package, based on S4 classes (see @BATE:07)^[The
standard reference on the subject of mixed models in `S`/`R` is
@PINH:BATE:00.]. In the following we will refer to the more
established `nlme` to give some examples of "econometric" panel models
that can be estimated in a likelihood framework, also including some
likelihood ratio tests. Some of them are not feasible in `plm` and
make a useful complement to the econometric "toolbox" available in
`R`.

## Fundamental differences between the two approaches

Econometrics deals mostly with non-experimental data. Great emphasis
is put on specification procedures and misspecification testing. Model
specifications tend therefore to be very simple, while great attention
is paid to the issues of endogeneity of the regressors, dependence
structures in the errors and robustness of the estimators under
deviations from normality. The preferred approach is often semi- or
non-parametric, and heteroskedasticity-consistent techniques are
becoming standard practice both in estimation and testing.

For all these reasons, although the maximum likelihood framework is
important in testing^[Lagrange Multiplier tests based on the
likelihood principle are suitable for testing against more general
alternatives on the basis of a maintained model with spherical
residuals and find therefore application in testing for departures
from the classical hypotheses on the error term. The seminal reference
is @BREU:PAGA:80.] and sometimes used in estimation as well, panel
model estimation in econometrics is mostly accomplished in the
generalized least squares framework based on Aitken's Theorem and,
when possible, in its special case OLS, which are free from
distributional assumptions (although these kick in at the diagnostic
testing stage). On the contrary, longitudinal data models in `nlme`
and `lme4` are estimated by (restricted or unrestricted) maximum
likelihood. While under normality, homoskedasticity and no serial
correlation of the errors OLS is also the maximum likelihood
estimator, in all other cases there are important differences.

The econometric GLS approach has closed-form analytical solutions
computable by standard linear algebra, and although the latter can
sometimes get computationally heavy on the machine, the expressions
for the estimators are usually rather simple.
ML estimation of
longitudinal models, on the contrary, is based on numerical
optimization of nonlinear functions without closed-form solutions and
is thus dependent on approximations and convergence criteria. For
example, the "GLS" functionality in `nlme` is rather different
from its "econometric" counterpart. "Feasible GLS" estimation in
`plm` is based on a single two-step procedure, in which an
inefficient but consistent estimation method (typically OLS) is
employed first in order to get a consistent estimate of the errors'
covariance matrix, to be used in GLS at the second step; by
contrast, "GLS" estimators in `nlme` are based on iterating
two-step optimization of the relevant likelihood until convergence.

## Some false friends

The *fixed/random effects* terminology in econometrics is often
recognized to be misleading, as both are treated as random variates in
modern econometrics (see, e.g., @WOOL:02 10.2.1). It has been
recognized since Mundlak's classic paper (@MUND:78) that the
fundamental issue is whether the unobserved effects are correlated
with the regressors or not. In the latter case, they can safely be left
in the error term, and the serial correlation they induce is cared for
by means of appropriate GLS transformations. On the contrary, in the
case of correlation, "fixed effects" methods such as least squares
dummy variables or time-demeaning are needed, which explicitly,
although inconsistently^[For fixed effects estimation, as the sample
grows (on the dimension on which the fixed effects are specified) so
does the number of parameters to be estimated. Estimation of
individual fixed effects is $T$-- (but not $n$--) consistent, and
vice versa for time effects.], estimate a time-invariant additional
parameter for each group (or a group-invariant one for each time period).

Thus, from the point of view of model specification, having
*fixed effects* in an econometric model means allowing the
intercept to vary with group, or time, or both, while the other
parameters are generally still assumed to be homogeneous. Having
*random effects* means having a group-- (or time--, or both) specific
component in the error term.

In the mixed models literature, on the contrary, *fixed effect*
indicates a parameter that is assumed constant, while
*random effects* are parameters that vary randomly around zero
according to a joint multivariate normal distribution.

So, the FE model in econometrics has no counterpart in the
mixed models framework, unless reducing it to OLS on a
specification with one dummy for each group (often termed
*least squares dummy variables*, or LSDV model) which can
trivially be estimated by OLS. The RE model is
instead a special case of a mixed model where only the intercept is
specified as a random effect, while the "random" type variable
coefficients model can be seen as one that has the same regressors in
the fixed and random sets.
The unrestricted generalized least squares
can in turn be seen, in the `nlme` framework, as a standard linear
model with a general error covariance structure within the groups and
errors uncorrelated across groups.



## A common taxonomy

To reconcile the two terminologies, in the following we report the
specification of the panel models in `plm` according to the general
expression of a mixed model in Laird-Ware form [see the web appendix
to @FOX:02] and the `nlme` estimation commands for maximum likelihood
estimation of an equivalent specification^[In doing so, we stress that
"equivalence" concerns only the specification of the model, and
neither the appropriateness nor the relative efficiency of the
relevant estimation techniques, which will of course be dependent on
the context. Unlike their mixed model counterparts, the specifications
in `plm` are, strictly speaking, distribution-free. Nevertheless, for
the sake of exposition, in the following we present them in the
setting which ensures consistency and efficiency (e.g., we consider
the hypothesis of spherical errors part of the specification of pooled
OLS and so forth).].

### The Laird-Ware representation for mixed models

A general representation for the linear mixed effects model is given
in @LAIR:WARE:82:

$$
\begin{array}{rcl}
y_{ij} & = & \beta_1 x_{1ij} + \dots + \beta_p x_{pij} \\
       &   & +\; b_{i1} z_{1ij} + \dots + b_{iq} z_{qij} + \epsilon_{ij} \\
b_{ik} & \sim & N(0,\psi^2_k), \phantom{p} Cov(b_{ik},b_{ik'}) = \psi_{kk'} \\
\epsilon_{ij} & \sim & N(0,\sigma^2 \lambda_{ijj}), \phantom{p} Cov(\epsilon_{ij},\epsilon_{ij'}) = \sigma^2 \lambda_{ijj'} \\
\end{array}
$$

where the $x_1, \dots, x_p$ are the fixed effects regressors and the
$z_1, \dots, z_q$ are the random effects regressors, assumed to be
normally distributed across groups. The covariance of
the random effects coefficients $\psi_{kk'}$ is assumed constant
across groups and the covariances between the errors in group $i$,
$\sigma^2 \lambda_{ijj'}$, are described by the term $\lambda_{ijj'}$
representing the correlation structure of the errors within each group
(e.g., serial correlation over time) scaled by the common error
variance $\sigma^2$.

### Pooling and Within

The *pooling* specification in `plm` is equivalent to a classical
linear model (i.e., no random effects regressor and spherical errors:
$b_{iq}=0 \phantom{p} \forall i,q$, and $\lambda_{ijj'}=1$
for $j=j'$, $0$ else). The *within* one is the same with the
regressors' set augmented by $n-1$ group dummies. There is no point in
using `nlme` as parameters can be estimated by OLS, which is also the
ML estimator.

### Random effects

In the Laird and Ware notation, the RE specification is a model with
only one random effects regressor: the intercept. Formally, $z_{1ij}=1
\phantom{p} \forall i,j$, $z_{qij}=0 \phantom{p} \forall i,
\forall j, \forall q \neq 1$, and $\lambda_{ijj'}=1$ for $j=j'$, $0$
else. The composite error is therefore $u_{ij}=1 b_{i1} +
\epsilon_{ij}$.
Below we report the coefficients of Grunfeld's model
estimated by GLS and then by ML:

```{r re2}
library(nlme)
reGLS <- plm(inv~value+capital, data=Grunfeld, model="random")

reML <- lme(inv~value+capital, data=Grunfeld, random=~1|firm)

coef(reGLS)

summary(reML)$coefficients$fixed
```

### Variable coefficients, "random"

Swamy's variable coefficients model [@SWAM:70] has coefficients
varying randomly (and independently of each other) around a set of
fixed values, so the equivalent specification is $z_{q}=x_{q}
\phantom{p} \forall q$, i.e. the fixed effects and the random effects
regressors are the same, $\psi_{kk'}=\sigma_\mu^2 I_N$, and
$\lambda_{ijj}=1$, $\lambda_{ijj'}=0$ for $j \neq j'$, that is to say,
the errors are not serially correlated.

Estimation of a mixed model with random coefficients on all regressors
is rather demanding from the computational side. Some models from our
examples fail to converge. The example below is estimated on the
Grunfeld data and model with time effects.

```{r vcmrand}
vcm <- pvcm(inv~value+capital, data=Grunfeld, model="random", effect="time")

vcmML <- lme(inv~value+capital, data=Grunfeld, random=~value+capital|year)

coef(vcm)

summary(vcmML)$coefficients$fixed
```

### Variable coefficients, "within"

This specification actually entails separate estimation of $T$
different standard linear models, one for each time period (group) in
the data, so the estimation approach is the same: OLS. In `nlme` this
is done by creating an `lmList` object, so that the two models below are
equivalent (output suppressed):


```{r vcmfixed}
vcmf <- pvcm(inv~value+capital, data=Grunfeld, model="within", effect="time")

vcmfML <- lmList(inv~value+capital|year, data=Grunfeld)
```

### General FGLS

The general, or unrestricted, feasible GLS (FGLS), `pggls` in the
`plm` nomenclature, is equivalent to a model with no random effects
regressors ($b_{iq}=0 \phantom{p} \forall i,q$) and an error
covariance structure which is unrestricted within groups apart from
the usual requirements. The function for estimating such models with
correlation in the errors but no random effects is `gls()`.

This very general serial correlation and heteroskedasticity structure
is not estimable for the original Grunfeld data, which have more time
periods than firms; therefore, we restrict the sample to firms 4 to 6.

```{r gglsre}
sGrunfeld <- Grunfeld[Grunfeld$firm %in% 4:6, ]

ggls <- pggls(inv~value+capital, data=sGrunfeld, model="pooling")

gglsML <- gls(inv~value+capital, data=sGrunfeld,
              correlation=corSymm(form=~1|year))

coef(ggls)

summary(gglsML)$coefficients
```

The *within* case is analogous, with the regressor set augmented
by $n-1$ group dummies.

## Some useful "econometric" models in nlme

Finally, amongst the many possible specifications estimable with
`nlme`, we report a couple of cases that might be especially
interesting to applied econometricians.

### AR(1) pooling or random effects panel

Linear models with groupwise structures of time-dependence^[Take heed
that here, in contrast to the usual meaning of serial correlation in
time series, we always speak of serial correlation *between the errors
of each group*.]
may be fitted by `gls()`, specifying the correlation
structure in the `correlation` option^[Note that the time index is
coerced to numeric before the estimation.]:


```{r lmAR1}
Grunfeld$year <- as.numeric(as.character(Grunfeld$year))
lmAR1ML <- gls(inv~value+capital, data=Grunfeld,
               correlation=corAR1(0, form=~year|firm))
```

and analogously the random effects panel with, e.g., AR(1) errors
(see @BALT:05; @BALT:13; @BALT:21, ch. 5), which is a very common specification
in econometrics, may be fitted by `lme`, specifying an additional
random intercept:

```{r reAR1}
reAR1ML <- lme(inv~value+capital, data=Grunfeld, random=~1|firm,
               correlation=corAR1(0, form=~year|firm))
```

The regressors' coefficients and the error's serial correlation
coefficient may be retrieved this way:

```{r fetchcoefs}
summary(reAR1ML)$coefficients$fixed
coef(reAR1ML$modelStruct$corStruct, unconstrained=FALSE)
```

Significance statistics for the regressors' coefficients are to be
found in the usual `summary` object, while to get the significance
test of the serial correlation coefficient one can do a likelihood
ratio test as shown in the following.

### An LR test for serial correlation and one for random effects

A likelihood ratio test for serial correlation in the idiosyncratic
residuals can be done as a nested models test, by `anova()`,
comparing the model with spherical idiosyncratic residuals with the
more general alternative featuring AR(1) residuals. The test takes the
form of a zero restriction test on the autoregressive parameter.

This can be done on pooled or random effects models alike. First we
report the simpler case.

We already estimated the pooling AR(1) model above. The GLS model
without correlation in the residuals is the same as OLS, and one could
well use `lm()` for the restricted model. Here we estimate it
by `gls()`:


```{r LRar}
lmML <- gls(inv~value+capital, data=Grunfeld)
anova(lmML, lmAR1ML)
```


The AR(1) test on the random effects model is to be done in much the
same way, using the random effects model objects estimated above:

```{r LRarsubRE}
anova(reML, reAR1ML)
```

A likelihood ratio test for random effects compares the specifications
with and without random effects and spherical idiosyncratic errors:


```{r LRre}
anova(lmML, reML)
```


The random effects, AR(1) errors model in turn nests the AR(1) pooling
model, therefore a likelihood ratio test for random effects sub AR(1)
errors may be carried out, again, by comparing the two autoregressive
specifications:

```{r LRresubAR}
anova(lmAR1ML, reAR1ML)
```

whence we see that the Grunfeld model specification doesn't seem to need any
random effects once we control for serial correlation in the data.


# Conclusions{#conclusions}

With `plm` we aim at providing a comprehensive package containing the
standard functionalities that are needed for the management and the
econometric analysis of panel data.
In particular, we provide:
functions for data transformation; estimators for pooled, random and
fixed effects static panel models and variable coefficients models,
general GLS for general covariance structures, and generalized method
of moments estimators for dynamic panels; and specification and diagnostic
tests. Instrumental variables estimation is supported. Most estimators
allow working with unbalanced panels. While among the different
approaches to longitudinal data analysis we take the perspective of
the econometrician, the syntax is consistent with the basic linear
modeling tools, like the `lm` function.

On the input side, `formula` and `data` arguments are
used to specify the model to be estimated. Special functions are
provided to make writing formulas easier, and the structure of the
data is indicated with an `index` argument.

On the output side, the model objects (of the new class
`panelmodel`) are compatible with the general restriction
testing frameworks of packages `lmtest` and `car`.
Specialized methods are also provided for the calculation of robust
covariance matrices; heteroskedasticity- and correlation-consistent
testing is accomplished by passing these on to testing functions,
together with a `panelmodel` object.

The main functionalities of the package have been illustrated here by
applying them to some well-known data sets from the econometric
literature. The similarities and differences with the maximum
likelihood approach to longitudinal data have also been briefly
discussed.

We plan to expand the methods in this paper to systems of equations
and to the estimation of models with autoregressive errors. The addition
of covariance estimators robust vs. cross-sectional correlation is
also in the offing. Lastly, conditional visualization features in the
`R` environment seem to offer a promising toolbox for visual
diagnostics, which is another subject for future work.


# Acknowledgments {-}

While retaining responsibility for any error, we thank Jeffrey
Wooldridge, Achim Zeileis and three anonymous referees for useful
comments. We also acknowledge kind editing assistance by Lisa
Benedetti.

# Bibliography {-}