\chapter{Dynamic panel models}
\label{chap:dpanel}

\newcommand{\by}{\mathbf{y}}
\newcommand{\bx}{\mathbf{x}}
\newcommand{\bv}{\mathbf{v}}
\newcommand{\bX}{\mathbf{X}}
\newcommand{\bW}{\mathbf{W}}
\newcommand{\bZ}{\mathbf{Z}}
\newcommand{\bA}{\mathbf{A}}
\newcommand{\bM}{\mathbf{M}}
\newcommand{\biota}{\bm{\iota}}

\newenvironment%
{altcode}%
{\vspace{1ex}\small\leftmargin 1em}{\vspace{1ex}}

The command for estimating dynamic panel models in gretl is
\texttt{dpanel}. This command supports both the ``difference''
estimator \citep{arellano-bond91} and the ``system'' estimator
\citep{blundell-bond98}, which has become the method of choice in the
applied literature.

\section{Introduction}
\label{sec:dpanel-intro}

\subsection{Notation}
\label{sec:dpanel-notation}

A dynamic linear panel data model can be represented as follows
(in notation based on \cite{arellano03}):
\begin{equation}
  \label{eq:dpd-def}
  y_{it} = \alpha y_{i,t-1} + \beta'x_{it} + \eta_{i} + v_{it}
\end{equation}
where $i=1,2,\ldots,N$ indexes the cross-section units and $t$ indexes
time.

The main idea behind the difference estimator is to sweep out the
individual effect via differencing. First-differencing eq.\
(\ref{eq:dpd-def}) yields
\begin{equation}
  \label{eq:dpd-dif}
  \Delta y_{it} = \alpha \Delta y_{i,t-1} + \beta'\Delta x_{it} +
  \Delta v_{it} = \gamma' W_{it} + \Delta v_{it} ,
\end{equation}
where $\gamma = (\alpha, \beta')'$ and $W_{it}$ stacks
$\Delta y_{i,t-1}$ and $\Delta x_{it}$. The error term of
(\ref{eq:dpd-dif}) is, by
construction, autocorrelated and also correlated with the lagged
dependent variable, so an estimator that takes both issues into
account is needed. The endogeneity issue is solved by noting that all
values of $y_{i,t-k}$ with $k>1$ can be used as instruments for
$\Delta y_{i,t-1}$: unobserved values of $y_{i,t-k}$ (whether missing
or pre-sample) can safely be substituted with 0.
In the language of
GMM, this amounts to using the relation
\begin{equation}
  \label{eq:OC-dif}
  E(\Delta v_{it} \cdot y_{i,t-k}) = 0, \quad k>1
\end{equation}
as an orthogonality condition.

Autocorrelation is dealt with by noting that if $v_{it}$ is white
noise, the covariance matrix of the vector whose typical element is
$\Delta v_{it}$ is proportional to a matrix $H$ that has 2 on the main
diagonal, $-1$ on the first sub- and superdiagonals and 0 elsewhere.
One-step
GMM estimation of equation (\ref{eq:dpd-dif}) amounts to computing
\begin{equation}
\label{eq:dif-gmm}
  \hat{\gamma} = \left[
    \left( \sum_i \bW_i'\bZ_i \right) \bA_N
    \left( \sum_i \bZ_i'\bW_i \right) \right]^{-1}
    \left( \sum_i \bW_i'\bZ_i \right) \bA_N
    \left( \sum_i \bZ_i'\Delta \by_i \right)
\end{equation}
where
\begin{align*}
  \Delta \by_i & =
  \left[ \begin{array}{ccc}
     \Delta y_{i,3} & \cdots & \Delta y_{i,T}
   \end{array} \right]' \\
  \bW_i & =
  \left[ \begin{array}{ccc}
     \Delta y_{i,2} & \cdots & \Delta y_{i,T-1} \\
     \Delta x_{i,3} & \cdots & \Delta x_{i,T} \\
   \end{array} \right]' \\
  \bZ_i & =
  \left[ \begin{array}{ccccccc}
     y_{i1} & 0 & 0 & \cdots & 0 & \Delta x_{i3}\\
     0 & y_{i1} & y_{i2} & \cdots & 0 & \Delta x_{i4}\\
     & & \vdots \\
     0 & 0 & 0 & \cdots & y_{i, T-2} & \Delta x_{iT} \\
   \end{array} \right]' \\
  \intertext{and}
  \bA_N & = \left( \sum_i \bZ_i' H \bZ_i \right)^{-1}
\end{align*}

Once the 1-step estimator is computed, the sample covariance matrix of
the estimated residuals can be used instead of $H$ to obtain 2-step
estimates, which are not only consistent but asymptotically
efficient. (In principle the process may be iterated, but this is
seldom done in practice.) Standard GMM theory applies, except for one
thing:
\cite{Windmeijer05} has computed finite-sample corrections to the
asymptotic covariance matrix of the parameters, which are nowadays
almost universally used.
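
To make the structure of $H$ concrete, consider as an illustration a
unit with $T=5$, so that $\Delta \by_i$ has three elements. Assuming,
as above, that $v_{it}$ is white noise with variance $\sigma^2_v$, we
have $\mathrm{Var}(\Delta v_{it}) = 2\sigma^2_v$ and
$\mathrm{Cov}(\Delta v_{it}, \Delta v_{i,t-1}) = -\sigma^2_v$, with all
covariances at longer lags equal to zero, so that
\[
  \mathrm{Var}\left[ \begin{array}{c}
    \Delta v_{i3} \\ \Delta v_{i4} \\ \Delta v_{i5}
  \end{array} \right] = \sigma^2_v H , \qquad
  H = \left[ \begin{array}{rrr}
     2 & -1 &  0 \\
    -1 &  2 & -1 \\
     0 & -1 &  2
  \end{array} \right] .
\]
The scale factor $\sigma^2_v$ cancels in (\ref{eq:dif-gmm}), which is
why it need not be known for 1-step estimation.
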

The difference estimator is consistent, but has been shown to have
poor properties in finite samples when $\alpha$ is near one. Most
practitioners now prefer the so-called ``system'' estimator, which
complements the differenced data (with lagged levels used as
instruments) with data in levels (using lagged differences as
instruments). The system estimator relies on an extra orthogonality
condition which has to do with the earliest value of the dependent
variable $y_{i,1}$. The interested reader is referred to \citet[pp.\
124--125]{blundell-bond98} for details, but here it suffices to say
that this condition is satisfied in mean-stationary models and brings
an improvement in efficiency that may be substantial in many cases.

The set of orthogonality conditions exploited in the system approach
is not very much larger than with the difference estimator since most
of the possible orthogonality conditions associated with the equations
in levels are redundant, given those already used for the equations in
differences.
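
As a sketch of what this extra condition looks like in the pure
autoregressive case (stated here in the standard form found in the
literature, not as a description of gretl's internals), the equations
in levels use
\[
  E\left[ \Delta y_{i,t-1} \left( \eta_i + v_{it} \right) \right] = 0,
  \quad t = 3, \ldots, T ,
\]
that is, one lagged difference per period as instrument for the
levels equation at time $t$. Conditions based on earlier differences
add nothing once the conditions (\ref{eq:OC-dif}) for the differenced
equations are in use.
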

The key equations of the system estimator can be written as

\begin{equation}
\label{eq:sys-gmm}
  \tilde{\gamma} = \left[
    \left( \sum_i \tilde{\bW}_i'\tilde{\bZ}_i \right) \bA_N
    \left( \sum_i \tilde{\bZ}_i'\tilde{\bW}_i \right) \right]^{-1}
    \left( \sum_i \tilde{\bW}_i'\tilde{\bZ}_i \right) \bA_N
    \left( \sum_i \tilde{\bZ}_i'\Delta \tilde{\by}_i \right)
\end{equation}
where
\begin{align*}
  \Delta \tilde{\by}_i & =
  \left[ \begin{array}{ccccccc}
     \Delta y_{i3} & \cdots & \Delta y_{iT} & y_{i3} & \cdots & y_{iT}
   \end{array} \right]' \\
  \tilde{\bW}_i & =
  \left[ \begin{array}{cccccc}
     \Delta y_{i2} & \cdots & \Delta y_{i,T-1} & y_{i2} & \cdots & y_{i,T-1} \\
     \Delta x_{i3} & \cdots & \Delta x_{iT} & x_{i3} & \cdots & x_{iT} \\
   \end{array} \right]' \\
  \tilde{\bZ}_i & =
  \left[ \begin{array}{ccccccccc}
     y_{i1} & 0 & 0 & \cdots & 0 & 0 & \cdots & 0 & \Delta x_{i,3}\\
     0 & y_{i1} & y_{i2} & \cdots & 0 & 0 & \cdots & 0 & \Delta x_{i,4}\\
     & & \vdots \\
     0 & 0 & 0 & \cdots & y_{i, T-2} & 0 & \cdots & 0 & \Delta x_{iT}\\
     & & \vdots \\
     0 & 0 & 0 & \cdots & 0 & \Delta y_{i2} & \cdots & 0 & x_{i3}\\
     & & \vdots \\
     0 & 0 & 0 & \cdots & 0 & 0 & \cdots & \Delta y_{i,T-1} & x_{iT}\\
   \end{array} \right]' \\
  \intertext{and}
  \bA_N & = \left( \sum_i \tilde{\bZ}_i' H^* \tilde{\bZ}_i \right)^{-1}
\end{align*}

In this case choosing a precise form for the matrix $H^*$ for the
first step is no trivial matter. Its north-west block should be as
similar as possible to the covariance matrix of the vector $\Delta
v_{it}$, so the same choice as the ``difference'' estimator is
appropriate.
Ideally, the south-east block should be proportional to
the covariance matrix of the vector $\biota \eta_i + \bv$, that is
$\sigma^2_{v} I + \sigma^2_{\eta} \biota \biota'$; but since
$\sigma^2_{\eta}$ is unknown and any positive definite matrix renders
the estimator consistent, people just use $I$. The off-diagonal blocks
should, in principle, contain the covariances between $\Delta v_{is}$
and $v_{it}$, which would be an identity matrix if $v_{it}$ is white
noise. However, since the south-east block is typically given a
conventional value anyway, the benefit in making this choice is not
obvious. Some packages use $I$; others use a zero matrix.
Asymptotically, it should not matter, but on real datasets the
difference between the resulting estimates can be noticeable.

\subsection{Rank deficiency}
\label{sec:rankdef}

Both the difference estimator (\ref{eq:dif-gmm}) and the system
estimator (\ref{eq:sys-gmm}) depend for their existence on the
invertibility of $\bA_N$. This matrix may turn out to be singular for
several reasons. However, this does not mean that the estimator is not
computable: in some cases, adjustments are possible such that the
estimator does exist, but the user should be aware that in these cases
not all software packages use the same strategy and replication of
results may prove difficult or even impossible.

A first reason why $\bA_N$ may be singular could be the unavailability
of instruments, chiefly because of missing observations. This case is
easy to handle. If a particular row of $\tilde{\bZ}_i$ is zero for all
units, the corresponding orthogonality condition (or the corresponding
instrument if you prefer) is automatically dropped; of course, the
overidentification rank is adjusted for testing purposes.

Even if no instruments are zero, however, $\bA_N$ could be rank
deficient.
A trivial case occurs if there are collinear instruments,
but a less trivial case may arise when $T$ (the total number of time
periods available) is not much smaller than $N$ (the number of units),
as, for example, in some macro datasets where the units are
countries. The total number of potentially usable orthogonality
conditions is $O(T^2)$, which may well exceed $N$ in some cases. Since
$\bA_N$ is built from the sum of $N$ matrices which have, at most,
rank $2T - 3$, it could well happen that the sum is singular.

In all these cases, what we consider the ``proper'' way to go is to
substitute the pseudo-inverse of $\bA_N$ (Moore--Penrose) for its regular
inverse. Again, our choice is shared by some software packages, but
not all, so replication may be hard.

\subsection{Covariance matrix and standard errors}

By default the standard errors shown by \texttt{dpanel} for 1-step
estimation are robust, based on the heteroskedasticity-consistent
variance estimator
\[
  \widehat{\rm Var}(\hat{\gamma}) =
  \bM^{-1} \left(\sum_i\bW_i'\bZ_i\right)
  \bA_N\hat{\mathbf{V}}_N\bA_N
  \left(\sum_i\bZ_i'\bW_i\right) \bM^{-1}
 \]
 where $\bM = (\sum_i\bW_i'\bZ_i) \bA_N (\sum_i\bZ_i'\bW_i)$ and
 $\hat{\mathbf{V}}_N = N^{-1} \sum_i
 \bZ_i'\hat{\mathbf{u}}_i\hat{\mathbf{u}}_i'\bZ_i$, with
 $\hat{\mathbf{u}}_i$ the vector of residuals in differences for
 individual $i$. In addition, as noted above, the variance estimator
 for 2-step estimation employs the finite-sample correction of
 \cite{Windmeijer05}.

 When the \verb|--asymptotic| option is passed to \texttt{dpanel},
 however, the 1-step variance estimator is simply
 $\hat{\sigma}_u^2 \bM^{-1}$, which is not
 heteroskedasticity-consistent, and the Windmeijer correction is not
 applied for 2-step estimation.
Use of the asymptotic option is not
 recommended unless you wish to replicate prior results that did not
 report robust standard errors. In particular, tests based on the
 asymptotic 2-step variance estimator are known to over-reject quite
 substantially (standard errors too small).

\subsection{Treatment of missing values}

Textbooks seldom bother with missing values, but in some cases their
treatment may be far from obvious. This is especially true if missing
values are interspersed between valid observations. For example,
consider the plain difference estimator with one lag, so
\[
y_t = \alpha y_{t-1} + \eta + \epsilon_t
\]
where the $i$ index is omitted for clarity. Suppose you have an
individual with $t=1,\ldots,5$, for which $y_3$ is missing. It may seem
that the data for this individual are unusable, because
differencing $y_t$ would produce something like
\[
\begin{array}{c|ccccc}
  t & 1 & 2 & 3 & 4 & 5 \\
  \hline
  y_t & * & * & \circ & * & * \\
  \Delta y_t & \circ & * & \circ & \circ & *
\end{array}
\]
where $*$ = nonmissing and $\circ$ = missing. Estimation seems to be
infeasible, since there are no periods in which $\Delta y_t$ and
$\Delta y_{t-1}$ are both observable.

However, we can use a $k$-difference operator and get
\[
\Delta_k y_t = \alpha \Delta_k y_{t-1} + \Delta_k \epsilon_t
\]
where $\Delta_k = 1 - L^k$ and past levels of $y_t$ are perfectly
valid instruments. In this example, we can choose $k=3$ and use $y_1$
as an instrument, so this unit is in fact perfectly usable.

Not all software packages seem to be aware of this possibility, so
replicating published results may prove tricky if your dataset
contains individuals with gaps between valid observations.
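
To spell out the example above (an illustrative calculation, again
omitting the $i$ index): with $y_3$ missing and $k=3$, the only usable
pair is $\Delta_3 y_5 = y_5 - y_2$ and $\Delta_3 y_4 = y_4 - y_1$.
Subtracting the equation for $y_2$ from that for $y_5$ eliminates
$\eta$ and gives
\[
  y_5 - y_2 = \alpha \left( y_4 - y_1 \right) +
  \left( \epsilon_5 - \epsilon_2 \right) ,
\]
in which every term is observed. Since $y_1$ is determined before
$\epsilon_2$ and $\epsilon_5$ are realized, $E\left[ y_1 \left(
\epsilon_5 - \epsilon_2 \right) \right] = 0$ and $y_1$ can serve as
the instrument for $y_4 - y_1$.
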

\section{Usage}
\label{sec:dpanel-usage}

One of the concepts underlying the syntax of \texttt{dpanel} is that
you get default values for several choices you may want to make, so
that in a ``standard'' situation the command is very concise. The
simplest case of the model (\ref{eq:dpd-def}) is a plain AR(1)
process:
\begin{equation}
\label{eq:dp1}
  y_{i,t} = \alpha y_{i,t-1} + \eta_{i} + v_{it} .
\end{equation}
If you give the command
\begin{code}
  dpanel 1 ; y
\end{code}
gretl assumes that you want to estimate (\ref{eq:dp1}) via the
difference estimator (\ref{eq:dif-gmm}), using as many orthogonality
conditions as possible. The scalar \texttt{1} between \texttt{dpanel}
and the semicolon indicates that only one lag of \texttt{y} is
included as an explanatory variable; using \texttt{2} would give an
AR(2) model. The syntax that gretl uses for the non-seasonal AR and MA
lags in an ARMA model is also supported in this context. For
example, if you want the first and third lags of \texttt{y} (but not
the second) included as explanatory variables you can say
\begin{code}
  dpanel {1 3} ; y
\end{code}
or you can use a pre-defined matrix for this purpose:
\begin{code}
  matrix ylags = {1, 3}
  dpanel ylags ; y
\end{code}
To use a single lag of \texttt{y} other than the first you need to
employ this mechanism:
\begin{code}
  dpanel {3} ; y     # only lag 3 is included
  dpanel 3 ; y       # compare: lags 1, 2 and 3 are used
\end{code}

To use the system estimator instead, you add the \verb|--system|
option, as in
\begin{code}
  dpanel 1 ; y --system
\end{code}
The level orthogonality conditions and the corresponding instruments
are appended automatically (see eq.\ \ref{eq:sys-gmm}).

\subsection{Regressors}

If we want to introduce additional regressors, we list them after the
dependent variable in the same way as other gretl commands, such as
\texttt{ols}. For the difference orthogonality relations,
\texttt{dpanel} takes care of transforming the regressors in parallel
with the dependent variable.

One case of potential ambiguity is when an intercept is specified but
the difference-only estimator is selected, as in
\begin{code}
  dpanel 1 ; y const
\end{code}
In this case the default \texttt{dpanel} behavior, which agrees with
Stata's \texttt{xtabond2}, is to drop the constant (since differencing
reduces it to nothing but zeros). However, for compatibility with the
DPD package for Ox, you can give the option \verb|--dpdstyle|, in
which case the constant is retained (equivalent to including a linear
trend in equation~\ref{eq:dpd-def}). A similar point applies to the
period-specific dummy variables which can be added in \texttt{dpanel}
via the \verb|--time-dummies| option: in the differences-only case
these dummies are entered in differenced form by default, but when the
\verb|--dpdstyle| switch is applied they are entered in levels.

The standard gretl syntax applies if you want to use lagged
explanatory variables, so for example the command
\begin{code}
  dpanel 1 ; y const x(0 to -1) --system
\end{code}
would result in estimation of the model
\[
  y_{it} = \alpha y_{i,t-1} +
  \beta_0 + \beta_1 x_{it} + \beta_2 x_{i,t-1} +
  \eta_{i} + v_{it} .
\]


\subsection{Instruments}

The default rules for instruments are:
\begin{itemize}
\item lags of the dependent variable are instrumented using all
  available orthogonality conditions; and
\item additional regressors are considered exogenous, so they are used
  as their own instruments.
\end{itemize}

If a different policy is wanted, the instruments should be specified
in an additional list, separated from the regressors list by a
semicolon. The syntax closely mirrors that for the \texttt{tsls}
command, but in this context it is necessary to distinguish between
``regular'' instruments and what are often called ``GMM-style''
instruments (that is, instruments that are handled in the same
block-diagonal manner as lags of the dependent variable, as described
above).

``Regular'' instruments are transformed in the same way as
regressors, and the contemporaneous value of the transformed variable
is used to form an orthogonality condition. Since regressors are
treated as exogenous by default, it follows that these two commands
estimate the same model:

\begin{code}
  dpanel 1 ; y z
  dpanel 1 ; y z ; z
\end{code}
The instrument specification in the second case simply confirms what
is implicit in the first: that \texttt{z} is exogenous. Note, though,
that if you have some additional variable \texttt{z2} which you want
to add as a regular instrument, it then becomes necessary to
include \texttt{z} in the instrument list if it is to be treated
as exogenous:
\begin{code}
  dpanel 1 ; y z ; z2     # z is now implicitly endogenous
  dpanel 1 ; y z ; z z2   # z is treated as exogenous
\end{code}

The specification of ``GMM-style'' instruments is handled by the
special constructs \texttt{GMM()} and \texttt{GMMlevel()}. The first
of these relates to instruments for the equations in differences, and
the second to the equations in levels. The syntax for \texttt{GMM()}
The syntax for \texttt{GMM()} 405is 406 407\begin{altcode} 408\texttt{GMM(}\textsl{name}\texttt{,} \textsl{minlag}\texttt{,} 409\textsl{maxlag}\texttt{)} 410\end{altcode} 411 412\noindent 413where \textsl{name} is replaced by the name of a series (or the name 414of a list of series), and \textsl{minlag} and \textsl{maxlag} are 415replaced by the minimum and maximum lags to be used as 416instruments. The same goes for \texttt{GMMlevel()}. 417 418One common use of \texttt{GMM()} is to limit the number of lagged 419levels of the dependent variable used as instruments for the equations 420in differences. It's well known that although exploiting all possible 421orthogonality conditions yields maximal asymptotic efficiency, in 422finite samples it may be preferable to use a smaller subset (but see 423also \cite{OkuiJoE2009}). For example, the specification 424 425\begin{code} 426 dpanel 1 ; y ; GMM(y, 2, 4) 427\end{code} 428ensures that no lags of $y_t$ earlier than $t-4$ will be used as 429instruments. 430 431A second use of \texttt{GMM()} is to exploit more fully the potential 432block-diagonal orthogonality conditions offered by an exogenous 433regressor, or a related variable that does not appear as a regressor. 434For example, in 435 436\begin{code} 437 dpanel 1 ; y x ; GMM(z, 2, 6) 438\end{code} 439the variable \texttt{x} is considered an endogenous regressor, and up to 4405 lags of \texttt{z} are used as instruments. 441 442Note that in the following script fragment 443\begin{code} 444 dpanel 1 ; y z 445 dpanel 1 ; y z ; GMM(z,0,0) 446\end{code} 447the two estimation commands should not be expected to give the same 448result, as the sets of orthogonality relationships are subtly 449different. In the latter case, you have $T-2$ separate orthogonality 450relationships pertaining to $z_{it}$, none of which has any 451implication for the other ones; in the former case, you only have one. 
In terms of the $\bZ_i$ matrix, the first form adds a single row to
the bottom of the instruments matrix, while the second form adds a
diagonal block with $T-2$ columns; that is,
\[
  \left[ \begin{array}{cccc}
     z_{i3} & z_{i4} & \cdots & z_{iT}
   \end{array} \right]
\]
versus
\[
  \left[ \begin{array}{cccc}
     z_{i3} & 0 & \cdots & 0 \\
     0 & z_{i4} & \cdots & 0 \\
     & \ddots & \ddots & \\
     0 & 0 & \cdots & z_{iT}
   \end{array} \right]
\]

\section{Replication of DPD results}
\label{sec:DPD-replic}

In this section we show how to replicate the results of some of the
pioneering work with dynamic panel-data estimators by Arellano, Bond
and Blundell. As the DPD manual \citep*{DPDmanual} explains, it is
difficult to replicate the original published results exactly, for two
main reasons: not all of the data used in those studies are publicly
available; and some of the choices made in the original software
implementation of the estimators have been superseded. Here,
therefore, our focus is on replicating the results obtained using the
current DPD package and reported in the DPD manual.

The examples are based on the program files \texttt{abest1.ox},
\texttt{abest3.ox} and \texttt{bbest1.ox}. These are included in the
DPD package, along with the Arellano--Bond database files
\texttt{abdata.bn7} and \texttt{abdata.in7}.\footnote{See
  \url{http://www.doornik.com/download.html}.} The
Arellano--Bond data are also provided with gretl, in the file
\texttt{abdata.gdt}. In the following we do not show the output from
DPD or gretl; it is somewhat voluminous, and is easily generated by
the user.
As of this writing the results from Ox/DPD and gretl are
identical in all relevant respects for all of the examples
shown.\footnote{To be specific, this is using Ox Console version 5.10,
  version 1.24 of the DPD package, and gretl built from CVS as of
  2010-10-23, all on Linux.}

A complete Ox/DPD program to generate the results of interest takes
this general form:

\begin{code}
#include <oxstd.h>
#import <packages/dpd/dpd>

main()
{
    decl dpd = new DPD();

    dpd.Load("abdata.in7");
    dpd.SetYear("YEAR");

    // model-specific code here

    delete dpd;
}
\end{code}
%
In the examples below we take this template for granted and show just
the model-specific code.

\subsection{Example 1}

The following Ox/DPD code---drawn from \texttt{abest1.ox}---replicates
column (b) of Table 4 in \cite{arellano-bond91}, an instance of the
differences-only or GMM-DIF estimator. The dependent variable is the
log of employment, \texttt{n}; the regressors include two lags of the
dependent variable, current and lagged values of the log real-product
wage, \texttt{w}, the current value of the log of gross capital,
\texttt{k}, and current and lagged values of the log of industry
output, \texttt{ys}. In addition the specification includes a constant
and five year dummies; unlike the stochastic regressors, these
deterministic terms are not differenced. In this specification the
regressors \texttt{w}, \texttt{k} and \texttt{ys} are treated as
exogenous and serve as their own instruments. In DPD syntax this
requires entering these variables twice, on the \verb|X_VAR| and
\verb|I_VAR| lines. The GMM-type (block-diagonal) instruments in this
example are the second and subsequent lags of the level of \texttt{n}.
Both 1-step and 2-step estimates are computed.

\begin{code}
dpd.SetOptions(FALSE); // don't use robust standard errors
dpd.Select(Y_VAR, {"n", 0, 2});
dpd.Select(X_VAR, {"w", 0, 1, "k", 0, 0, "ys", 0, 1});
dpd.Select(I_VAR, {"w", 0, 1, "k", 0, 0, "ys", 0, 1});

dpd.Gmm("n", 2, 99);
dpd.SetDummies(D_CONSTANT + D_TIME);

print("\n\n***** Arellano & Bond (1991), Table 4 (b)");
dpd.SetMethod(M_1STEP);
dpd.Estimate();
dpd.SetMethod(M_2STEP);
dpd.Estimate();
\end{code}

Here is gretl code to do the same job:

\begin{code}
open abdata.gdt
list X = w w(-1) k ys ys(-1)
dpanel 2 ; n X const --time-dummies --asy --dpdstyle
dpanel 2 ; n X const --time-dummies --asy --two-step --dpdstyle
\end{code}

Note that in gretl the switch to suppress robust standard errors is
\verb|--asymptotic|, here abbreviated to \verb|--asy|.\footnote{Option
  flags in gretl can always be truncated, down to the minimal unique
  abbreviation.} The \verb|--dpdstyle| flag specifies that the
constant and dummies should not be differenced, in the context of a
GMM-DIF model. With gretl's \texttt{dpanel} command it is not
necessary to specify the exogenous regressors as their own instruments
since this is the default; similarly, the use of the second and all
longer lags of the dependent variable as GMM-type instruments is the
default and need not be stated explicitly.

\subsection{Example 2}

The DPD file \texttt{abest3.ox} contains a variant of the above that
differs with regard to the choice of instruments: the variables
\texttt{w} and \texttt{k} are now treated as predetermined, and are
instrumented GMM-style using the second and third lags of their
levels. This approximates column (c) of Table 4 in
\cite{arellano-bond91}.
We have modified the code in
\texttt{abest3.ox} slightly to allow the use of robust
(Windmeijer-corrected) standard errors, which are the default in both
DPD and gretl with 2-step estimation:

\begin{code}
dpd.Select(Y_VAR, {"n", 0, 2});
dpd.Select(X_VAR, {"w", 0, 1, "k", 0, 0, "ys", 0, 1});
dpd.Select(I_VAR, {"ys", 0, 1});
dpd.SetDummies(D_CONSTANT + D_TIME);

dpd.Gmm("n", 2, 99);
dpd.Gmm("w", 2, 3);
dpd.Gmm("k", 2, 3);

print("\n***** Arellano & Bond (1991), Table 4 (c)\n");
print(" (but using different instruments!!)\n");
dpd.SetMethod(M_2STEP);
dpd.Estimate();
\end{code}

The gretl code is as follows:

\begin{code}
open abdata.gdt
list X = w w(-1) k ys ys(-1)
list Ivars = ys ys(-1)
dpanel 2 ; n X const ; GMM(w,2,3) GMM(k,2,3) Ivars --time --two-step --dpd
\end{code}
%
Note that since we are now calling for an instrument set other than
the default (following the second semicolon), it is necessary to
include the \texttt{Ivars} specification for the variable \texttt{ys}.
However, it is not necessary to specify \texttt{GMM(n,2,99)} since
this remains the default treatment of the dependent variable.

\subsection{Example 3}

Our third example replicates the DPD output from \texttt{bbest1.ox}:
this uses the same dataset as the previous examples but the model
specifications are based on \cite{blundell-bond98}, and involve
comparison of the GMM-DIF and GMM-SYS (``system'') estimators. The
basic specification is slightly simplified in that the variable
\texttt{ys} is not used and only one lag of the dependent variable
appears as a regressor.
The Ox/DPD code is:

\begin{code}
dpd.Select(Y_VAR, {"n", 0, 1});
dpd.Select(X_VAR, {"w", 0, 1, "k", 0, 1});
dpd.SetDummies(D_CONSTANT + D_TIME);

print("\n\n***** Blundell & Bond (1998), Table 4: 1976-86 GMM-DIF");
dpd.Gmm("n", 2, 99);
dpd.Gmm("w", 2, 99);
dpd.Gmm("k", 2, 99);
dpd.SetMethod(M_2STEP);
dpd.Estimate();

print("\n\n***** Blundell & Bond (1998), Table 4: 1976-86 GMM-SYS");
dpd.GmmLevel("n", 1, 1);
dpd.GmmLevel("w", 1, 1);
dpd.GmmLevel("k", 1, 1);
dpd.SetMethod(M_2STEP);
dpd.Estimate();
\end{code}

Here is the corresponding gretl code:

\begin{code}
open abdata.gdt
list X = w w(-1) k k(-1)
list Z = w k

# Blundell & Bond (1998), Table 4: 1976-86 GMM-DIF
dpanel 1 ; n X const ; GMM(Z,2,99) --time --two-step --dpd

# Blundell & Bond (1998), Table 4: 1976-86 GMM-SYS
dpanel 1 ; n X const ; GMM(Z,2,99) GMMlevel(Z,1,1) \
       --time --two-step --dpd --system
\end{code}

Note the use of the \verb|--system| option flag to specify GMM-SYS,
including the default treatment of the dependent variable, which
corresponds to \texttt{GMMlevel(n,1,1)}. In this case we also want to
use lagged differences of the regressors \texttt{w} and \texttt{k} as
instruments for the levels equations so we need explicit
\texttt{GMMlevel} entries for those variables. If you want something
other than the default treatment for the dependent variable as an
instrument for the levels equations, you should give an explicit
\texttt{GMMlevel} specification for that variable---and in that case
the \verb|--system| flag is redundant (but harmless).

For the sake of completeness, note that if you specify at least one
\texttt{GMMlevel} term, \texttt{dpanel} will then include equations in
levels, but it will not automatically add a default \texttt{GMMlevel}
specification for the dependent variable unless the \verb|--system|
option is given.

\section{Cross-country growth example}
\label{sec:dpanel-growth}

The previous examples all used the Arellano--Bond dataset; for this
example we use the dataset \texttt{CEL.gdt}, which is also included in
the gretl distribution. As with the Arellano--Bond data, there are
numerous missing values. Details of the provenance of the data can be
found by opening the dataset information window in the gretl GUI
(\textsf{Data} menu, \textsf{Dataset info} item). This is a subset of
the Barro--Lee 138-country panel dataset, an approximation to which is
used in \citet*{CEL96} and \citet*{Bond2001}.\footnote{We say an
  ``approximation'' because we have not been able to replicate exactly
  the OLS results reported in the papers cited, though it seems from
  the description of the data in \cite{CEL96} that we ought to be able
  to do so. We note that \cite{Bond2001} used data provided by
  Professor Caselli yet did not manage to reproduce the latter's
  results.} Both of these papers explore the dynamic panel-data
approach in relation to the issues of growth and convergence of per
capita income across countries.

The dependent variable is growth in real GDP per capita over
successive five-year periods; the regressors are the log of the
initial (five years prior) value of GDP per capita, the log-ratio of
investment to GDP, $s$, in the prior five years, and the log of annual
average population growth, $n$, over the prior five years plus 0.05 as
stand-in for the rate of technical progress, $g$, plus the rate of
depreciation, $\delta$ (with the last two terms assumed to be constant
across both countries and periods). The original model is
\begin{equation}
\label{eq:CEL96}
\Delta_5 y_{it} = \beta y_{i,t-5} + \alpha s_{it} + \gamma (n_{it} +
g + \delta) + \nu_t + \eta_i + \epsilon_{it}
\end{equation}
which allows for a time-specific disturbance $\nu_t$.
The Solow model
with Cobb--Douglas production function implies that $\gamma =
-\alpha$, but this assumption is not imposed in estimation. The
time-specific disturbance is eliminated by subtracting the period mean
from each of the series.

Equation (\ref{eq:CEL96}) can be transformed to an AR(1) dynamic
panel-data model by adding $y_{i,t-5}$ to both sides, which gives
\begin{equation}
\label{eq:CEL96a}
y_{it} = (1 + \beta) y_{i,t-5} + \alpha s_{it} + \gamma (n_{it} +
g + \delta) + \eta_i + \epsilon_{it}
\end{equation}
where all variables are now assumed to be time-demeaned.

In (rough) replication of \cite{Bond2001} we now proceed to estimate
the following two models: (a) equation (\ref{eq:CEL96a}) via GMM-DIF,
using as instruments the second and all longer lags of $y_{it}$,
$s_{it}$ and $n_{it} + g + \delta$; and (b) equation
(\ref{eq:CEL96a}) via GMM-SYS, using $\Delta y_{i,t-1}$, $\Delta
s_{i,t-1}$ and $\Delta (n_{i,t-1} + g + \delta)$ as additional
instruments in the levels equations. We report robust standard errors
throughout. (As a purely notational matter, we now use ``$t-1$'' to
refer to values five years prior to $t$, as in \cite{Bond2001}).

The gretl script to do this job is shown below. Note that the final
transformed versions of the variables (logs, with time-means
subtracted) are named \texttt{ly} ($y_{it}$), \texttt{linv} ($s_{it}$)
and \texttt{lngd} ($n_{it} + g + \delta$).
%
\begin{code}
open CEL.gdt

ngd = n + 0.05
ly = log(y)
linv = log(s)
lngd = log(ngd)

# take out time means
loop i=1..8
    smpl (time == i) --restrict --replace
    ly -= mean(ly)
    linv -= mean(linv)
    lngd -= mean(lngd)
endloop

smpl --full
list X = linv lngd
# 1-step GMM-DIF
dpanel 1 ; ly X ; GMM(X,2,99)
# 2-step GMM-DIF
dpanel 1 ; ly X ; GMM(X,2,99) --two-step
# GMM-SYS
dpanel 1 ; ly X ; GMM(X,2,99) GMMlevel(X,1,1) --two-step --sys
\end{code}

For comparison we estimated the same two models using Ox/DPD and the
Stata command \texttt{xtabond2}. (In each case we constructed a
comma-separated values dataset containing the data as transformed in
the gretl script shown above, using a missing-value code appropriate
to the target program.) For reference, the commands used with
Stata are reproduced below:
%
\begin{code}
#delimit ;
insheet using CEL.csv;
tsset unit time;
xtabond2 ly L.ly linv lngd, gmm(L.ly, lag(1 99)) gmm(linv, lag(2 99))
  gmm(lngd, lag(2 99)) rob nolev;
xtabond2 ly L.ly linv lngd, gmm(L.ly, lag(1 99)) gmm(linv, lag(2 99))
  gmm(lngd, lag(2 99)) rob nolev twostep;
xtabond2 ly L.ly linv lngd, gmm(L.ly, lag(1 99)) gmm(linv, lag(2 99))
  gmm(lngd, lag(2 99)) rob nocons twostep;
\end{code}

For the GMM-DIF model all three programs find 382 usable observations
and 30 instruments, and yield identical parameter estimates and
robust standard errors (up to the number of digits printed, or more);
see Table~\ref{tab:growth-DIF}.\footnote{The coefficient shown for
  \texttt{ly(-1)} in the Tables is that reported directly by the
  software; for comparability with the original model (eq.\
  \ref{eq:CEL96}) it is necessary to subtract 1, which produces the
  expected negative value indicating conditional convergence in per
  capita income.}

\begin{table}[htbp]
\begin{center}
\begin{tabular}{lrrrr}
&
\multicolumn{2}{c}{1-step} & \multicolumn{2}{c}{2-step} \\
& \multicolumn{1}{c}{coeff} & \multicolumn{1}{c}{std.\ error} &
  \multicolumn{1}{c}{coeff} & \multicolumn{1}{c}{std.\ error} \\
\texttt{ly(-1)} & 0.577564 & 0.1292 & 0.610056 & 0.1562 \\
\texttt{linv} & 0.0565469 & 0.07082 & 0.100952 & 0.07772 \\
\texttt{lngd} & $-$0.143950 & 0.2753 & $-$0.310041 & 0.2980 \\
\end{tabular}
\caption{GMM-DIF: Barro--Lee data}
\label{tab:growth-DIF}
\end{center}
\end{table}

Results for GMM-SYS estimation are shown in
Table~\ref{tab:growth-SYS}. In this case we show two sets of gretl
results: those labeled ``gretl(1)'' were obtained using gretl's
\verb|--dpdstyle| option, while those labeled ``gretl(2)'' did not use
that option---the intent being to reproduce the $H$ matrices used by
Ox/DPD and \texttt{xtabond2} respectively.

\begin{table}[htbp]
\begin{center}
\begin{tabular}{lrrrr}
& \multicolumn{1}{c}{gretl(1)} &
  \multicolumn{1}{c}{Ox/DPD} &
  \multicolumn{1}{c}{gretl(2)} &
  \multicolumn{1}{c}{xtabond2} \\
\texttt{ly(-1)} & 0.9237 (0.0385) &
                  0.9167 (0.0373) &
                  0.9073 (0.0370) &
                  0.9073 (0.0370) \\
\texttt{linv} & 0.1592 (0.0449) &
                0.1636 (0.0441) &
                0.1856 (0.0411) &
                0.1856 (0.0411) \\
\texttt{lngd} & $-$0.2370 (0.1485) &
                $-$0.2178 (0.1433) &
                $-$0.2355 (0.1501) &
                $-$0.2355 (0.1501)
\end{tabular}
\caption{2-step GMM-SYS: Barro--Lee data (standard errors in parentheses)}
\label{tab:growth-SYS}
\end{center}
\end{table}

In this case all three programs use 479 observations; gretl and
\texttt{xtabond2} use 41 instruments and produce the same estimates
(when using the same $H$ matrix) while Ox/DPD nominally uses
66.\footnote{This is a case of the issue described in
  section~\ref{sec:rankdef}: the full $\bA_N$ matrix turns out to be
  singular and special measures must be taken to produce estimates.}
It is noteworthy that with
GMM-SYS plus ``messy'' missing
observations, the results depend on the precise array of instruments
used, which in turn depends on the details of the implementation of
the estimator.

\section{Auxiliary test statistics}
\label{sec:dpanel-aux}

We have concentrated above on the parameter estimates and standard
errors. Here we add a few words on the additional test statistics that
typically accompany both GMM-DIF and GMM-SYS estimation. These include
the Sargan test for overidentification, one or more Wald tests for the
joint significance of the regressors (and time dummies, if applicable)
and tests for first- and second-order autocorrelation of the residuals
from the equations in differences.

As in Ox/DPD, the Sargan test statistic reported by gretl is
\[
 S = \left(\sum_{i=1}^N \hat{\bv}^{*\prime}_i \bZ_i\right)
 \bA_N \left(\sum_{i=1}^N \bZ_i' \hat{\bv}^*_i\right)
\]
where the $\hat{\bv}^*_i$ are the transformed (e.g.\ differenced)
residuals for unit $i$. Under the null hypothesis that the
instruments are valid, $S$ is asymptotically distributed as chi-square
with degrees of freedom equal to the number of overidentifying
restrictions.

In general we see a good level of agreement between gretl, DPD and
\texttt{xtabond2} with regard to these statistics, with a few
relatively minor exceptions. Specifically, \texttt{xtabond2} computes
both a ``Sargan test'' and a ``Hansen test'' for overidentification,
but what it calls the Hansen test is, apparently, what DPD calls the
Sargan test. (We have had difficulty determining from the
\texttt{xtabond2} documentation \citep{Roodman2006} exactly how its
Sargan test is computed.) In addition there are cases where the
degrees of freedom for the Sargan test differ between DPD and gretl;
this occurs when the $\bA_N$ matrix is singular
(section~\ref{sec:rankdef}).
In concept the df equals the number of
instruments minus the number of parameters estimated; for the first of
these terms gretl uses the rank of $\bA_N$, while DPD appears to use
the full dimension of this matrix.

Negative first-order autocorrelation of the residuals is expected by
construction of the estimator, so a significant value for the AR(1)
test does not indicate a problem. If the AR(2) test is significant,
however, this indicates violation of the maintained assumptions. Note
that valid AR tests cannot be produced when the \verb|--asymptotic|
option is specified in conjunction with one-step GMM-SYS estimation;
if you need the tests, either add the \verb|--two-step| option or drop
the \verb|--asymptotic| flag (which is recommended in any case).

\section{Post-estimation available statistics}
\label{sec:dpanel-post}

After estimation, the \dollar{model} accessor will return a bundle
containing several items that may be of interest: most should be
self-explanatory, but here's a partial list:

\begin{center}
\begin{tabular}{rp{0.6\textwidth}}
  \hline
  \textbf{Key} & \textbf{Content} \\
  \hline
  \texttt{AR1}, \texttt{AR2} & 1st and 2nd order autocorrelation test
                               statistics \\
  \texttt{sargan}, \texttt{sargan\_df} & Sargan test for
                                         overidentifying restrictions
                                         and corresponding degrees of freedom \\
  \texttt{wald}, \texttt{wald\_df} & Wald test for
                                     overall significance
                                     and corresponding degrees of
                                     freedom \\
  \texttt{GMMinst} & The matrix $\bZ$ of instruments (see equations
                     (\ref{eq:dpd-dif}) and (\ref{eq:sys-gmm})) \\
  \texttt{wgtmat} & The matrix $\bA$ of GMM weights (see equations
                    (\ref{eq:dpd-dif}) and (\ref{eq:sys-gmm})) \\
  \hline
\end{tabular}
\end{center}

Note, however, that \texttt{GMMinst} and \texttt{wgtmat} (which may be
quite large matrices) are not saved in the \dollar{model} bundle by
default; that requires use of the
\option{keep-extra} option with the
\cmd{dpanel} command. Listing~\ref{ex:dpanel-rep} illustrates use
of these matrices to replicate via hansl commands the calculation of
the GMM estimator.

\begin{script}[p]
  \scriptcaption{replication of built-in command via hansl commands}
  \label{ex:dpanel-rep}
\begin{scode}
set verbose off
open abdata.gdt

# compose list of regressors
list X = w w(-1) k k(-1)
list Z = w k

dpanel 1 ; n X const ; GMM(Z,2,99) --two-step --dpd --keep-extra

### --- re-do by hand ----------------------------

# fetch Z and A from model
A = $model.wgtmat
mZt = $model.GMMinst # note: transposed

# create data matrices
series valid = ok($uhat)
series ddep = diff(n)
series dldep = ddep(-1)
list dreg = diff(X)

smpl valid --dummy

matrix m_reg = {dldep} ~ {dreg} ~ 1
matrix m_dep = {ddep}

matrix uno = mZt * m_reg
matrix due = qform(uno', A)
matrix tre = (uno'A) * (mZt * m_dep)
matrix coef = due\tre

print coef
\end{scode}
\end{script}


\section{Memo: \texttt{dpanel} options}
\label{sec:dpanel-options}

\begin{center}
\begin{tabular}{lp{.7\textwidth}}
  \textit{flag} & \textit{effect} \\ [6pt]
  \verb|--asymptotic| & Suppresses the use of robust standard errors \\
  \verb|--two-step| & Calls for 2-step estimation (the default being 1-step) \\
  \verb|--system| & Calls for GMM-SYS, with default treatment of the
  dependent variable, as in \texttt{GMMlevel(y,1,1)} \\
  \verb|--time-dummies| & Includes period-specific dummy variables \\
  \verb|--dpdstyle| & Computes the $H$ matrix as in DPD; also suppresses
  differencing of automatic time dummies and omission of intercept
  in the GMM-DIF case\\
  \verb|--verbose| & Prints confirmation of the GMM-style instruments
  used; and when \verb|--two-step| is selected, prints
  the 1-step estimates first \\
  \verb|--vcv| & Calls for printing
of the covariance matrix \\
  \verb|--quiet| & Suppresses the printing of results \\
  \verb|--keep-extra| & Saves additional matrices in the \dollar{model}
  bundle (see section~\ref{sec:dpanel-post}) \\
\end{tabular}
\end{center}

The time dummies option supports the qualifier \texttt{noprint}, as
in
\begin{code}
--time-dummies=noprint
\end{code}

This means that although the dummies are included in the specification,
their coefficients, standard errors and so on are not printed.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "gretl-guide"
%%% End:
