\chapter{Nonlinear least squares}
\label{chap:nls}

\section{Introduction and examples}
\label{nls-intro}

Gretl supports nonlinear least squares (NLS) using a variant of the
Levenberg--Marquardt algorithm. The user must supply a specification
of the regression function; prior to giving this specification the
parameters to be estimated must be ``declared'' and given initial
values. Optionally, the user may supply analytical derivatives of the
regression function with respect to each of the parameters. If
derivatives are not given, the user must instead give a list of the
parameters to be estimated (separated by spaces or commas), preceded
by the keyword \cmd{params}. The tolerance (criterion for terminating
the iterative estimation procedure) can be adjusted using the
\cmd{set} command.

The syntax for specifying the function to be estimated consists of
the name of the dependent variable, followed by an expression to
generate it. This is illustrated in the following two examples, with
accompanying derivatives.

\begin{code}
# Consumption function from Greene
nls C = alpha + beta * Y^gamma
deriv alpha = 1
deriv beta = Y^gamma
deriv gamma = beta * Y^gamma * log(Y)
end nls

# Nonlinear function from Russell Davidson
nls y = alpha + beta * x1 + (1/beta) * x2
deriv alpha = 1
deriv beta = x1 - x2/(beta*beta)
end nls --vcv
\end{code}

Note the command words \cmd{nls} (which introduces the regression
function), \cmd{deriv} (which introduces the specification of a
derivative), and \cmd{end nls}, which terminates the specification
and calls for estimation. If the \cmd{--vcv} flag is appended to the
last line, the covariance matrix of the parameter estimates is
printed.

\section{Initializing the parameters}
\label{nls-param}

The parameters of the regression function must be given initial
values prior to the \cmd{nls} command. (In the GUI program this may
be done via the menu item ``Variable, Define new variable''.)

In some cases, where the nonlinear function is a generalization of
(or a restricted form of) a linear model, it may be convenient to run
an \cmd{ols} regression and initialize the parameters from the OLS
coefficient estimates. In relation to the first example above, one
might do:
%
\begin{code}
ols C 0 Y
alpha = $coeff(0)
beta = $coeff(Y)
gamma = 1
\end{code}

And in relation to the second example one might do:
%
\begin{code}
ols y 0 x1 x2
alpha = $coeff(0)
beta = $coeff(x1)
\end{code}
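Putting the pieces together, a complete script for the
consumption-function example might look as follows. This is just a
sketch: the dataset name is hypothetical, and any data file
containing the series C and Y will do.

\begin{code}
# the file name here is illustrative, not a gretl sample file
open consumption_data.gdt

# initialize the parameters from OLS estimates of the
# linear special case (gamma = 1)
ols C 0 Y
alpha = $coeff(0)
beta = $coeff(Y)
gamma = 1

# nonlinear estimation with analytical derivatives
nls C = alpha + beta * Y^gamma
deriv alpha = 1
deriv beta = Y^gamma
deriv gamma = beta * Y^gamma * log(Y)
end nls
\end{code}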
\section{NLS dialog window}
\label{nls-gui}

It is probably most convenient to compose the commands for NLS
estimation in the form of a gretl script, but you can also do so
interactively, by selecting the item ``Nonlinear Least Squares''
under the ``Model, Nonlinear models'' menu. This opens a dialog box
where you can type the function specification (possibly prefaced by
statements to set the initial parameter values) and the derivatives,
if available. An example of this is shown in
Figure~\ref{fig-nls-dialog}. Note that in this context you do not
have to supply the \cmd{nls} and \cmd{end nls} tags.

\begin{figure}[htbp]
  \begin{center}
    \includegraphics[scale=0.5]{figures/nls_window}
  \end{center}
  \caption{NLS dialog box}
  \label{fig-nls-dialog}
\end{figure}

\section{Analytical and numerical derivatives}
\label{nls-deriv}

If you are able to figure out the derivatives of the regression
function with respect to the parameters, it is advisable to supply
those derivatives as shown in the examples above. If that is not
possible, gretl will compute approximate numerical derivatives,
although the properties of the NLS algorithm may not be so good in
this case (see section~\ref{nls-accuracy}). Numerical derivatives are
requested by means of the \cmd{params} statement, which should be
followed by the names of the parameters to be estimated, separated by
spaces or commas. In this case, the examples above would read as
follows:

\begin{code}
# Greene
nls C = alpha + beta * Y^gamma
params alpha beta gamma
end nls
\end{code}

\begin{code}
# Davidson
nls y = alpha + beta * x1 + (1/beta) * x2
params alpha beta
end nls
\end{code}

If analytical derivatives are supplied, they are checked for
consistency with the given nonlinear function. If the derivatives are
clearly incorrect, estimation is aborted with an error message. If
the derivatives are ``suspicious'' a warning message is issued but
estimation proceeds. This warning may sometimes be triggered by
incorrect derivatives, but it may also be triggered by a high degree
of collinearity among the derivatives. Note that you cannot mix
analytical and numerical derivatives: you should supply expressions
for all of the derivatives or none.

\section{Advanced use}
\label{sec:nls-generalize}

The \cmd{nls} block can also contain more sophisticated constructs.
First, it can handle intermediate expressions; this makes it possible
to construct the conditional mean expression as a multi-step job,
thus enhancing the modularity and readability of the code. Second,
more complex objects, such as lists and matrices, can be used for
this purpose.

For example, suppose that we want to estimate a probit binary
response model via NLS. The specification is
\begin{equation}
  \label{eq:probit-nls}
  y_i = \Phi\left[g(\bm{x}_i)\right] + u_i, \qquad
  g(\bm{x}_i) = b_0 + b_1 x_{1,i} + b_2 x_{2,i} = \bm{b}'\bm{x}_i
\end{equation}
Note that this is not the recommended way to estimate a probit model:
the $u_i$ term is heteroskedastic by construction, so maximum
likelihood estimation is much preferable here. Still, NLS is a
consistent estimator of the parameter vector $\bm{b}$, although its
covariance matrix will have to be adjusted to compensate for
heteroskedasticity: this is accomplished via the \option{robust}
switch.

\begin{script}[htbp]
  \scriptcaption{NLS estimation of a probit model}
  \label{script:nls-probit-numder}
\begin{scode}
open greene25_1.gdt
list X = const age income ownrent selfempl

# initialization
ols cardhldr X --quiet
matrix b = $coeff / $sigma

# proceed with NLS estimation
nls cardhldr = cnorm(ndx)
series ndx = lincomb(X, b)
params b
end nls --robust

# compare with ML probit
probit cardhldr X --p-values
\end{scode}
\end{script}

The example in script~\ref{script:nls-probit-numder} can be enhanced
by using analytical derivatives: since
\[
  \frac{\partial \Phi\left[g(\bm{x}_i)\right]}{\partial b_j} =
  \varphi(\bm{b}'\bm{x}_i) \cdot x_{j,i}
\]
one could substitute the \cmd{params} line in the script with the
two-liner
\begin{code}
series f = dnorm(ndx)
deriv b = {f} .* {X}
\end{code}
and have \cmd{nls} use analytically computed derivatives, which are
quicker and usually more reliable.
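For reference, here is script~\ref{script:nls-probit-numder} in full,
modified to use analytical derivatives; apart from the derivative
lines it is identical to the original.

\begin{code}
open greene25_1.gdt
list X = const age income ownrent selfempl

# initialization
ols cardhldr X --quiet
matrix b = $coeff / $sigma

# NLS estimation with analytical derivatives
nls cardhldr = cnorm(ndx)
series ndx = lincomb(X, b)
series f = dnorm(ndx)
deriv b = {f} .* {X}
end nls --robust
\end{code}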
\section{Controlling termination}
\label{nls-toler}

The NLS estimation procedure is an iterative process. Iteration is
terminated when the criterion for convergence is met or when the
maximum number of iterations is reached, whichever comes first. Let
$k$ denote the number of parameters being estimated. The maximum
number of iterations is $100 \times (k+1)$ when analytical
derivatives are given, and $200 \times (k+1)$ when numerical
derivatives are used.

Let $\epsilon$ denote a small number. The iteration is deemed to have
converged if at least one of the following conditions is satisfied:
\begin{itemize}
\item Both the actual and predicted relative reductions in the error
  sum of squares are at most $\epsilon$.
\item The relative error between two consecutive iterates is at most
  $\epsilon$.
\end{itemize}

The default value of $\epsilon$ is the machine precision to the power
3/4,\footnote{On a 32-bit Intel Pentium machine a likely value for
  this parameter is $1.82\times 10^{-12}$.} but it can be adjusted
using the \cmd{set} command with the parameter \verb+nls_toler+. For
example
%
\begin{code}
set nls_toler .0001
\end{code}
%
will relax the value of $\epsilon$ to 0.0001.

\section{Details on the code}
\label{nls-code}

The underlying engine for NLS estimation is based on the
\app{minpack} suite of functions, available from
\href{http://www.netlib.org/minpack/}{netlib.org}. Specifically, the
following \app{minpack} functions are called:

\begin{center}
  \begin{tabular}{ll}
    \verb+lmder+ & Levenberg--Marquardt algorithm with analytical
                   derivatives \\
    \verb+chkder+ & Check the supplied analytical derivatives \\
    \verb+lmdif+ & Levenberg--Marquardt algorithm with numerical
                   derivatives \\
    \verb+fdjac2+ & Compute final approximate Jacobian when using
                    numerical derivatives \\
    \verb+dpmpar+ & Determine the machine precision \\
  \end{tabular}
\end{center}

On successful completion of the Levenberg--Marquardt iteration, a
Gauss--Newton regression is used to calculate the covariance matrix
for the parameter estimates. If the \option{robust} flag is given a
robust variant is computed. The documentation for the \cmd{set}
command explains the specific options available in this regard.

Since NLS results are asymptotic, there is room for debate over
whether or not a correction for degrees of freedom should be applied
when calculating the standard error of the regression (and the
standard errors of the parameter estimates). For comparability with
OLS, and in light of the reasoning given in
\cite{davidson-mackinnon93}, the estimates shown in gretl \emph{do}
use a degrees of freedom correction.
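To make this concrete, here is a sketch of the standard GNR-based
covariance estimator, in the spirit of \cite{davidson-mackinnon93};
gretl's internal computation may differ in numerical detail. Let
$f(\bm{x}_i, \bm{\theta})$ denote the regression function,
$\hat{\bm{\theta}}$ the NLS estimates, $\hat{u}_i$ the corresponding
residuals, $n$ the sample size and $k$ the number of parameters. Then
\[
  \widehat{\mathrm{Var}}\left(\hat{\bm{\theta}}\right) =
  s^2 \left(\hat{J}'\hat{J}\right)^{-1},
  \qquad
  \hat{J}_{ij} = \left. \frac{\partial f(\bm{x}_i,
    \bm{\theta})}{\partial \theta_j}
  \right|_{\bm{\theta} = \hat{\bm{\theta}}},
  \qquad
  s^2 = \frac{1}{n-k} \sum_{i=1}^n \hat{u}_i^2
\]
where $\hat{J}$ is the Jacobian of the regression function evaluated
at the estimates. The divisor $n-k$ embodies the degrees of freedom
correction just mentioned.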
\section{Numerical accuracy}
\label{nls-accuracy}

Table \ref{tab-nls} shows the results of running the gretl NLS
procedure on the 27 Statistical Reference Datasets made available by
the U.S. National Institute of Standards and Technology (NIST) for
testing nonlinear regression software.\footnote{For a discussion of
  gretl's accuracy in the estimation of linear models, see
  Appendix~\ref{app-accuracy}.} For each dataset, two sets of
starting values for the parameters are given in the test files, so
the full test comprises 54 runs. Two full tests were performed, one
using all analytical derivatives and one using all numerical
approximations. In each case the default tolerance was
used.\footnote{The data shown in the table were gathered from a
  pre-release build of gretl version 1.0.9, compiled with \app{gcc}
  3.3, linked against \app{glibc} 2.3.2, and run under Linux on an
  i686 PC (IBM ThinkPad A21m).}

Out of the 54 runs, gretl failed to produce a solution in 4 cases
when using analytical derivatives, and in 5 cases when using
numerical approximation. Of the four failures in analytical
derivatives mode, two were due to non-convergence of the
Levenberg--Marquardt algorithm after the maximum number of iterations
(on \verb+MGH09+ and \verb+Bennett5+, both described by NIST as of
``Higher difficulty'') and two were due to the generation of range
errors (out-of-bounds floating point values) when computing the
Jacobian (on \verb+BoxBOD+ and \verb+MGH17+, described as of
``Higher difficulty'' and ``Average difficulty'' respectively). The
additional failure in numerical approximation mode was on
\verb+MGH10+ (``Higher difficulty'', maximum number of iterations
reached).

The table gives information on several aspects of the tests: the
number of outright failures, the average number of iterations taken
to produce a solution, and two measures of the accuracy of the
estimates, for both the parameters and their standard errors. For
each of the 54 runs in each mode, if the run produced a solution the
parameter estimates obtained by gretl were compared with the NIST
certified values. We define the ``minimum correct figures'' for a
given run as the number of significant figures to which the
\emph{least accurate} gretl estimate agreed with the certified value,
for that run. The table shows both the average and the worst case
value of this variable across all the runs that produced a solution.
The same information is shown for the estimated standard
errors.\footnote{For the standard errors, I excluded one outlier from
  the statistics shown in the table, namely \verb+Lanczos1+. This is
  an odd case, using generated data with an almost-exact fit: the
  standard errors are 9 or 10 orders of magnitude smaller than the
  coefficients. In this instance gretl could reproduce the certified
  standard errors to only 3 figures (analytical derivatives) and 2
  figures (numerical derivatives).} The second measure of accuracy
shown is the percentage of cases, taking into account all parameters
from all successful runs, in which the gretl estimate agreed with the
certified value to at least the 6 significant figures which are
printed by default in the gretl regression output.

\begin{table}[htbp]
  \caption{Nonlinear regression: the NIST tests}
  \label{tab-nls}
  \begin{center}
    \begin{tabular}{lcc}
      & \textit{Analytical derivatives} &
        \textit{Numerical derivatives} \\ [4pt]
      Failures in 54 tests & 4 & 5 \\
      Average iterations & 32 & 127 \\
      Mean of min. correct figures, & 8.120 & 6.980 \\
      \quad parameters \\
      Worst of min. correct figures, & 4 & 3 \\
      \quad parameters \\
      Mean of min. correct figures, & 8.000 & 5.673 \\
      \quad standard errors \\
      Worst of min. correct figures, & 5 & 2 \\
      \quad standard errors \\
      Percent correct to at least 6 figures, & 96.5 & 91.9 \\
      \quad parameters \\
      Percent correct to at least 6 figures, & 97.7 & 77.3 \\
      \quad standard errors \\
    \end{tabular}
  \end{center}
\end{table}

Using analytical derivatives, the worst case values for both
parameters and standard errors were improved to 6 correct figures on
the test machine when the tolerance was tightened to 1.0e$-$14. Using
numerical derivatives, the same tightening of the tolerance raised
the worst values to 5 correct figures for the parameters and 3
figures for standard errors, at a cost of one additional failure of
convergence.
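In script terms, the tighter tolerance used for these runs
corresponds simply to issuing
%
\begin{code}
set nls_toler 1.0e-14
\end{code}
%
prior to the \cmd{nls} block.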
Note the overall superiority of analytical derivatives: on average,
solutions to the test problems were obtained with substantially fewer
iterations and the results were more accurate (most notably for the
estimated standard errors). Note also that the six-digit results
printed by gretl are not 100 percent reliable for difficult nonlinear
problems (in particular when using numerical derivatives). Having
registered this caveat, the percentage of cases where the results
were good to six digits or better seems high enough to justify their
printing in this form.