headings 10
Tests 24
add
adf
bds
bkw
chow
coeffsum
coint
cusum
difftest
johansen
kpss
leverage
levinlin
meantest
modtest
normtest
omit
panspec
qlrtest
reset
restrict
runs
vartest
vif
Graphs 10
boxplot
gnuplot
graphpg
hfplot
panplot
plot
qqplot
rmplot
scatters
textplot
Statistics 14
anova
corr
corrgm
fractint
freq
hurst
mahal
pca
pergm
pvalue
spearman
summary
xcorrgm
xtab
Dataset 18
append
data
dataset
delete
genr
info
join
labels
markers
nulldata
open
rename
setinfo
setmiss
setobs
smpl
store
varlist
Estimation 34
ar
ar1
arch
arima
arma
biprobit
dpanel
duration
equation
estimate
garch
gmm
heckit
hsk
intreg
lad
logistic
logit
midasreg
mle
mpols
negbin
nls
ols
panel
poisson
probit
quantreg
system
tobit
tsls
var
vecm
wls
Programming 19
break
catch
clear
elif
else
end
endif
endloop
flush
foreign
funcerr
function
if
include
loop
makepkg
run
set
setopt
Transformations 10
diff
discrete
dummify
lags
ldiff
logs
orthdev
sdiff
square
stdize
Utilities 6
eval
help
modeltab
pkg
quit
shell
Printing 7
eqnprint
modprint
outfile
print
printf
sprintf
tabprint
Prediction 1
fcast

# add Tests

Argument:   varlist
Options:    --lm (do an LM test, OLS only)
            --quiet (print only the basic test result)
            --silent (don't print anything)
            --vcv (print covariance matrix for augmented model)
            --both (IV estimation only, see below)
Examples:   add 5 7 9
            add xx yy zz --quiet

Must be invoked after an estimation command. Performs a joint test for the
addition of the specified variables to the last model, the results of which
may be retrieved using the accessors "$test" and "$pvalue".
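
As a minimal sketch of retrieving the test results, the following script
assumes the sample file data4-1 supplied with gretl and its series price,
sqft, bedrms and baths; substitute your own data and variable names as
needed.

```hansl
# data4-1 and its series names are assumptions made for illustration
open data4-1
ols price const sqft --quiet
# joint test for the addition of bedrms and baths to the last model
add bedrms baths --quiet
printf "test statistic = %g, p-value = %g\n", $test, $pvalue
```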

By default an augmented version of the original model is estimated,
including the variables in varlist. The test is a Wald test on the augmented
model, which replaces the original as the "current model" for the purposes
of, for example, retrieving the residuals as $uhat or doing further tests.

Alternatively, given the --lm option (available only for models estimated
via OLS), an LM test is performed. An auxiliary regression is run in which
the dependent variable is the residual from the last model and the
independent variables are those from the last model plus varlist. Under the
null hypothesis that the added variables have no additional explanatory
power, the sample size times the unadjusted R-squared from this regression
is distributed as chi-square with degrees of freedom equal to the number of
added regressors. In this case the original model is not replaced.

The --both option is specific to two-stage least squares: it specifies that
the new variables should be added both to the list of regressors and the
list of instruments, the default in this case being to add to the regressors
only.

Menu path:  Model window, /Tests/Add variables

# adf Tests

Arguments:  order varlist
Options:    --nc (test without a constant)
            --c (with constant only)
            --ct (with constant and trend)
            --ctt (with constant, trend and trend squared)
            --seasonals (include seasonal dummy variables)
            --gls (de-mean or de-trend using GLS)
            --verbose (print regression results)
            --quiet (suppress printing of results)
            --difference (use first difference of variable)
            --test-down[=criterion] (automatic lag order)
            --perron-qu (see below)
Examples:   adf 0 y
            adf 2 y --nc --c --ct
            adf 12 y --c --test-down
            See also jgm-1996.inp

The options shown above and the discussion which follows mostly pertain to
the use of the adf command with regular time series data.
For use of this command with panel data please see the section titled
"Panel data" below.

This command computes a set of Dickey-Fuller tests on each of the listed
variables, the null hypothesis being that the variable in question has a
unit root. (But if the --difference flag is given, the first difference of
the variable is taken prior to testing, and the discussion below must be
taken as referring to the transformed variable.)

By default, two variants of the test are shown: one based on a regression
containing a constant and one using a constant and linear trend. You can
control the variants that are presented by specifying one or more of the
option flags --nc, --c, --ct, --ctt.

The --gls option can be used in conjunction with one or other of the flags
--c and --ct. The effect of this option is that the series to be tested is
demeaned or detrended using the GLS procedure proposed by Elliott,
Rothenberg and Stock (1996), which gives a test of greater power than the
standard Dickey-Fuller approach. This option is not compatible with --nc,
--ctt or --seasonals.

In all cases the dependent variable in the test regression is the first
difference of the specified series, y, and the key independent variable is
the first lag of y. The regression is constructed such that the coefficient
on lagged y equals the root in question, a, minus 1. For example, the model
with a constant may be written as

    (1 - L)y(t) = b0 + (a-1)y(t-1) + e(t)

Under the null hypothesis of a unit root the coefficient on lagged y equals
zero. Under the alternative that y is stationary this coefficient is
negative. So the test is inherently one-sided.

Selecting the lag order

The simplest version of the Dickey-Fuller test assumes that the error term
in the test regression is serially uncorrelated.
In practice this is unlikely to be the case and the specification is often
extended by including one or more lags of the dependent variable, giving an
Augmented Dickey-Fuller (ADF) test. The order argument governs the number of
such lags, k, possibly depending on the sample size, T.

  For a fixed, user-specified k: give a non-negative value for order.

  For T-dependent k: give order as -1. The order is then set following the
  recommendation of Schwert (1989), namely the integer part of
  12(T/100)^0.25.

In general, however, we don't know how many lags will be required to
"whiten" the Dickey-Fuller residual. It's therefore common to specify the
maximum value of k and let the data decide the actual number of lags to
include. This can be done via the --test-down option. The criterion for
selecting optimal k may be set using the parameter to this option, which
should be one of AIC, BIC or tstat, AIC being the default.

When testing down via AIC or BIC, the final lag order for the ADF equation
is that which optimizes the chosen information criterion (Akaike or Schwarz
Bayesian). The exact procedure depends on whether or not the --gls option is
given. When GLS is specified, AIC and BIC are the "modified" versions
described in Ng and Perron (2001), otherwise they are the standard versions.
In the GLS case a refinement is available. If the additional option
--perron-qu is given, lag-order selection is performed via the revised
method recommended by Perron and Qu (2007). In this case the data are first
demeaned or detrended via OLS; GLS is applied once the lag order is
determined.

When testing down via the t-statistic method is called for, the procedure is
as follows:

1. Estimate the Dickey-Fuller regression with k lags of the dependent
   variable.

2. Is the last lag significant? If so, execute the test with lag order k.
   Otherwise, let k = k - 1; if k equals 0, execute the test with lag order
   0, else go to step 1.

In the context of step 2 above, "significant" means that the t-statistic for
the last lag has an asymptotic two-sided p-value, against the normal
distribution, of 0.10 or less.

To sum up, if we accept the various arguments of Perron, Ng, Qu and Schwert
referenced above, the favored command for testing a series y is likely to
be:

    adf -1 y --c --gls --test-down --perron-qu

(Or substitute --ct for --c if the series seems to display a trend.) The lag
order for the test will then be determined by testing down via modified AIC
from the Schwert maximum, with the Perron-Qu refinement.

P-values for the Dickey-Fuller tests are based on response-surface
estimates. When GLS is not applied these are taken from MacKinnon (1996).
Otherwise they are taken from Cottrell (2015) or, when testing down is
performed, Sephton (2021). The P-values are specific to the sample size
unless they are labeled as asymptotic.

Panel data

When the adf command is used with panel data, to produce a panel unit root
test, the applicable options and the results shown are somewhat different.

First, while you may give a list of variables for testing in the regular
time-series case, with panel data only one variable may be tested per
command. Second, the options governing the inclusion of deterministic terms
become mutually exclusive: you must choose between no-constant, constant
only, and constant plus trend; the default is constant only. In addition,
the --seasonals option is not available. Third, the --verbose option has a
different meaning: it produces a brief account of the test for each
individual time series (the default being to show only the overall result).
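
For illustration, a panel unit root test might be run as follows. This is
only a sketch: it assumes that the Grunfeld panel dataset (grunfeld.gdt) is
available in your gretl installation and that it contains a series named
invest. Note that a single variable is tested and a single deterministic
option is given.

```hansl
# grunfeld.gdt and the series name invest are assumptions
open grunfeld.gdt
# panel case: one variable, constant only, one lag; overall result only
adf 1 invest --c
# the same test, with a brief account for each individual unit
adf 1 invest --c --verbose
```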

The overall test (null hypothesis: the series in question has a unit root
for all the panel units) is calculated in one or both of two ways: using the
method of Im, Pesaran and Shin (Journal of Econometrics, 2003) or that of
Choi (Journal of International Money and Finance, 2001). The Choi test
requires that P-values are available for the individual tests; if this is
not the case (depending on the options selected) it is omitted. The
particular statistic given for the Im, Pesaran, Shin test varies as follows:
if the lag order for the test is non-zero their W statistic is shown;
otherwise if the time-series lengths differ by individual, their Z
statistic; otherwise their t-bar statistic. See also the "levinlin" command.

Menu path:  /Variable/Unit root tests/Augmented Dickey-Fuller test

# anova Statistics

Arguments:  response treatment [ block ]
Option:     --quiet (don't print results)

Analysis of Variance: response is a series measuring some effect of interest
and treatment must be a discrete variable that codes for two or more types
of treatment (or non-treatment). For two-way ANOVA, the block variable
(which should also be discrete) codes for the values of some control
variable.

Unless the --quiet option is given, this command prints a table showing the
sums of squares and mean squares along with an F-test. The F-test and its
p-value can be retrieved using the accessors "$test" and "$pvalue"
respectively.

The null hypothesis for the F-test is that the mean response is invariant
with respect to the treatment type, or in other words that the treatment has
no effect. Strictly speaking, the test is valid only if the variance of the
response is the same for all treatment types.

Note that the results shown by this command are in fact a subset of the
information given by the following procedure, which is easily implemented in
gretl.
Create a set of dummy variables coding for all but one of the treatment
types. For two-way ANOVA, in addition create a set of dummies
coding for all but one of the "blocks". Then regress response on a constant
and the dummies using "ols". For a one-way design the ANOVA table is printed
via the --anova option to ols. In the two-way case the relevant F-test is
found by using the "omit" command. For example (assuming y is the response,
xt codes for the treatment, and xb codes for blocks):

    # one-way
    list dxt = dummify(xt)
    ols y 0 dxt --anova
    # two-way
    list dxb = dummify(xb)
    ols y 0 dxt dxb
    # test joint significance of dxt
    omit dxt --quiet

Menu path:  /Model/Other linear models/ANOVA

# append Dataset

Argument:   filename
Options:    --time-series (see below)
            --fixed-sample (see below)
            --update-overlap (see below)
            --quiet (don't print anything)
            See below for additional specialized options

Opens a data file and appends the content to the current dataset, if the new
data are compatible. The program will try to detect the format of the data
file (native, plain text, CSV, Gnumeric, Excel, etc.).

The appended data may take the form of additional observations on series
already present in the dataset, new series, or both. In the case of
adding series, compatibility requires either (a) that the number of
observations for the new data equals that for the current data, or (b) that
the new data carries clear observation information so that gretl can work
out how to place the values.

One case that is not supported is where the new data start earlier and also
end later than the original data. To add new series in such a case you can
use the --fixed-sample option; this has the effect of suppressing the adding
of observations, and so restricting the operation to the addition of new
series.
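
A minimal sketch of the --fixed-sample case follows. The file names
mydata.gdt and extra.csv are placeholders, not files supplied with gretl;
extra.csv is imagined to cover a longer span than the current dataset.

```hansl
# mydata.gdt and extra.csv are hypothetical file names
open mydata.gdt
# add only the new series from extra.csv; no observations are added
append extra.csv --fixed-sample
```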

A special feature is supported for appending to a panel dataset. Let n
denote the number of cross-sectional units in the panel, T denote the number
of time periods, and m denote the number of observations for the new data.
If m = n the new data are taken to be time-invariant, and are copied into
place for each time period. On the other hand, if m = T the data are treated
as non-varying across the panel units, and are copied into place for each
unit. If the panel is "square", and m equals both n and T, an ambiguity
arises. The default in this case is to treat the new data as time-invariant,
but you can force gretl to treat the new data as time series via the
--time-series option. (This option is ignored in all other cases.)

When a data file is selected for appending, there may be an area of overlap
with the existing dataset; that is, one or more series may have one or more
observations in common across the two sources. If the option
--update-overlap is given, the append operation will replace any overlapping
observations with the values from the selected data file, otherwise the
values currently in place will be unaffected.

The additional specialized options --sheet, --coloffset, --rowoffset and
--fixed-cols work in the same way as with "open"; see that command for
explanations.

See also "join" for more sophisticated handling of multiple data sources.

Menu path:  /File/Append data

# ar Estimation

Arguments:  lags ; depvar indepvars
Options:    --vcv (print covariance matrix)
            --quiet (don't print parameter estimates)
Example:    ar 1 3 4 ; y 0 x1 x2 x3

Computes parameter estimates using the generalized Cochrane-Orcutt iterative
procedure; see Section 9.5 of Ramanathan (2002). Iteration is terminated
when successive error sums of squares do not differ by more than 0.005
percent or after 20 iterations.

"lags" is a list of lags in the residuals, terminated by a semicolon. In the
above example, the error term is specified as

    u(t) = rho(1)*u(t-1) + rho(3)*u(t-3) + rho(4)*u(t-4)

Menu path:  /Model/Univariate time series/AR Errors (GLS)

# ar1 Estimation

Arguments:  depvar indepvars
Options:    --hilu (use Hildreth-Lu procedure)
            --pwe (use Prais-Winsten estimator)
            --vcv (print covariance matrix)
            --no-corc (do not fine-tune results with Cochrane-Orcutt)
            --loose (use looser convergence criterion)
            --quiet (don't print anything)
Examples:   ar1 1 0 2 4 6 7
            ar1 y 0 xlist --pwe
            ar1 y 0 xlist --hilu --no-corc

Computes feasible GLS estimates for a model in which the error term is
assumed to follow a first-order autoregressive process.

The default method is the Cochrane-Orcutt iterative procedure; see for
example section 9.4 of Ramanathan (2002). The criterion for convergence is
that successive estimates of the autocorrelation coefficient do not differ
by more than 1e-6, or if the --loose option is given, by more than 0.001. If
this is not achieved within 100 iterations an error is flagged.

If the --pwe option is given, the Prais-Winsten estimator is used. This
involves an iteration similar to Cochrane-Orcutt; the difference is that
while Cochrane-Orcutt discards the first observation, Prais-Winsten makes
use of it. See, for example, Chapter 13 of Greene (2000) for details.

If the --hilu option is given, the Hildreth-Lu search procedure is used. The
results are then fine-tuned using the Cochrane-Orcutt method, unless the
--no-corc flag is specified. The --no-corc option is ignored for estimators
other than Hildreth-Lu.

Menu path:  /Model/Univariate time series/AR Errors (GLS)

# arch Estimation

Arguments:  order depvar indepvars
Option:     --quiet (don't print anything)
Example:    arch 4 y 0 x1 x2 x3

This command is retained at present for backward compatibility, but you are
better off using the maximum likelihood estimator offered by the "garch"
command; for a plain ARCH model, set the first GARCH parameter to 0.

Estimates the given model specification allowing for ARCH (Autoregressive
Conditional Heteroskedasticity). The model is first estimated via OLS, then
an auxiliary regression is run, in which the squared residual from the first
stage is regressed on its own lagged values. The final step is weighted
least squares estimation, using as weights the reciprocals of the fitted
error variances from the auxiliary regression. (If the predicted variance of
any observation in the auxiliary regression is not positive, then the
corresponding squared residual is used instead.)

The alpha values displayed below the coefficients are the estimated
parameters of the ARCH process from the auxiliary regression.

See also "garch" and "modtest" (the --arch option).

# arima Estimation

Arguments:  p d q [ ; P D Q ] ; depvar [ indepvars ]
Options:    --verbose (print details of iterations)
            --quiet (don't print out results)
            --vcv (print covariance matrix)
            --hessian (see below)
            --opg (see below)
            --nc (do not include a constant)
            --conditional (use conditional maximum likelihood)
            --x-12-arima (use X-12-ARIMA, or X13, for estimation)
            --lbfgs (use L-BFGS-B maximizer)
            --y-diff-only (ARIMAX special, see below)
Examples:   arima 1 0 2 ; y
            arima 2 0 2 ; y 0 x1 x2 --verbose
            arima 0 1 1 ; 0 1 1 ; y --nc
            See also armaloop.inp, bjg.inp

Note: arma is an acceptable alias for this command.

If no indepvars list is given, estimates a univariate ARIMA (Autoregressive,
Integrated, Moving Average) model. The values p, d and q represent the
autoregressive (AR) order, the differencing order, and the moving average
(MA) order respectively. These values may be given in numerical form, or as
the names of pre-existing scalar variables. A d value of 1, for instance,
means that the first difference of the dependent variable should be taken
before estimating the ARMA parameters.

If you wish to include only specific AR or MA lags in the model (as opposed
to all lags up to a given order) you can substitute for p and/or q either
(a) the name of a pre-defined matrix containing a set of integer values or
(b) an expression such as {1,4}; that is, a set of lags separated by commas
and enclosed in braces.

The optional integer values P, D and Q represent the seasonal AR order, the
order for seasonal differencing, and the seasonal MA order, respectively.
These are applicable only if the data have a frequency greater than 1 (for
example, quarterly or monthly data). These orders may be given in numerical
form or as scalar variables.

In the univariate case the default is to include an intercept in the model
but this can be suppressed with the --nc flag. If indepvars are added, the
model becomes ARMAX; in this case the constant should be included explicitly
if you want an intercept (as in the second example above).

An alternative form of syntax is available for this command: if you do not
want to apply differencing (either seasonal or non-seasonal), you may omit
the d and D fields altogether, rather than explicitly entering 0. In
addition, arma is a synonym or alias for arima.
Thus for example the following command is a valid way to specify an
ARMA(2, 1) model:

    arma 2 1 ; y

The default is to use the "native" gretl ARMA functionality, with estimation
by exact ML; estimation via conditional ML is available as an option. (If
X-12-ARIMA is installed you have the option of using it instead of native
code. Note that the newer X13 works as a drop-in replacement in exactly the
same way.) For details regarding these options, please see chapter 31 of the
Gretl User's Guide.

When native exact ML code is used, estimated standard errors are by default
based on a numerical approximation to the (negative inverse of the) Hessian,
with a fallback to the outer product of the gradient (OPG) if calculation of
the numerical Hessian should fail. Two (mutually exclusive) option flags can
be used to force the issue: the --opg option forces use of the OPG method,
with no attempt to compute the Hessian, while the --hessian flag disables
the fallback to OPG. Note that failure of the numerical Hessian computation
is generally an indicator of a misspecified model.

The option --lbfgs is specific to estimation using native ARMA code and
exact ML: it calls for use of the "limited memory" L-BFGS-B algorithm in
place of the regular BFGS maximizer. This may help in some instances where
convergence is difficult to achieve.

The option --y-diff-only is specific to estimation of ARIMAX models (models
with a non-zero order of integration and including exogenous regressors),
and applies only when gretl's native exact ML is used. For such models the
default behavior is to difference both the dependent variable and the
regressors, but when this option is specified only the dependent variable is
differenced, the regressors remaining in level form.
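
The following sketch illustrates the selection of specific lags and the
effect of --y-diff-only. The series names y and x are placeholders; a time
series dataset containing them is assumed to be open.

```hansl
# y and x are hypothetical series names
# AR lags 1 and 4 only, no differencing, one MA lag
arima {1,4} 0 1 ; y
# ARIMAX with d = 1: by default both y and x are differenced
arima 1 1 0 ; y 0 x
# as above, but only y is differenced; x remains in levels
arima 1 1 0 ; y 0 x --y-diff-only
```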

The AIC value given in connection with ARIMA models is calculated according
to the definition used in X-12-ARIMA, namely

    AIC = -2L + 2k

where L is the log-likelihood and k is the total number of parameters
estimated. Note that X-12-ARIMA does not produce information criteria such
as AIC when estimation is by conditional ML.

The AR and MA roots shown in connection with ARMA estimation are based on
the following representation of an ARMA(p, q) process:

    (1 - a_1*L - a_2*L^2 - ... - a_p*L^p)Y_t =
        c + (1 + b_1*L + b_2*L^2 + ... + b_q*L^q) e_t

The AR roots are therefore the solutions to

    1 - a_1*z - a_2*z^2 - ... - a_p*z^p = 0

and stability requires that these roots lie outside the unit circle.

The "frequency" figure printed in connection with AR and MA roots is the
lambda value that solves z = r * exp(i*2*pi*lambda) where z is the root in
question and r is its modulus.

Menu path:  /Model/Univariate time series/ARIMA

# arma Estimation

See "arima"; arma is an alias.

# bds Tests

Arguments:  order x
Options:    --corr1=rho (see below)
            --sdcrit=multiple (see below)
            --boot=N (see below)
            --matrix=m (use matrix input)
            --quiet (suppress printing of results)
Examples:   bds 5 x
            bds 3 --matrix=m
            bds 4 x --sdcrit=2.0

Performs the BDS (Brock, Dechert, Scheinkman and LeBaron, 1996) test for
nonlinearity of the series x. In an econometric context this is typically
used to test a regression residual for violation of the IID condition. The
test is based on a set of correlation integrals, designed to detect
nonlinearity of progressively higher dimensionality, and the order argument
sets the number of such integrals. This must be at least 2; the first
integral establishes a baseline but does not support a test.
The BDS test is of the portmanteau type: able to detect all manner of
departures from linearity but not informative about how exactly the
condition was violated.

Instead of giving x as a series, the --matrix option can be used to specify
a matrix as input. The matrix must be a vector (column or row).

Criterion for closeness

The correlation integrals are based on a measure of "closeness" of data
points, where two points are considered close if they lie within ε of each
other. The test requires a specification of ε. By default gretl follows the
recommendation of Kanzler (1999): ε is chosen such that the first-order
correlation integral is around 0.7. A common alternative (requiring less
computation) is to specify ε as a multiple of the standard deviation of the
target series. The --sdcrit option supports the latter method; in the third
example above ε is set to twice the standard deviation of x. The --corr1
option implies use of Kanzler's method but allows for a target correlation
other than 0.7. It should be clear that these two options are mutually
exclusive.

Bootstrapping

BDS test statistics are asymptotically distributed as N(0,1) but the test
over-rejects quite markedly in small to moderate-sized samples. For that
reason P-values are by default obtained via bootstrapping when x is of
length less than 600 (but by reference to the normal distribution
otherwise). If you want to use the bootstrap for larger samples you can
force the issue by giving a non-zero value for the --boot option.
Conversely, if you don't want bootstrapping for smaller samples, give a zero
value for --boot.

When bootstrapping is performed the default number of iterations is 1999,
but you can specify a different number by giving a value greater than 1 with
--boot.
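
As a sketch of the --boot settings described above, the following script
generates an artificial series of length 200 (short enough that the
bootstrap would be used by default) and runs the test twice. The sample
size, seed and iteration count are arbitrary choices made for illustration.

```hansl
# artificial IID data; the seed is arbitrary
nulldata 200
set seed 12345
series x = normal()
# P-values from the normal distribution: bootstrap switched off
bds 3 x --boot=0
# bootstrapped P-values, using 999 iterations instead of the default 1999
bds 3 x --boot=999
```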

Accessor matrix

On successful completion of this command, "$result" retrieves the test
results in the form of a matrix with two rows and order - 1 columns. The
first row contains test statistics and the second P-values for each of the
per-dimension tests under the null that x is linear/IID.

# biprobit Estimation

Arguments:  depvar1 depvar2 indepvars1 [ ; indepvars2 ]
Options:    --vcv (print covariance matrix)
            --robust (robust standard errors)
            --cluster=clustvar (see "logit" for explanation)
            --opg (see below)
            --save-xbeta (see below)
            --verbose (print extra information)
Examples:   biprobit y1 y2 0 x1 x2
            biprobit y1 y2 0 x11 x12 ; 0 x21 x22
            See also biprobit.inp

Estimates a bivariate probit model, using the Newton-Raphson method to
maximize the likelihood.

The argument list starts with the two (binary) dependent variables, followed
by a list of regressors. If a second list is given, separated by a
semicolon, this is interpreted as a set of regressors specific to the second
equation, with indepvars1 being specific to the first equation; otherwise
indepvars1 is taken to represent a common set of regressors.

By default, standard errors are computed using the analytical Hessian at
convergence. But if the --opg option is given the covariance matrix is based
on the Outer Product of the Gradient (OPG), or if the --robust option is
given QML standard errors are calculated, using a "sandwich" of the inverse
of the Hessian and the OPG.

Note that the estimate of rho, the correlation of the error terms across the
two equations, is included in the coefficient vector; it's the last element
in the accessors $coeff, $stderr and $vcv.
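
For example, the estimate of rho might be extracted as follows. This is a
sketch only: the series names y1, y2, x1 and x2 are placeholders, as in the
examples above, and a suitable dataset is assumed to be open.

```hansl
# y1, y2, x1 and x2 are hypothetical series names
biprobit y1 y2 0 x1 x2
# rho is the last element of the coefficient vector
matrix b = $coeff
scalar rho = b[rows(b)]
printf "estimated rho = %g\n", rho
```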

After successful estimation, the accessor $uhat retrieves a matrix with two
columns holding the generalized residuals for the two equations; that is,
the expected values of the disturbances conditional on the observed outcomes
and covariates. By default $yhat retrieves a matrix with four columns,
holding the estimated probabilities of the four possible joint outcomes for
(y_1, y_2), in the order (1,1), (1,0), (0,1), (0,0). Alternatively, if the
option --save-xbeta is given, $yhat has two columns and holds the values of
the index functions for the respective equations.

The output includes a test of the null hypothesis that the disturbances in
the two equations are uncorrelated. This is a likelihood ratio test unless
the QML variance estimator is requested, in which case it's a Wald test.

# bkw Tests

Option:     --quiet (don't print anything)
Examples:   longley.inp

Must follow the estimation of a model which includes at least two
independent variables. Calculates and displays diagnostic information
pertaining to collinearity, namely the BKW Table, based on the work of
Belsley, Kuh and Welsch (1980). This table presents a sophisticated analysis
of the degree and sources of collinearity, via eigenanalysis of the inverse
correlation matrix. For a thorough account of the BKW approach with
reference to gretl, and with several examples, see Adkins, Waters and Hill
(2015).

Following this command the "$result" accessor may be used to retrieve the
BKW table as a matrix. See also the "vif" command for a simpler approach to
diagnosing collinearity.

There is also a function named "bkw" which offers greater flexibility.
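
A minimal sketch of retrieving the table as a matrix is shown below. It
assumes the sample file data4-1 supplied with gretl and its series price,
sqft, bedrms and baths; any model with at least two independent variables
would serve.

```hansl
# data4-1 and its series names are assumptions made for illustration
open data4-1
ols price const sqft bedrms baths --quiet
# compute the BKW table without printing it, then fetch it as a matrix
bkw --quiet
matrix B = $result
print B
```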

Menu path:  Model window, /Analysis/Collinearity

# boxplot Graphs

Argument:   varlist
Options:    --notches (show 90 percent interval for median)
            --factorized (see below)
            --panel (see below)
            --matrix=name (plot columns of named matrix)
            --output=filename (send output to specified file)

These plots display the distribution of a variable. The central box encloses
the middle 50 percent of the data, i.e. it is bounded by the first and third
quartiles. The "whiskers" extend from each end of the box for a range equal
to 1.5 times the interquartile range. Observations outside that range are
considered outliers and represented via dots. A line is drawn across the box
at the median. A "+" sign is used to indicate the mean. If the option of
showing a confidence interval for the median is selected, this is computed
via the bootstrap method and shown in the form of dashed horizontal lines
above and/or below the median.

The --factorized option allows you to examine the distribution of a chosen
variable conditional on the value of some discrete factor. For example, if a
data set contains wages and a gender dummy variable you can select the wage
variable as the target and gender as the factor, to see side-by-side
boxplots of male and female wages, as in

    boxplot wage gender --factorized

Note that in this case you must specify exactly two variables, with the
factor given second.

If the current data set is a panel, and just one variable is specified, the
--panel option produces a series of side-by-side boxplots, one for each
panel "unit" or group.

Generally, the argument varlist is required, and refers to one or more
series in the current dataset (given either by name or ID number). But if a
named matrix is supplied via the --matrix option this argument becomes
optional: by default a plot is drawn for each column of the specified
matrix.
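
The matrix variant might be used as in the following sketch, which draws one
boxplot per column of an artificial matrix. The matrix dimensions and the
output file name boxes.pdf are arbitrary choices made for illustration.

```hansl
# 100 x 3 matrix of standard normal draws
matrix M = mnormal(100, 3)
# no varlist is given: one boxplot is drawn per column of M
# boxes.pdf is a hypothetical output file name
boxplot --matrix=M --output=boxes.pdf
```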

Gretl's boxplots are generated using gnuplot, and it is possible to specify
the plot more fully by appending additional gnuplot commands, enclosed in
braces. For details, please see the help for the "gnuplot" command.

In interactive mode the result is displayed immediately. In batch mode the
default behavior is that a gnuplot command file is written in the user's
working directory, with a name on the pattern gpttmpN.plt, starting with N =
01. The actual plots may be generated later using gnuplot (under MS Windows,
wgnuplot). This behavior can be modified by use of the --output=filename
option. For details, please see the "gnuplot" command.

Menu path:    /View/Graph specified vars/Boxplots

# break Programming

Break out of a loop. This command can be used only within a loop; it causes
command execution to break out of the current (innermost) loop. See also
"loop".

# catch Programming

Syntax:     catch command

This is not a command in its own right but can be used as a prefix to most
regular commands: the effect is to prevent termination of a script if an
error occurs in executing the command. If an error does occur, this is
registered in an internal error code which can be accessed as $error (a zero
value indicates success). The value of $error should always be checked
immediately after using catch, and appropriate action taken if the command
failed.

The catch keyword cannot be used before if, elif or endif. In addition it
should not be used on calls to user-defined functions; it is intended for
use only with gretl commands and calls to "built-in" functions or operators.
Furthermore, catch cannot be used in conjunction with "back-arrow"
assignment of models or plots to session icons (see chapter 3 of the Gretl
User's Guide).
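
The pattern of checking $error immediately after catch can be sketched as
follows. The artificial data and the messages printed are illustrative
only; the built-in errmsg function is used to turn the error code into
text.

```hansl
# artificial data, purely for illustration
nulldata 50
series x = normal()
series y = 1 + x + normal()

# a failure here would not terminate the script
catch ols y const x
# check the error code straight away
if $error
    printf "Estimation failed: %s\n", errmsg($error)
else
    printf "Estimation succeeded, R-squared = %g\n", $rsq
endif
```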

# chow Tests

Variants:   chow obs
            chow dummyvar --dummy
Options:    --dummy (use a pre-existing dummy variable)
            --quiet (don't print estimates for augmented model)
            --limit-to=list (limit test to subset of regressors)
Examples:   chow 25
            chow 1988:1
            chow female --dummy

Must follow an OLS regression. If an observation number or date is given,
provides a test for the null hypothesis of no structural break at the given
split point. The procedure is to create a dummy variable which equals 1 from
the split point specified by obs to the end of the sample, 0 otherwise, and
also interaction terms between this dummy and the original regressors. If a
dummy variable is given, tests the null hypothesis of structural homogeneity
with respect to that dummy. Again, interaction terms are added. In either
case an augmented regression is run including the additional terms.

By default an F statistic is calculated, taking the augmented regression as
the unrestricted model and the original as the restricted. But if the
original model used a robust estimator for the covariance matrix, the test
statistic is a Wald chi-square value based on a robust estimator of the
covariance matrix for the augmented regression.

The --limit-to option can be used to limit the set of interactions with the
split dummy variable to a subset of the original regressors. The parameter
for this option must be a named list, all of whose members are among the
original regressors. The list should not include the constant.

Menu path:    Model window, /Tests/Chow test

# clear Programming

Options:    --dataset (clear dataset only)
            --functions (clear functions (only))

By default this command clears the current dataset (if any) plus all saved
variables (scalars, matrices, etc.) out of memory.
Note that opening a new dataset, or using the "nulldata" command to create
an empty dataset, also has this effect, so explicit use of "clear" is not
usually necessary.

If the --dataset option is given, then only the dataset is cleared (plus any
named lists of series); other saved objects such as matrices, scalars and
bundles are preserved.

If the --functions option is given, then any user-defined functions, and any
functions defined by packages that have been loaded, are cleared out of
memory. The dataset and other variables are not affected.

# coeffsum Tests

Argument:   varlist
Option:     --quiet (don't print anything)
Examples:   coeffsum xt xt_1 xr_2
            See also restrict.inp

Must follow a regression. Calculates the sum of the coefficients on the
variables in varlist. Prints this sum along with its standard error and the
p-value for the null hypothesis that the sum is zero.

Note the difference between this and "omit", which tests the null hypothesis
that the coefficients on a specified subset of independent variables are all
equal to zero.

The --quiet option may be useful if one just wants access to the "$test" and
"$pvalue" values that are recorded on successful completion.

Menu path:    Model window, /Tests/Sum of coefficients

# coint Tests

Arguments:  order depvar indepvars
Options:    --nc (do not include a constant)
            --ct (include constant and trend)
            --ctt (include constant and quadratic trend)
            --seasonals (include seasonal dummy variables)
            --skip-df (no DF tests on individual variables)
            --test-down[=criterion] (automatic lag order)
            --verbose (print extra details of regressions)
            --silent (don't print anything)
Examples:   coint 4 y x1 x2
            coint 0 y x1 x2 --ct --skip-df

The Engle-Granger (1987) cointegration test.
The default procedure is: (1) carry out Dickey-Fuller tests on the null
hypothesis that each of the variables listed has a unit root; (2) estimate
the cointegrating regression; and (3) run a DF test on the residuals from
the cointegrating regression. If the --skip-df flag is given, step (1) is
omitted.

If the specified lag order is positive all the Dickey-Fuller tests use that
order, with this qualification: if the --test-down option is given, the
given value is taken as the maximum and the actual lag order used in each
case is obtained by testing down. See the "adf" command for details of this
procedure.

By default, the cointegrating regression contains a constant. If you wish to
suppress the constant, add the --nc flag. If you wish to augment the list of
deterministic terms in the cointegrating regression with a linear or
quadratic trend, add the --ct or --ctt flag. These option flags are mutually
exclusive. You also have the option of adding seasonal dummy variables (in
the case of quarterly or monthly data).

P-values for this test are based on MacKinnon (1996). The relevant code is
included by kind permission of the author.

For the cointegration tests due to Søren Johansen, see "johansen".

Menu path:    /Model/Multivariate time series

# corr Statistics

Variants:   corr [ varlist ]
            corr --matrix=matname
Options:    --uniform (ensure uniform sample)
            --spearman (Spearman's rho)
            --kendall (Kendall's tau)
            --verbose (print rankings)
            --plot=mode-or-filename (see below)
            --triangle (only plot lower half, see below)
Examples:   corr y x1 x2 x3
            corr ylist --uniform
            corr x y --spearman
            corr --matrix=X --plot=display

By default, prints the pairwise correlation coefficients (Pearson's
product-moment correlation) for the variables in varlist, or for all
variables in the data set if varlist is not given.
The standard behavior is to use all available observations for computing
each pairwise coefficient, but if the --uniform option is given the sample
is limited (if necessary) so that the same set of observations is used for
all the coefficients. This option has an effect only if there are differing
numbers of missing values for the variables used.

The (mutually exclusive) options --spearman and --kendall produce,
respectively, Spearman's rank correlation rho and Kendall's rank correlation
tau in place of the default Pearson coefficient. When either of these
options is given, varlist should contain just two variables.

When a rank correlation is computed, the --verbose option can be used to
print the original and ranked data (otherwise this option is ignored).

If varlist contains more than two series and the program is not in batch
mode, a "heatmap" plot of the correlation matrix is shown. This can be
adjusted via the --plot option. The acceptable parameters to this option are
none (to suppress the plot); display (to display a plot even when in batch
mode); or a file name. The effect of providing a file name is as described
for the --output option of the "gnuplot" command. When plotting is active
the option --triangle can be used to show only the lower triangle of the
matrix plot.

If the alternative form is given, using a named matrix rather than a list of
series, the --spearman and --kendall options are not available -- but see
the "npcorr" function.

The "$result" accessor can be used to obtain the correlations as a matrix.
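
A minimal sketch of saving the correlations for later use, assuming the
sample data file data4-1 (supplied with gretl); the choice of series and
the matrix name C are illustrative.

```hansl
# assumption: data4-1 is available among the sample files
open data4-1
# use a common sample for all coefficients and suppress the heatmap
corr price sqft bedrms --uniform --plot=none
# retrieve the correlation matrix via the $result accessor
matrix C = $result
print C
```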

Menu path:    /View/Correlation matrix
Other access: Main window pop-up menu (multiple selection)

# corrgm Statistics

Arguments:  series [ order ]
Options:    --bartlett (use Bartlett standard errors)
            --plot=mode-or-filename (see below)
            --quiet (suppress the plot)
Example:    corrgm x 12

Prints the values of the autocorrelation function (ACF) for series, which
may be specified by name or number. The values are defined as rho(u_t,
u_t-s) where u_t is the t^th observation of the variable u and s denotes the
number of lags.

The partial autocorrelations (PACF, calculated using the Durbin-Levinson
algorithm) are also shown: these are net of the effects of intervening lags.
In addition the Ljung-Box Q statistic is printed. This may be used to test
the null hypothesis that the series is "white noise"; it is asymptotically
distributed as chi-square with degrees of freedom equal to the number of
lags used.

Asterisks are used to indicate statistical significance of the individual
autocorrelations. By default this is assessed using a standard error of one
over the square root of the sample size, but if the --bartlett option is
given then Bartlett standard errors are used for the ACF. This option also
governs the confidence band drawn in the ACF plot, if applicable.

If an order value is specified the length of the correlogram is limited to
at most that number of lags, otherwise the length is determined
automatically, as a function of the frequency of the data and the number of
observations.

By default, a plot of the correlogram is produced: a gnuplot graph in
interactive mode or an ASCII graphic in batch mode. This can be adjusted via
the --plot option.
The acceptable parameters to this option are none (to suppress the plot);
ascii (to produce a text graphic even when in interactive mode); display (to
produce a gnuplot graph even when in batch mode); or a file name. The effect
of providing a file name is as described for the --output option of the
"gnuplot" command.

Upon successful completion, the accessors "$test" and "$pvalue" contain the
corresponding figures of the Ljung-Box test for the maximum order displayed.
Note that if you just want to compute the Q statistic, you'll probably want
to use the "ljungbox" function instead.

Menu path:    /Variable/Correlogram
Other access: Main window pop-up menu (single selection)

# cusum Tests

Options:    --squares (perform the CUSUMSQ test)
            --quiet (just print the Harvey-Collier test)
            --plot=mode-or-filename (see below)

Must follow the estimation of a model via OLS. Performs the CUSUM test -- or
if the --squares option is given, the CUSUMSQ test -- for parameter
stability. A series of one-step ahead forecast errors is obtained by running
a series of regressions: the first regression uses the first k observations
and is used to generate a prediction of the dependent variable at
observation k + 1; the second uses the first k + 1 observations and
generates a prediction for observation k + 2, and so on (where k is the
number of parameters in the original model).

The cumulated sum of the scaled forecast errors, or the squares of these
errors, is printed. The null hypothesis of parameter stability is rejected
at the 5 percent significance level if the cumulated sum strays outside of
the 95 percent confidence band.

In the case of the CUSUM test, the Harvey-Collier t-statistic for testing
the null hypothesis of parameter stability is also printed. See Greene's
Econometric Analysis for details.
For the CUSUMSQ test, the 95 percent confidence band is calculated using
the algorithm given in Edgerton and Wells (1994).

By default, if the program is not in batch mode a plot of the cumulated
series and confidence band is shown. This can be adjusted via the --plot
option. The acceptable parameters to this option are none (to suppress the
plot); display (to display a plot even when in batch mode); or a file name.
The effect of providing a file name is as described for the --output option
of the "gnuplot" command.

Menu path:    Model window, /Tests/CUSUM(SQ)

# data Dataset

Argument:   varlist
Options:    --compact=method (specify compaction method)
            --quiet (don't report results except on error)
            --name=identifier (rename imported series)
            --odbc (import from ODBC database)
            --no-align (ODBC-specific, see below)

Reads the variables in varlist from a database file (native gretl, RATS 4.0
or PcGive), which must have been opened previously using the "open" command.
The data command can also be used to import series from DB.NOMICS or from an
ODBC database; for details on those variants see gretl + DB.NOMICS or
chapter 42 of the Gretl User's Guide, respectively.

The data frequency and sample range may be established via the "setobs" and
"smpl" commands prior to using this command. Here's an example:

        open fedstl.bin
        setobs 12 2000:01
        smpl ; 2019:12
        data unrate cpiaucsl

The commands above open the database named fedstl.bin (which is supplied
with gretl), establish a monthly dataset starting in January 2000 and ending
in December of 2019, and then import the series named unrate (unemployment
rate) and cpiaucsl (all-items CPI).

If setobs and smpl are not specified in this way, the data frequency and
sample range are set using the first variable read from the database.

If the series to be read are of higher frequency than the working dataset,
you may specify a compaction method as below:

        data LHUR PUNEW --compact=average

The five available compaction methods are "average" (takes the mean of the
high frequency observations), "last" (uses the last observation), "first",
"sum" and "spread". If no method is specified, the default is to use the
average. The "spread" method is special: no information is lost, rather it
is spread across multiple series, one per sub-period. So for example when
adding a monthly series to a quarterly dataset three series are created, one
for each month of the quarter; their names bear the suffixes m01, m02 and
m03.

If the series to be read are of lower frequency than the working dataset the
values of the added data are simply repeated as required, but note that the
"tdisagg" function can then be used for distribution or interpolation
("temporal disaggregation").

In the case of native gretl databases (only), the "glob" characters * and ?
can be used in varlist to import series that match the given pattern. For
example, the following will import all series in the database whose names
begin with cpi:

        data cpi*

The --name option can be used to set a name for the imported series other
than the original name in the database. The parameter must be a valid gretl
identifier. This option is restricted to the case where a single series is
specified for importation.

The --no-align option applies only to importation of series via ODBC. By
default we require that the ODBC query returns information telling gretl on
which rows of the dataset to place the incoming data -- or at least that the
number of incoming values matches either the length of the dataset or the
length of the current sample range.
Setting the --no-align option relaxes this requirement: failing the
conditions just mentioned, incoming values are simply placed consecutively
starting at the first row of the dataset. If there are fewer such values
than rows in the dataset the trailing rows are filled with NAs; if there are
more such values than rows the extra values are discarded. For more on ODBC
importation see chapter 42 of the Gretl User's Guide.

Menu path:    /File/Databases

# dataset Dataset

Arguments:  keyword parameters
Option:     --panel-time (see addobs below)
Examples:   dataset addobs 24
            dataset addobs 2 --panel-time
            dataset insobs 10
            dataset compact 1
            dataset compact 4 last
            dataset expand
            dataset transpose
            dataset sortby x1
            dataset resample 500
            dataset renumber x 4
            dataset pad-daily 7
            dataset clear

Performs various operations on the data set as a whole, depending on the
given keyword, which must be addobs, insobs, clear, compact, expand,
transpose, sortby, dsortby, resample, renumber or pad-daily. Note: with the
exception of clear, these actions are not available when the dataset is
currently subsampled by selection of cases on some Boolean criterion.

addobs: Must be followed by a positive integer, call it n. Adds n extra
observations to the end of the working dataset. This is primarily intended
for forecasting purposes. The values of most variables over the additional
range will be set to missing, but certain deterministic variables are
recognized and extended, namely, a simple linear trend and periodic dummy
variables. If the dataset takes the form of a panel, the --panel-time flag
can be used to lengthen the time series for each cross-sectional unit (the
default action being to add n such units).

insobs: Must be followed by a positive integer no greater than the current
number of observations.
Inserts a single observation at the specified position. All subsequent
data are shifted by one place and the dataset is extended by one
observation. All variables apart from the constant are given missing values
at the new observation. This action is not available for panel datasets.

clear: No parameter required. Clears out the current data, returning gretl
to its initial "empty" state.

compact: Must be followed by a positive integer representing a new data
frequency, which should be lower than the current frequency (for example, a
value of 4 when the current frequency is 12 indicates compaction from
monthly to quarterly). This command is available for time series data only;
it compacts all the series in the data set to the new frequency. A second
parameter may be given, namely one of sum, first, last or spread, to
specify, respectively, compaction using the sum of the higher-frequency
values, start-of-period values, end-of-period values, or spreading of the
higher-frequency values across multiple series (one per sub-period). The
default is to compact by averaging.

expand: This command is only available for annual or quarterly time series
data: annual data can be expanded to quarterly or monthly, and quarterly
data to monthly. All series in the data set are padded out to the new
frequency by repeating the existing values. If the original dataset is
annual the default expansion is to quarterly but expand can be followed by
12 to request monthly.

transpose: No additional parameter required. Transposes the current data
set. That is, each observation (row) in the current data set will be treated
as a variable (column), and each variable as an observation. This command
may be useful if data have been read from some external source in which the
rows of the data table represent variables.

sortby: The name of a single series or list is required. If one series is
given, the observations on all variables in the dataset are re-ordered by
increasing value of the specified series. If a list is given, the sort
proceeds hierarchically: if the observations are tied in sort order with
respect to the first key variable then the second key is used to break the
tie, and so on until the tie is broken or the keys are exhausted. Note that
this command is available only for undated data.

dsortby: Works as sortby except that the re-ordering is by decreasing value
of the key series.

resample: Constructs a new dataset by random sampling, with replacement, of
the rows of the current dataset. One argument is required, namely the number
of rows to include. This may be less than, equal to, or greater than the
number of observations in the original data. The original dataset can be
retrieved via the command smpl full.

renumber: Requires the name of an existing series followed by an integer
between 1 and the number of series in the dataset minus one. Moves the
specified series to the specified position in the dataset, renumbering the
other series accordingly. (Position 0 is occupied by the constant, which
cannot be moved.)

pad-daily: Valid only if the current dataset contains dated daily data with
an incomplete calendar. The effect is to pad the data out to a complete
calendar by inserting blank rows (that is, rows containing nothing but NAs).
This action requires an integer parameter, namely the number of days per
week, which must be 5, 6 or 7, and must be greater than or equal to the
current data frequency. On successful completion, the data calendar will be
"complete" relative to this value. For example if days-per-week is 5 then
all weekdays will be represented, whether or not any data are available for
those days.
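
The compact and sortby keywords described above can be sketched as follows.
This is a minimal illustration on artificial data; the series names x and k
and the dataset sizes are made up for the example.

```hansl
# artificial monthly dataset: 24 observations starting in 2020:01
nulldata 24
setobs 12 2020:01 --time-series
series x = index
# compact from monthly to quarterly, keeping end-of-period values
dataset compact 4 last
print x --byobs

# sortby is for undated data only, so start afresh
nulldata 8
series k = uniform()
# re-order all observations by increasing value of k
dataset sortby k
print k --byobs
```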

Menu path:    /Data

# delete Dataset

Variants:   delete varlist
            delete varname
            delete --type=type-name
            delete pkgname
Options:    --db (delete series from database)
            --force (see below)

This command is an all-purpose destructor. It should be used with caution;
no confirmation is asked.

In the first form above, varlist is a list of series, given by name or ID
number. Note that when you delete series any series with higher ID numbers
than those on the deletion list will be re-numbered. If the --db option is
given, this command deletes the listed series not from the current dataset
but from a gretl database, assuming that a database has been opened, and the
user has write permission for the file in question. See also the "open"
command.

In the second form, the name of a scalar, matrix, string or bundle may be
given for deletion. The --db option is not applicable in this case. Note
that series and variables of other types should not be mixed in a given call
to delete.

In the third form, the --type option must be accompanied by one of the
following type-names: matrix, bundle, string, list, scalar or array. The
effect is to delete all variables of the given type. In this case no
argument other than the option should be given.

The fourth form can be used to unload a function package. In this case the
.gfn suffix must be supplied, as in

        delete somepkg.gfn

Note that this does not delete the package file, it just unloads the package
from memory.

Deleting variables in a loop

In general it is not permitted to delete variables in the context of a loop,
since this may threaten the integrity of the loop code. However, if you are
confident that deleting a certain variable is safe you can override this
prohibition by appending the --force flag to the delete command.
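
A minimal sketch of the override, assuming the temporary matrix tmp is not
referenced again within the loop once it has been deleted; the names and
dimensions are illustrative.

```hansl
loop i=1..3
    # scratch matrix, recreated on each iteration
    matrix tmp = mnormal(2, 2)
    printf "iteration %d: trace = %g\n", i, tr(tmp)
    # without --force, deletion inside a loop is refused
    delete tmp --force
endloop
```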

Menu path:    Main window pop-up (single selection)

# diff Transformations

Argument:   varlist
Examples:   penngrow.inp, sw_ch12.inp, sw_ch14.inp

The first difference of each variable in varlist is obtained and the result
stored in a new variable with the prefix d_. Thus "diff x y" creates the new
variables

        d_x = x(t) - x(t-1)
        d_y = y(t) - y(t-1)

Menu path:    /Add/First differences of selected variables

# difftest Tests

Arguments:  series1 series2
Options:    --sign (Sign test, the default)
            --rank-sum (Wilcoxon rank-sum test)
            --signed-rank (Wilcoxon signed-rank test)
            --verbose (print extra output)
            --quiet (suppress printed output)
Examples:   ooballot.inp

Carries out a nonparametric test for a difference between two populations or
groups, the specific test depending on the option selected.

With the --sign option, the Sign test is performed. This test is based on
the fact that if two samples, x and y, are drawn randomly from the same
distribution, the probability that x_i > y_i, for each observation i, should
equal 0.5. The test statistic is w, the number of observations for which x_i
> y_i. Under the null hypothesis this follows the Binomial distribution with
parameters (n, 0.5), where n is the number of observations.

With the --rank-sum option, the Wilcoxon rank-sum test is performed. This
test proceeds by ranking the observations from both samples jointly, from
smallest to largest, then finding the sum of the ranks of the observations
from one of the samples. The two samples do not have to be of the same size,
and if they differ the smaller sample is used in calculating the rank-sum.
Under the null hypothesis that the samples are drawn from populations with
the same median, the probability distribution of the rank-sum can be
computed for any given sample sizes; and for reasonably large samples a
close Normal approximation exists.

With the --signed-rank option, the Wilcoxon signed-rank test is performed.
This is designed for matched data pairs such as, for example, the values of
a variable for a sample of individuals before and after some treatment. The
test proceeds by finding the differences between the paired observations,
x_i - y_i, ranking these differences by absolute value, then assigning to
each pair a signed rank, the sign agreeing with the sign of the difference.
One then calculates W_+, the sum of the positive signed ranks. As with the
rank-sum test, this statistic has a well-defined distribution under the null
that the median difference is zero, which converges to the Normal for
samples of reasonable size.

For the Wilcoxon tests, if the --verbose option is given then the ranking is
printed. (This option has no effect if the Sign test is selected.)

On successful completion the accessors "$test" and "$pvalue" are available.
If one just wants to obtain these values the --quiet flag can be appended to
the command.

# discrete Transformations

Argument:   varlist
Option:     --reverse (mark variables as continuous)
Examples:   ooballot.inp, oprobit.inp

Marks each variable in varlist as being discrete. By default all variables
are treated as continuous; marking a variable as discrete affects the way
the variable is handled in frequency plots, and also allows you to select
the variable for the command "dummify".

If the --reverse flag is given, the operation is reversed; that is, the
variables in varlist are marked as being continuous.
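
A short sketch of marking a series as discrete and then undoing the
marking. The artificial series g, drawn from a discrete uniform
distribution on 1 to 4, is an assumption made for the example.

```hansl
# artificial data: integer values between 1 and 4
nulldata 40
series g = randgen(i, 1, 4)
# mark g as discrete, then show its frequency distribution
discrete g
freq g
# mark g as continuous again
discrete g --reverse
```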

Menu path:    /Variable/Edit attributes

# dpanel Estimation

Argument:   p ; depvar indepvars [ ; instruments ]
Options:    --quiet (don't show estimated model)
            --vcv (print covariance matrix)
            --two-step (perform 2-step GMM estimation)
            --system (add equations in levels)
            --time-dummies (add time dummy variables)
            --dpdstyle (emulate DPD package for Ox)
            --asymptotic (uncorrected asymptotic standard errors)
            --keep-extra (see below)
Examples:   dpanel 2 ; y x1 x2
            dpanel 2 ; y x1 x2 --system
            dpanel {2 3} ; y x1 x2 ; x1
            dpanel 1 ; y x1 x2 ; x1 GMM(x2,2,3)
            See also bbond98.inp

Carries out estimation of dynamic panel data models (that is, panel models
including one or more lags of the dependent variable) using either the
GMM-DIF or GMM-SYS method.

The parameter p represents the order of the autoregression for the dependent
variable. In the simplest case this is a scalar value, but a pre-defined
matrix may be given for this argument, to specify a set of (possibly
non-contiguous) lags to be used.

The dependent variable and regressors should be given in levels form; they
will be differenced automatically (since this estimator uses differencing to
cancel out the individual effects).

The last (optional) field in the command is for specifying instruments. If
no instruments are given, it is assumed that all the independent variables
are strictly exogenous. If you specify any instruments, you should include
in the list any strictly exogenous independent variables. For predetermined
regressors, you can use the GMM function to include a specified range of
lags in block-diagonal fashion. This is illustrated in the fourth example
above. The first argument to GMM is the name of the variable in question,
the second is the minimum lag to be used as an instrument, and the third is
the maximum lag.
The same syntax can be used with the GMMlevel function to specify GMM-type
instruments for the equations in levels.

By default the results of 1-step estimation are reported (with robust
standard errors). You may select 2-step estimation as an option. In both
cases tests for autocorrelation of orders 1 and 2 are provided, as well as
the Sargan overidentification test and a Wald test for the joint
significance of the regressors. Note that in this differenced model
first-order autocorrelation is not a threat to the validity of the model,
but second-order autocorrelation violates the maintained statistical
assumptions.

In the case of 2-step estimation, standard errors are by default computed
using the finite-sample correction suggested by Windmeijer (2005). The
standard asymptotic standard errors associated with the 2-step estimator are
generally reckoned to be an unreliable guide to inference, but if for some
reason you want to see them you can use the --asymptotic option to turn off
the Windmeijer correction.

If the --time-dummies option is given, a set of time dummy variables is
added to the specified regressors. The number of dummies is one less than
the maximum number of periods used in estimation, to avoid perfect
collinearity with the constant. The dummies are entered in differenced form
unless the --dpdstyle option is given, in which case they are entered in
levels.

As with other estimation commands, a "$model" bundle is available after
estimation. In the case of dpanel, the --keep-extra option can be used to
save additional information in this bundle, namely the GMM weight and
instrument matrices.

For further details and examples, please see chapter 24 of the Gretl User's
Guide.
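
A hedged sketch of 2-step estimation with the extra matrices retained. It
assumes the Arellano-Bond panel data file abdata.gdt supplied with gretl,
and that it contains series named n, w and k; the specification is chosen
only for illustration and is not taken from the text above.

```hansl
# assumption: abdata.gdt holds the series n, w and k
open abdata.gdt
# 2-step GMM-DIF with two lags of the dependent variable,
# keeping the GMM weight and instrument matrices in $model
dpanel 2 ; n w k --two-step --keep-extra
bundle b = $model
# list the contents of the bundle
print b
```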

Menu path:  /Model/Panel/Dynamic panel model

# dummify Transformations

Argument:   varlist
Options:    --drop-first (omit lowest value from encoding)
            --drop-last (omit highest value from encoding)

For any suitable variables in varlist, creates a set of dummy variables
coding for the distinct values of that variable. Suitable variables are
those that have been explicitly marked as discrete, or those that take on a
fairly small number of values all of which are "fairly round" (multiples of
0.25).

By default a dummy variable is added for each distinct value of the variable
in question. For example if a discrete variable x has 5 distinct values, 5
dummy variables will be added to the data set, with names Dx_1, Dx_2 and so
on. The first dummy variable will have value 1 for observations where x
takes on its smallest value, 0 otherwise; the next dummy will have value 1
when x takes on its second-smallest value, and so on. If one of the option
flags --drop-first or --drop-last is added, then either the lowest or the
highest value of each variable is omitted from the encoding (which may be
useful for avoiding the "dummy variable trap").

This command can also be embedded in the context of a regression
specification. For example, the following line specifies a model where y is
regressed on the set of dummy variables coding for x. (Option flags cannot
be passed to "dummify" in this context.)

  ols y dummify(x)

Other access: Main window pop-up menu (single selection)

# duration Estimation

Arguments:  depvar indepvars [ ; censvar ]
Options:    --exponential (use exponential distribution)
            --loglogistic (use log-logistic distribution)
            --lognormal (use log-normal distribution)
            --medians (fitted values are medians)
            --robust (robust (QML) standard errors)
            --cluster=clustvar (see "logit" for explanation)
            --vcv (print covariance matrix)
            --verbose (print details of iterations)
            --quiet (don't print anything)
Examples:   duration y 0 x1 x2
            duration y 0 x1 x2 ; cens
            See also weibull.inp

Estimates a duration model: the dependent variable (which must be positive)
represents the duration of some state of affairs, for example the length of
spells of unemployment for a cross-section of respondents. By default the
Weibull distribution is used but the exponential, log-logistic and
log-normal distributions are also available.

If some of the duration measurements are right-censored (e.g. an
individual's spell of unemployment has not come to an end within the period
of observation) then you should supply the trailing argument censvar, a
series in which non-zero values indicate right-censored cases.

By default the fitted values obtained via the accessor $yhat are the
conditional means of the durations, but if the --medians option is given
then $yhat provides the conditional medians instead.

Please see chapter 38 of the Gretl User's Guide for details.

Menu path:  /Model/Limited dependent variable/Duration data

# elif Programming

See "if".

# else Programming

See "if". Note that "else" requires a line to itself, before the following
conditional command.
You can append a comment, as in

  else # OK, do something different

But you cannot append a command, as in

  else x = 5 # wrong!

# end Programming

Ends a block of commands of some sort. For example, "end system" terminates
an equation "system".

# endif Programming

See "if".

# endloop Programming

Marks the end of a command loop. See "loop".

# eqnprint Printing

Options:    --complete (create a complete document)
            --output=filename (send output to specified file)

Must follow the estimation of a model. Prints the estimated model in the
form of a LaTeX equation. If a filename is specified using the --output
option output goes to that file, otherwise it goes to a file with a name of
the form equation_N.tex, where N is the number of models estimated to date
in the current session. See also "tabprint".

The output file will be written in the currently set "workdir", unless the
filename string contains a full path specification.

If the --complete flag is given, the LaTeX file is a complete document,
ready for processing; otherwise it must be included in a document.

Menu path:  Model window, /LaTeX

# equation Estimation

Arguments:  depvar indepvars
Example:    equation y x1 x2 x3 const

Specifies an equation within a system of equations (see "system"). The
syntax for specifying an equation within an SUR system is the same as that
for, e.g., "ols". For an equation within a Three-Stage Least Squares system
you may either (a) give an OLS-type equation specification and provide a
common list of instruments using the "instr" keyword (again, see "system"),
or (b) use the same equation syntax as for "tsls".

# estimate Estimation

Arguments:  [ systemname ] [ estimator ]
Options:    --iterate (iterate to convergence)
            --no-df-corr (no degrees of freedom correction)
            --geomean (see below)
            --quiet (don't print results)
            --verbose (print details of iterations)
Examples:   estimate "Klein Model 1" method=fiml
            estimate Sys1 method=sur
            estimate Sys1 method=sur --iterate

Calls for estimation of a system of equations, which must have been
previously defined using the "system" command. The name of the system should
be given first, surrounded by double quotes if the name contains spaces. The
estimator, which must be one of "ols", "tsls", "sur", "3sls", "fiml" or
"liml", is preceded by the string method=. These arguments are optional if
the system in question has already been estimated and occupies the place of
the "last model"; in that case the estimator defaults to the previously used
value.

If the system in question has had a set of restrictions applied (see the
"restrict" command), estimation will be subject to the specified
restrictions.

If the estimation method is "sur" or "3sls" and the --iterate flag is given,
the estimator will be iterated. In the case of SUR, if the procedure
converges the results are maximum likelihood estimates. Iteration of
three-stage least squares, however, does not in general converge on the
full-information maximum likelihood results. The --iterate flag is ignored
for other methods of estimation.

If the equation-by-equation estimators "ols" or "tsls" are chosen, the
default is to apply a degrees of freedom correction when calculating
standard errors. This can be suppressed using the --no-df-corr flag. This
flag has no effect with the other estimators; no degrees of freedom
correction is applied in any case.
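
The following sketch shows how a system might be defined and then estimated
with the flags described above. The series y1, y2, x1 and x2 and the system
name Sys1 are hypothetical; substitute names from your own dataset.

  # y1, y2, x1, x2 are hypothetical series in the current dataset
  system name="Sys1"
      equation y1 const x1
      equation y2 const x2
  end system
  estimate "Sys1" method=sur
  # re-estimate, iterating SUR to convergence
  estimate "Sys1" method=sur --iterate
  # equation-by-equation OLS without the degrees of freedom correction
  estimate "Sys1" method=ols --no-df-corr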

By default, the formula used in calculating the elements of the
cross-equation covariance matrix is

  sigma(i,j) = u(i)' * u(j) / T

If the --geomean flag is given, a degrees of freedom correction is applied:
the formula is

  sigma(i,j) = u(i)' * u(j) / sqrt((T - ki) * (T - kj))

where the ks denote the number of independent parameters in each equation.

If the --verbose option is given and an iterative method is specified,
details of the iterations are printed.

# eval Utilities

Argument:   expression
Examples:   eval x
            eval inv(X'X)
            eval sqrt($pi)

This command makes gretl act like a glorified calculator. The program
evaluates expression and prints its value. The argument may be the name of a
variable, or something more complicated. In any case, it should be an
expression which could stand as the right-hand side of an assignment
statement.

In interactive use (for instance in the gretl console) an equals sign works
as shorthand for eval, as in

  =sqrt(x)

(with or without a space following "="). But this variant is not accepted in
scripting mode since it could easily mask coding errors.

In most contexts "print" can be used in place of eval to much the same
effect. See also "printf" for the case where you wish to combine textual and
numerical output.
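
A short illustration of eval in a script, using a small matrix and a scalar
defined on the spot (the names X and s are arbitrary):

  matrix X = {1, 2; 3, 4}
  scalar s = 2
  eval X'X        # a matrix-valued expression
  eval det(X)     # a scalar-valued expression
  eval sqrt(s)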

# fcast Prediction

Variants:   fcast [startobs endobs] [vname]
            fcast [startobs endobs] steps-ahead [vname] --recursive
Options:    --dynamic (create dynamic forecast)
            --static (create static forecast)
            --out-of-sample (generate post-sample forecast)
            --no-stats (don't print forecast statistics)
            --stats-only (only print forecast statistics)
            --quiet (don't print anything)
            --recursive (see below)
            --plot=filename (see below)
Examples:   fcast 1997:1 2001:4 f1
            fcast fit2
            fcast 2004:1 2008:3 4 rfcast --recursive
            See also gdp_midas.inp

Must follow an estimation command. Forecasts are generated for a certain
range of observations: if startobs and endobs are given, for that range (if
possible); otherwise if the --out-of-sample option is given, for
observations following the range over which the model was estimated;
otherwise over the currently defined sample range. If an out-of-sample
forecast is requested but no relevant observations are available, an error
is flagged. Depending on the nature of the model, standard errors may also
be generated; see below. Also see below for the special effect of the
--recursive option.

If the last model estimated is a single equation, then the optional vname
argument has the following effect: the forecast values are not printed, but
are saved to the dataset under the given name. If the last model is a system
of equations, vname has a different effect, namely selecting a particular
endogenous variable for forecasting (the default being to produce forecasts
for all the endogenous variables). In the system case, or if vname is not
given, the forecast values can be retrieved using the accessor "$fcast", and
the standard errors, if available, via "$fcse".
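
The two ways of obtaining forecast values can be sketched as follows. This
assumes the data9-7 sample file supplied with gretl (quarterly data) and its
series QNC, PRICE and INCOME; the file and the names fc and F are
illustrative choices, not part of this help text.

  open data9-7
  # estimate on the first part of the sample only
  smpl 1975:1 1987:4
  ols QNC const PRICE INCOME
  smpl --full
  # with vname given: values are saved as the series fc, not printed
  fcast 1988:1 1990:4 fc
  # without vname: retrieve forecasts and standard errors via accessors
  fcast --out-of-sample --quiet
  matrix F = $fcast ~ $fcse
  print F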

The choice between a static and a dynamic forecast applies only in the case
of dynamic models, with an autoregressive error process or including one or
more lagged values of the dependent variable as regressors. Static forecasts
are one step ahead, based on realized values from the previous period, while
dynamic forecasts employ the chain rule of forecasting. For example, if a
forecast for y in 2008 requires as input a value of y for 2007, a static
forecast is impossible without actual data for 2007. A dynamic forecast for
2008 is possible if a prior forecast can be substituted for y in 2007.

The default is to give a static forecast for any portion of the forecast
range that lies within the sample range over which the model was estimated,
and a dynamic forecast (if relevant) out of sample. The --dynamic option
requests a dynamic forecast from the earliest possible date, and the
--static option requests a static forecast even out of sample.

The --recursive option is presently available only for single-equation
models estimated via OLS. When this option is given the forecasts are
recursive. That is, each forecast is generated from an estimate of the given
model using data from a fixed starting point (namely, the start of the
sample range for the original estimation) up to the forecast date minus k,
where k is the number of steps ahead, which must be given in the steps-ahead
argument. The forecasts are always dynamic if this is applicable. Note that
the steps-ahead argument should be given only in conjunction with the
--recursive option.

The --plot option (available only in the case of single-equation estimation)
calls for a plot file to be produced, containing a graphical representation
of the forecast.
The suffix of the filename argument to this option controls
the format of the plot: .eps for EPS, .pdf for PDF, .png for PNG, .plt for a
gnuplot command file. The dummy filename display can be used to force
display of the plot in a window. For example,

  fcast --plot=fc.pdf

will generate a graphic in PDF format. Absolute pathnames are respected,
otherwise files are written to the gretl working directory.

The nature of the forecast standard errors (if available) depends on the
nature of the model and the forecast. For static linear models standard
errors are computed using the method outlined by Davidson and MacKinnon
(2004); they incorporate both uncertainty due to the error process and
parameter uncertainty (summarized in the covariance matrix of the parameter
estimates). For dynamic models, forecast standard errors are computed only
in the case of a dynamic forecast, and they do not incorporate parameter
uncertainty. For nonlinear models, forecast standard errors are not
presently available.

Menu path:  Model window, /Analysis/Forecasts

# flush Programming

This simple command (no arguments, no options) is intended for use in
time-consuming scripts that may be executed via the gretl GUI (it is ignored
by the command-line program), to give the user a visual indication that
things are moving along and gretl is not "frozen".

Ordinarily if you launch a script in the GUI no output is shown until its
execution is completed, but the effect of invoking flush is as follows:

    On the first invocation, gretl opens a window, displays the output so far,
    and appends the message "Processing...".

    On subsequent invocations the text shown in the output window is updated,
    and a new "processing" message is appended.

When execution of the script is completed any remaining output is
automatically flushed to the text window.

Please note, there is no point in using flush in scripts that take less than
(say) 5 seconds to execute. Also note that this command should not be used
at a point in the script where there is no further output to be printed, as
the "processing" message will then be misleading to the user.

The following illustrates the intended use of flush:

  set echo off
  scalar n = 10
  loop i=1..n
      # do some time-consuming operation
      loop 100 --quiet
          a = mnormal(200,200)
          b = inv(a)
      endloop
      # print some results
      printf "Iteration %2d done\n", i
      if i < n
          flush
      endif
  endloop

# foreign Programming

Syntax:     foreign language=lang
Options:    --send-data[=list] (pre-load data; see below)
            --quiet (suppress output from foreign program)

This command opens a special mode in which commands to be executed by
another program are accepted. You exit this mode with end foreign; at this
point the stacked commands are executed.

At present the "foreign" programs supported in this way are GNU R
(language=R), Python, Julia, GNU Octave (language=Octave), Jurgen Doornik's
Ox and Stata. Language names are recognized on a case-insensitive basis.

In connection with R, Octave and Stata the --send-data option has the effect
of making data from gretl's workspace available within the target program.
By default the entire dataset is sent, but you can limit the data to be sent
by giving the name of a predefined list of series. For example:

  list Rlist = x1 x2 x3
  foreign language=R --send-data=Rlist

See chapter 44 of the Gretl User's Guide for details and examples.
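
A complete block might look like the sketch below. It assumes that R is
installed and that x1, x2 and x3 are series in the current dataset; on the R
side the data sent by gretl are assumed to be available as a data frame
named gretldata, as described in the Gretl User's Guide.

  list Rlist = x1 x2 x3
  foreign language=R --send-data=Rlist
      # these lines are R code, executed at "end foreign"
      summary(gretldata)
  end foreign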

# fractint Statistics

Arguments:  series [ order ]
Options:    --gph (do Geweke and Porter-Hudak test)
            --all (do both tests)
            --quiet (don't print results)

Tests the specified series for fractional integration ("long memory"). The
null hypothesis is that the integration order of the series is zero. By
default the local Whittle estimator (Robinson, 1995) is used but if the
--gph option is given the GPH test (Geweke and Porter-Hudak, 1983) is
performed instead. If the --all flag is given then the results of both tests
are printed.

For details on this sort of test, see Phillips and Shimotsu (2004).

If the optional order argument is not given the order for the test(s) is set
automatically as the lesser of T/2 and T^0.6.

The estimated fractional integration orders and their standard errors are
available via the "$result" accessor. With the --all option, the Local
Whittle estimate will be in the first row and the GPH estimate in the second
one.

The results of the test can be retrieved using the accessors "$test" and
"$pvalue". These values are based on the Local Whittle Estimator unless the
--gph option is given.
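
As a minimal sketch, the following runs both tests on an artificial white
noise series and retrieves the estimates. The series y, the sample size and
the seed are arbitrary choices made for illustration.

  nulldata 512
  setobs 1 1 --time-series
  set seed 4321
  series y = normal()
  fractint y --all
  # row 1: Local Whittle; row 2: GPH (estimate and standard error)
  matrix r = $result
  print r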

Menu path:  /Variable/Unit root tests/Fractional integration

# freq Statistics

Argument:   var
Options:    --nbins=n (specify number of bins)
            --min=minval (specify minimum, see below)
            --binwidth=width (specify bin width, see below)
            --normal (test for the normal distribution)
            --gamma (test for gamma distribution)
            --silent (don't print anything)
            --matrix=name (use column of named matrix)
            --plot=mode-or-filename (see below)
            --quiet (suppress the plot)
Examples:   freq x
            freq x --normal
            freq x --nbins=5
            freq x --min=0 --binwidth=0.10

With no options given, displays the frequency distribution for the series
var (given by name or number), with the number of bins and their size chosen
automatically.

If the --matrix option is given, var (which must be an integer) is instead
interpreted as a 1-based index that selects a column from the named matrix.
If the matrix in question is in fact a column vector, the var argument may
be omitted.

To control the presentation of the distribution you may specify either the
number of bins or the minimum value plus the width of the bins, as shown in
the last two examples above. The --min option sets the lower limit of the
left-most bin.

If the --normal option is given, the Doornik-Hansen chi-square test for
normality is computed. If the --gamma option is given, the test for
normality is replaced by Locke's nonparametric test for the null hypothesis
that the variable follows the gamma distribution; see Locke (1976), Shapiro
and Chen (2001). Note that the parameterization of the gamma distribution
used in gretl is (shape, scale).

By default, if the program is not in batch mode a plot of the distribution
is shown. This can be adjusted via the --plot option.
The acceptable parameters to this option are none (to suppress the plot);
display (to display a plot even when in batch mode); or a file name. The
effect of providing a file name is as described for the --output option of
the "gnuplot" command.

The --silent flag suppresses the usual text output. This might be used in
conjunction with one or other of the distribution test options: the test
statistic and its p-value are recorded, and can be retrieved using the
accessors "$test" and "$pvalue". It might also be used along with the --plot
option if you just want a histogram and don't care to see the accompanying
text.

Note that gretl does not have a function that matches this command, but it
is possible to use the "aggregate" function to achieve the same purpose. In
addition, the frequency distribution constructed by freq can be obtained in
matrix form via the "$result" accessor.

Menu path:  /Variable/Frequency distribution

# funcerr Programming

Argument:   [ message ]

Applicable only in the context of a user-defined function (see "function").
Causes execution of the current function to terminate with an error
condition flagged. An exception is the special MPI mode for parallelized
program execution, where only the associated string is printed.

The optional message argument can take the form of a string literal or the
name of a string variable; if present it is printed as part of the error
message shown to the caller of the function.

See also the closely related function, "errorif".

# function Programming

Argument:   fnname

Opens a block of statements in which a function is defined. This block must
be closed with end function. (An exception is the case where a user-defined
function is to be deleted, which is achieved by the single command line
function foo delete for a function named "foo".)
See chapter 14 of the Gretl User's Guide for details.

# garch Estimation

Arguments:  p q ; depvar [ indepvars ]
Options:    --robust (robust standard errors)
            --verbose (print details of iterations)
            --quiet (don't print anything)
            --vcv (print covariance matrix)
            --nc (do not include a constant)
            --stdresid (standardize the residuals)
            --fcp (use Fiorentini, Calzolari, Panattoni algorithm)
            --arma-init (initial variance parameters from ARMA)
Examples:   garch 1 1 ; y
            garch 1 1 ; y 0 x1 x2 --robust
            See also garch.inp, sw_ch14.inp

Estimates a GARCH model (GARCH = Generalized Autoregressive Conditional
Heteroskedasticity), either a univariate model or, if indepvars are
specified, including the given exogenous variables. The integer values p and
q (which may be given in numerical form or as the names of pre-existing
scalar variables) represent the lag orders in the conditional variance
equation:

  h(t) = a(0) + sum(i=1 to q) a(i)*u(t-i)^2 + sum(j=1 to p) b(j)*h(t-j)

The parameter p therefore represents the Generalized (or "AR") order, while
q represents the regular ARCH (or "MA") order. If p is non-zero, q must also
be non-zero otherwise the model is unidentified. However, you can estimate a
regular ARCH model by setting q to a positive value and p to zero. The sum
of p and q must be no greater than 5. Note that a constant is automatically
included in the mean equation unless the --nc option is given.

By default native gretl code is used in estimation of GARCH models, but you
also have the option of using the algorithm of Fiorentini, Calzolari and
Panattoni (1996). The former uses the BFGS maximizer while the latter uses
the information matrix to maximize the likelihood, with fine-tuning via the
Hessian.

Several variant estimators of the covariance matrix are available with this
command.
By default, the Hessian is used unless the --robust option is
given, in which case the QML (White) covariance matrix is used. Other
possibilities (e.g. the information matrix, or the Bollerslev-Wooldridge
estimator) can be specified using the "set" command.

By default, the estimates of the variance parameters are initialized using
the unconditional error variance from initial OLS estimation for the
constant, and small positive values for the coefficients on the past values
of the squared error and the error variance. The flag --arma-init calls for
the starting values of these parameters to be set using an initial ARMA
model, exploiting the relationship between GARCH and ARMA set out in Chapter
21 of Hamilton's Time Series Analysis. In some cases this may improve the
chances of convergence.

The GARCH residuals and estimated conditional variance can be retrieved as
$uhat and $h respectively. For example, to get the conditional variance:

  series ht = $h

If the --stdresid option is given, the $uhat values are divided by the
square root of h_t.

Menu path:  /Model/Univariate time series/GARCH

# genr Dataset

Arguments:  newvar = formula

NOTE: this command has undergone numerous changes and enhancements since the
following help text was written, so for comprehensive and updated info on
this command you'll want to refer to chapter 10 of the Gretl User's Guide.
On the other hand, this help does not contain anything actually erroneous,
so take the following as "you have this, plus more".

In the appropriate context, series, scalar, matrix, string and bundle are
synonyms for this command.

Creates new variables, often via transformations of existing variables. See
also "diff", "logs", "lags", "ldiff", "sdiff" and "square" for shortcuts.
In the context of a genr formula, existing variables must be referenced by
name, not ID number. The formula should be a well-formed combination of
variable names, constants, operators and functions (described below). Note
that further details on some aspects of this command can be found in chapter
10 of the Gretl User's Guide.

A genr command may yield either a series or a scalar result. For example,
the formula x2 = x * 2 naturally yields a series if the variable x is a
series and a scalar if x is a scalar. The formulae x = 0 and mx = mean(x)
naturally return scalars. Under some circumstances you may want to have a
scalar result expanded into a series or vector. You can do this by using
series as an "alias" for the genr command. For example, series x = 0
produces a series all of whose values are set to 0. You can also use scalar
as an alias for genr. It is not possible to coerce a vector result into a
scalar, but use of this keyword indicates that the result should be a
scalar: if it is not, an error occurs.

When a formula yields a series result, the range over which the result is
written to the target variable depends on the current sample setting. It is
possible, therefore, to define a series piecewise using the smpl command in
conjunction with genr.

Supported arithmetical operators are, in order of precedence: ^
(exponentiation); *, / and % (modulus or remainder); + and -.

The available Boolean operators are (again, in order of precedence): !
(negation), && (logical AND), || (logical OR), >, <, == (is equal to), >=
(greater than or equal), <= (less than or equal) and != (not equal). The
Boolean operators can be used in constructing dummy variables: for instance
(x > 10) returns 1 if x > 10, 0 otherwise.

Built-in constants are pi and NA.
The latter is the missing value code: you
can initialize a variable to the missing value with scalar x = NA.

The genr command supports a wide range of mathematical and statistical
functions, including all the common ones plus several that are special to
econometrics. In addition it offers access to numerous internal variables
that are defined in the course of running regressions, doing hypothesis
tests, and so on. For a listing of functions and accessors, type "help
functions".

Besides the operators and functions noted above there are some special uses
of "genr":

    "genr time" creates a time trend variable (1,2,3,...) called "time". "genr
    index" does the same thing except that the variable is called index.

    "genr dummy" creates dummy variables up to the periodicity of the data. In
    the case of quarterly data (periodicity 4), the program creates dq1 = 1
    for first quarter and 0 in other quarters, dq2 = 1 for the second quarter
    and 0 in other quarters, and so on. With monthly data the dummies are
    named dm1, dm2, and so on. With other frequencies the names are dummy_1,
    dummy_2, etc.

    "genr unitdum" and "genr timedum" create sets of special dummy variables
    for use with panel data. The first codes for the cross-sectional units and
    the second for the time period of the observations.

Note: In the command-line program, "genr" commands that retrieve
model-related data always reference the model that was estimated most
recently. This is also true in the GUI program, if one uses "genr" in the
"gretl console" or enters a formula using the "Define new variable" option
under the Add menu in the main window. With the GUI, however, you have the
option of retrieving data from any model currently displayed in a window
(whether or not it's the most recent model). You do this under the "Save"
menu in the model's window.
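
The special uses of genr listed above can be tried on an artificial
quarterly dataset, as in this sketch (the dataset size and starting date are
arbitrary):

  nulldata 8
  setobs 4 2020:1 --time-series
  genr time       # trend: 1, 2, 3, ...
  genr dummy      # quarterly dummies dq1, dq2, dq3, dq4
  print time dq1 dq2 dq3 dq4 --byobs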

The special variable obs serves as an index of the observations. For
instance series dum = (obs==15) will generate a dummy variable that has
value 1 for observation 15, 0 otherwise. You can also use this variable to
pick out particular observations by date or name. For example, series d =
(obs>1986:4), series d = (obs>"2008-04-01"), or series d = (obs=="CA"). If
daily dates or observation labels are used in this context, they should be
enclosed in double quotes. Quarterly and monthly dates (with a colon) may be
used unquoted. Note that in the case of annual time series data, the year is
not distinguishable syntactically from a plain integer; therefore if you
wish to compare observations against obs by year you must use the function
obsnum to convert the year to a 1-based index value, as in series d =
(obs>obsnum(1986)).

Scalar values can be pulled from a series in the context of a genr formula,
using the syntax varname[obs]. The obs value can be given by number or date.
Examples: x[5], CPI[1996:01]. For daily data, the form YYYY-MM-DD should be
used, e.g. ibm[1970-01-23].

An individual observation in a series can be modified via genr. To do this,
a valid observation number or date, in square brackets, must be appended to
the name of the variable on the left-hand side of the formula. For example,
genr x[3] = 30 or genr x[1950:04] = 303.7.
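
The use of obs and of single-observation access can be sketched on a small
artificial dataset (the names x, d and v are arbitrary):

  nulldata 10
  series x = index^2
  series d = (obs == 3)    # 1 at observation 3, 0 elsewhere
  scalar v = x[5]          # read one observation: 5^2 = 25
  genr x[3] = 30           # overwrite one observation
  print x d --byobs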

  Formula                  Comment
  -------                  -------
  y = x1^3                 x1 cubed
  y = ln((x1+x2)/x3)
  z = x>y                  z(t) = 1 if x(t) > y(t), otherwise 0
  y = x(-2)                x lagged 2 periods
  y = x(+2)                x led 2 periods
  y = diff(x)              y(t) = x(t) - x(t-1)
  y = ldiff(x)             y(t) = log x(t) - log x(t-1), the instantaneous rate
                           of growth of x
  y = sort(x)              sorts x in increasing order and stores in y
  y = dsort(x)             sort x in decreasing order
  y = int(x)               truncate x and store its integer value as y
  y = abs(x)               store the absolute values of x
  y = sum(x)               sum x values excluding missing NA entries
  y = cum(x)               cumulation: y(t) = the sum from s=1 to s=t of x(s)
  aa = $ess                set aa equal to the Error Sum of Squares from last
                           regression
  x = $coeff(sqft)         grab the estimated coefficient on the variable sqft
                           from the last regression
  rho4 = $rho(4)           grab the 4th-order autoregressive coefficient from
                           the last model (presumes an ar model)
  cvx1x2 = $vcv(x1, x2)    grab the estimated coefficient covariance of vars x1
                           and x2 from the last model
  foo = uniform()          uniform pseudo-random variable in range 0-1
  bar = 3 * normal()       normal pseudo-random variable, mu = 0, sigma = 3
  samp = ok(x)             = 1 for observations where x is not missing.

Menu path:  /Add/Define new variable
Other access: Main window pop-up menu

# gmm Estimation

Options:    --two-step (two step estimation)
            --iterate (iterated GMM)
            --vcv (print covariance matrix)
            --verbose (print details of iterations)
            --quiet (don't print anything)
            --lbfgs (use L-BFGS-B instead of regular BFGS)
Examples:   hall_cbapm.inp

Performs Generalized Method of Moments (GMM) estimation using the BFGS
(Broyden, Fletcher, Goldfarb, Shanno) algorithm.
You must specify one or more commands for updating the relevant quantities
(typically GMM residuals), one or more sets of orthogonality conditions, an
initial matrix of weights, and a listing of the parameters to be estimated,
all enclosed between the tags gmm and end gmm. Any options should be
appended to the end gmm line.

Please see chapter 27 of the Gretl User's Guide for details on this command.
Here we just illustrate with a simple example.

  gmm e = y - X*b
    orthog e ; W
    weights V
    params b
  end gmm

In the example above we assume that y and X are data matrices, b is an
appropriately sized vector of parameter values, W is a matrix of
instruments, and V is a suitable matrix of weights. The statement

  orthog e ; W

indicates that the residual vector e is in principle orthogonal to each of
the instruments composing the columns of W.

Parameter names

In estimating a nonlinear model it is often convenient to name the
parameters tersely. In printing the results, however, it may be desirable to
use more informative labels. This can be achieved via the additional keyword
param_names within the command block. For a model with k parameters the
argument following this keyword should be a double-quoted string literal
holding k space-separated names, the name of a string variable that holds k
such names, or the name of an array of k strings.
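
The following is a minimal sketch of a complete gmm block that uses
param_names. It is not taken from the User's Guide: the data are simulated,
and the names x, y, e, a, b, Z and V, as well as the labels "intercept" and
"slope", are assumptions chosen for illustration. The model is a simple
linear regression written as an exactly identified GMM problem, so the
estimates should be close to those from OLS.

  nulldata 200
  set seed 1234
  series x = normal()
  series y = 1 + 2*x + normal()
  # starting values for the residual and the two parameters
  series e = 0
  scalar a = 0
  scalar b = 0
  # instruments and initial weights matrix
  list Z = const x
  matrix V = I(2)
  gmm e = y - a - b*x
    orthog e ; Z
    weights V
    params a b
    param_names "intercept slope"
  end gmm

Here the double-quoted string supplies k = 2 names, one for each of the
parameters listed on the params line.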

Menu path:    /Model/Instrumental variables/GMM

# gnuplot Graphs

Arguments:  yvars xvar [ dumvar ]
Options:    --with-lines[=varspec] (use lines, not points)
            --with-lp[=varspec] (use lines and points)
            --with-impulses[=varspec] (use vertical lines)
            --with-steps[=varspec] (use perpendicular line segments)
            --time-series (plot against time)
            --single-yaxis (force use of just one y-axis)
            --ylogscale[=base] (use log scale for vertical axis)
            --dummy (see below)
            --fit=fitspec (see below)
            --font=fontspec (see below)
            --band=bandspec (see below)
            --band-style=style (see below)
            --matrix=name (plot columns of named matrix)
            --output=filename (send output to specified file)
            --input=filename (take input from specified file)
Examples:   gnuplot y1 y2 x
            gnuplot x --time-series --with-lines
            gnuplot wages educ gender --dummy
            gnuplot y x --fit=quadratic
            gnuplot y1 y2 x --with-lines=y2

The variables in the list yvars are graphed against xvar. For a time series
plot you may either give time as xvar or use the option flag --time-series.
See also the "plot" and "panplot" commands.

By default, data-points are shown as points; this can be overridden by
giving one of the options --with-lines, --with-lp, --with-impulses or
--with-steps. If more than one variable is to be plotted on the y axis, the
effect of these options may be confined to a subset of the variables by
using the varspec parameter. This should take the form of a comma-separated
listing of the names or numbers of the variables to be plotted with lines or
impulses respectively. For instance, the final example above shows how to
plot y1 and y2 against x, such that y2 is represented by a line but y1 by
points.

If the --dummy option is selected, exactly three variables should be given:
a single y variable, an x variable, and dumvar, a discrete variable.
The effect is to plot yvar against xvar with the points shown in different
colors depending on the value of dumvar at the given observation.

You can choose the scale for the y axis to be logarithmic rather than linear
by using the --ylogscale option, together with a base parameter. For
example,

  gnuplot y x --ylogscale=2

plots the data such that the vertical axis is expressed as powers of 2. If
the base is omitted, it defaults to 10.

Taking data from a matrix

Generally, the arguments yvars and xvar are required, and refer to series in
the current dataset (given either by name or ID number). But if a named
matrix is supplied via the --matrix option these arguments become optional:
if the specified matrix has k columns, by default the first k - 1 columns
are treated as the yvars and the last column as xvar. If the --time-series
option is given, however, all k columns are plotted against time. If you
wish to plot selected columns of the matrix, you should specify yvars and
xvar in the form of 1-based column numbers. For example if you want a
scatterplot of column 2 of matrix M against column 1, you can do:

  gnuplot 2 1 --matrix=M

Showing a line of best fit

The --fit option is applicable only for bivariate scatterplots and single
time-series plots. The default behavior for a scatterplot is to show the OLS
fit if the slope coefficient is significant at the 10 percent level, while
the default behavior for time-series is not to show any fitted line. You can
call for different behavior by using this option along with one of the
following fitspec parameter values. Note that if the plot is a single time
series the place of x is taken by time.

  linear: show the OLS fit regardless of its level of statistical
  significance.

  none: don't show any fitted line.

  inverse, quadratic, cubic, semilog or linlog: show a fitted line based on
  a regression of the specified type. By semilog, we mean a regression of
  log y on x; the fitted line represents the conditional expectation of y,
  obtained by exponentiation. By linlog we mean a regression of y on the log
  of x.

  loess: show the fit from a robust locally weighted regression (also
  sometimes known as "lowess").

Plotting a band

The --band option can be used for plotting zero or more series along with a
"band" of some sort (typically representing a confidence interval). This
option requires two comma-separated parameters: the name or ID number of a
series representing the center of the band, and the name or ID of a series
giving the width of the band: the effect is to draw a band with y
coordinates equal to center minus width and center plus width. An optional
third parameter (again, comma-separated) can be used to give a multiplier
for the width dimension, in the form of a numerical constant or the name of
a scalar variable. For example, the following command plots y along with a
band of plus or minus 1.96 times se_y:

  gnuplot y --time-series --band=y,se_y,1.96 --with-lines

When the --band option is given, the companion option --band-style can be
used to control the band's representation. By default the upper and lower
limits are shown as solid lines, but the parameters fill, dash, bars or step
cause the band to be drawn as a shaded area, using dashed lines, using error
bars or using steps, respectively. In addition a color specification can be
appended (following a comma) or substituted. Here are some style examples:

  gnuplot ... --band-style=fill
  gnuplot ... --band-style=dash,0xbbddff
  gnuplot ... --band-style=,black
  gnuplot ... --band-style=bars,blue

The first example produces a shaded area in the default color; the second
switches to dashed lines with a specified blue-gray color; the third uses
solid black lines; and the last shows blue bars. Note that colors can be
given as either hexadecimal RGB values or by name; you can access the list
of color-names recognized by gnuplot by issuing the command "show
colornames" in gnuplot itself, or in the gretl console by doing

  eval readfile("@gretldir/data/gnuplot/gpcolors.txt")

Recession bars

The "band" options described above can also be used to add "recession bars"
to a plot. By this we mean vertical bars occupying the full y-dimension of
the plot and indicating the presence (bar) or absence (no bar) of some
qualitative feature in a time-series plot. Such bars are commonly used to
flag periods of recession; they could also be used to indicate periods of
war, or anything that can be coded in a 0/1 dummy variable.

In this context the --band option requires a single parameter: the
identifier of a series with values 0 and 1, where 1 indicates "on" and 0
"off". The --band-style option may be used to specify a color for the bars,
given in hexadecimal form or as the name of a color known to gnuplot (see
the previous section). An example showing a single bar is given below:

  open AWM17 --quiet
  series dum = obs >= 1990:1 && obs <= 1994:2
  gnuplot YER URX --with-lines --time-series \
    --band=dum --band-style=0xcccccc --output=display \
    {set key top left;}

Controlling the output

In interactive mode the plot is displayed immediately. In batch mode the
default behavior is that a gnuplot command file is written in the user's
working directory, with a name on the pattern gpttmpN.plt, starting with N =
01. The actual plots may be generated later using gnuplot (under MS Windows,
wgnuplot).
This behavior can be modified by use of the --output=filename option. This
option controls the filename used, and at the same time allows you to
specify a particular output format via the three-letter extension of the
file name, as follows: .eps results in the production of an Encapsulated
PostScript (EPS) file; .pdf produces PDF; .png produces PNG format, .emf
calls for EMF (Enhanced MetaFile), .fig calls for an Xfig file, and .svg for
SVG (Scalable Vector Graphics). If the dummy filename "display" is given
then the plot is shown on screen as in interactive mode. If a filename with
any extension other than those just mentioned is given, a gnuplot command
file is written.

Specifying a font

The --font option can be used to specify a particular font for the plot. The
fontspec parameter should take the form of the name of a font, optionally
followed by a size in points separated from the name by a comma or space,
all wrapped in double quotes, as in

  --font="serif,12"

Note that the fonts available to gnuplot will vary by platform, and if
you're writing a plot command that is intended to be portable it is best to
restrict the font name to the generic sans or serif.

Adding gnuplot commands

A further option to this command is available: following the specification
of the variables to be plotted and the option flag (if any), you may add
literal gnuplot commands to control the appearance of the plot (for example,
setting the plot title and/or the axis ranges). These commands should be
enclosed in braces, and each gnuplot command must be terminated with a
semi-colon. A backslash may be used to continue a set of gnuplot commands
over more than one line.
Here is an example of the syntax:

  { set title 'My Title'; set yrange [0:1000]; }

Menu path:    /View/Graph specified vars
Other access: Main window pop-up menu, graph button on toolbar

# graphpg Graphs

Variants:   graphpg add
            graphpg fontscale value
            graphpg show
            graphpg free
            graphpg --output=filename

The session "graph page" will work only if you have the LaTeX typesetting
system installed, and are able to generate and view PDF or PostScript
output.

In the session icon window, you can drag up to eight graphs onto the graph
page icon. When you double-click on the graph page (or right-click and
select "Display"), a page containing the selected graphs will be composed
and opened in a suitable viewer. From there you should be able to print the
page.

To clear the graph page, right-click on its icon and select "Clear".

Note that on systems other than MS Windows, you may have to adjust the
setting for the program used to view PDF or PostScript files. Find that
under the "Programs" tab in the gretl Preferences dialog box (under the
Tools menu in the main window).

It's also possible to operate on the graph page via script, or using the
console (in the GUI program). The following commands and options are
supported:

To add a graph to the graph page, issue the command graphpg add after saving
a named graph, as in

  grf1 <- gnuplot Y X
  graphpg add

To display the graph page: graphpg show.

To clear the graph page: graphpg free.

To adjust the scale of the font used in the graph page, use graphpg
fontscale scale, where scale is a multiplier (with a default of 1.0).
Thus to make the font size 50 percent bigger than the default you can do

  graphpg fontscale 1.5

To call for printing of the graph page to file, use the flag --output= plus
a filename; the filename should have the suffix ".pdf", ".ps" or ".eps". For
example:

  graphpg --output="myfile.pdf"

The output file will be written in the currently set "workdir", unless the
filename string contains a full path specification.

In this context the output uses colored lines by default; to use dot/dash
patterns instead of colors you can append the --monochrome flag.

# heckit Estimation

Arguments:  depvar indepvars ; selection equation
Options:    --quiet (suppress printing of results)
            --two-step (perform two-step estimation)
            --vcv (print covariance matrix)
            --opg (OPG standard errors)
            --robust (QML standard errors)
            --cluster=clustvar (see "logit" for explanation)
            --verbose (print extra output)
Examples:   heckit y 0 x1 x2 ; ys 0 x3 x4
            See also heckit.inp

Heckman-type selection model. In the specification, the list before the
semicolon represents the outcome equation, and the second list represents
the selection equation. The dependent variable in the selection equation (ys
in the example above) must be a binary variable.

By default, the parameters are estimated by maximum likelihood. The
covariance matrix of the parameters is computed using the negative inverse
of the Hessian. If two-step estimation is desired, use the --two-step
option. In this case, the covariance matrix of the parameters of the outcome
equation is appropriately adjusted as per Heckman (1979).
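
As a rough sketch of how the two estimators might be compared, the commands
below simulate a small selection model and estimate it both ways. The data
generating process, the seed and all series names (x1, z1, u, v, ys, y) are
assumptions made purely for illustration.

  nulldata 500
  set seed 101
  series x1 = normal()
  series z1 = normal()
  # correlated disturbances for the outcome and selection equations
  series u = normal()
  series v = 0.5*u + normal()
  # binary selection indicator; y is observed only where ys equals 1
  series ys = (0.5 + z1 + 0.5*x1 + v > 0)
  series y = ys ? 1 + 2*x1 + u : NA
  # two-step estimation, then the default maximum likelihood
  heckit y const x1 ; ys const x1 z1 --two-step
  heckit y const x1 ; ys const x1 z1

Note that the selection equation includes a variable (z1) that is excluded
from the outcome equation, a common choice to aid identification.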

Menu path:    /Model/Limited dependent variable/Heckit

# help Utilities

Variants:   help
            help functions
            help command
            help function
Option:     --func (select functions help)

If no arguments are given, prints a list of available commands. If the
single argument "functions" is given, prints a list of available functions
(see "genr").

help command describes command (e.g. help smpl). help function describes
function (e.g. help ldet). Some functions have the same names as related
commands (e.g. diff): in that case the default is to print help for the
command, but you can get help on the function by using the --func option.

Menu path:    /Help

# hfplot Graphs

Arguments:  hflist [ ; lflist ]
Options:    --with-lines (plot with lines)
            --time-series (put time on x-axis)
            --output=filename (send output to specified file)

Provides a means of plotting a high-frequency series, possibly along with
one or more series observed at the base frequency of the dataset. The first
argument should be a "MIDAS list"; the optional additional lflist terms,
following a semicolon, should be regular ("low-frequency") series.

For details on the effect of the --output option, please see the "gnuplot"
command.

# hsk Estimation

Arguments:  depvar indepvars
Options:    --no-squares (see below)
            --vcv (print covariance matrix)
            --quiet (don't print anything)

This command is applicable where heteroskedasticity is present in the form
of an unknown function of the regressors which can be approximated by a
quadratic relationship. In that context it offers the possibility of
consistent standard errors and more efficient parameter estimates as
compared with OLS.
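
The three steps that hsk automates (spelled out in the paragraphs that
follow) can be sketched by hand, which may help in understanding the output.
This is an illustration under assumed, simulated data; the names x, y, lu2,
x2 and w are hypothetical, and the manual version is expected to reproduce
the hsk coefficient estimates only up to implementation details.

  nulldata 200
  set seed 42
  series x = uniform()
  # error standard deviation increases with x
  series y = 1 + 2*x + (0.5 + x)*normal()
  # the built-in command
  hsk y const x
  # (a) OLS on the model of interest
  ols y const x --quiet
  # (b) auxiliary regression for the log of the squared residuals
  series lu2 = log($uhat^2)
  series x2 = x^2
  ols lu2 const x x2 --quiet
  # (c) weighted least squares with weight 1/exp(fitted value)
  series w = 1/exp($yhat)
  wls w y const x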

The procedure involves (a) OLS estimation of the model of interest, followed
by (b) an auxiliary regression to generate an estimate of the error
variance, then finally (c) weighted least squares, using as weight the
reciprocal of the estimated variance.

In the auxiliary regression (b) we regress the log of the squared residuals
from the first OLS on the original regressors and their squares (by
default), or just on the original regressors (if the --no-squares option is
given). The log transformation is performed to ensure that the estimated
variances are all non-negative. Call the fitted values from this regression
u^*. The weight series for the final WLS is then formed as 1/exp(u^*).

Menu path:    /Model/Other linear models/Heteroskedasticity corrected

# hurst Statistics

Argument:   series
Option:     --plot=mode-or-filename (see below)

Calculates the Hurst exponent (a measure of persistence or long memory) for
a time-series variable having at least 128 observations. The result
(together with its standard error) can be retrieved via the "$result"
accessor.

The Hurst exponent is discussed by Mandelbrot (1983). In theoretical terms
it is the exponent, H, in the relationship

  RS(x) = an^H

where RS is the "rescaled range" of the variable x in samples of size n and
a is a constant. The rescaled range is the range (maximum minus minimum) of
the cumulated value or partial sum of x over the sample period (after
subtraction of the sample mean), divided by the sample standard deviation.

As a reference point, if x is white noise (zero mean, zero persistence) then
the range of its cumulated "wandering" (which forms a random walk), scaled
by the standard deviation, grows as the square root of the sample size,
giving an expected Hurst exponent of 0.5.
Values of the exponent significantly in excess of 0.5 indicate persistence,
and values less than 0.5 indicate anti-persistence (negative
autocorrelation). In principle the exponent is bounded by 0 and 1, although
in finite samples it is possible to get an estimated exponent greater than
1.

In gretl, the exponent is estimated using binary sub-sampling: we start with
the entire data range, then the two halves of the range, then the four
quarters, and so on. For sample sizes smaller than the data range, the RS
value is the mean across the available samples. The exponent is then
estimated as the slope coefficient in a regression of the log of RS on the
log of sample size.

By default, if the program is not in batch mode a plot of the rescaled range
is shown. This can be adjusted via the --plot option. The acceptable
parameters to this option are none (to suppress the plot); display (to
display a plot even when in batch mode); or a file name. The effect of
providing a file name is as described for the --output option of the
"gnuplot" command.

Menu path:    /Variable/Hurst exponent

# if Programming

Flow control for command execution. Three sorts of construction are
supported, as follows.

  # simple form
  if condition
      commands
  endif

  # two branches
  if condition
      commands1
  else
      commands2
  endif

  # three or more branches
  if condition1
      commands1
  elif condition2
      commands2
  else
      commands3
  endif

"condition" must be a Boolean expression, for the syntax of which see
"genr". More than one "elif" block may be included. In addition, if ...
endif blocks may be nested.
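
A concrete sketch of nesting, with one if ... endif block placed inside an
elif branch, is shown below. The scalar x and the messages printed are
arbitrary choices made for illustration.

  scalar x = 7
  if x > 10
      printf "x is large\n"
  elif x > 5
      # nested block: evaluated only when 5 < x <= 10
      if x % 2 == 1
          printf "x is of medium size and odd\n"
      else
          printf "x is of medium size and even\n"
      endif
  else
      printf "x is small\n"
  endif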

# include Programming

Argument:   filename
Option:     --force (force re-reading from file)
Examples:   include myfile.inp
            include sols.gfn

Intended for use in a command script, primarily for including definitions of
functions. filename should have the extension inp (a plain-text script) or
gfn (a gretl function package). The commands in filename are executed then
control is returned to the main script.

The --force option is specific to gfn files: its effect is to force gretl to
re-read the function package from file even if it is already loaded into
memory. (Plain inp files are always read and processed in response to this
command.)

See also "run".

# info Dataset

Prints out any supplementary information stored with the current datafile.

Menu path:    /Data/Dataset info
Other access: Data browser windows

# intreg Estimation

Arguments:  minvar maxvar indepvars
Options:    --quiet (suppress printing of results)
            --verbose (print details of iterations)
            --robust (robust standard errors)
            --opg (see below)
            --cluster=clustvar (see "logit" for explanation)
Examples:   intreg lo hi const x1 x2
            See also wtp.inp

Estimates an interval regression model. This model arises when the dependent
variable is imperfectly observed for some (possibly all) observations. In
other words, the data generating process is assumed to be

  y* = x b + u

but we only observe m <= y* <= M (the interval may be left- or
right-unbounded). Note that for some observations m may equal M. The
variables minvar and maxvar must contain NAs for left- and right-unbounded
observations, respectively.

The model is estimated by maximum likelihood, assuming normality of the
disturbance term.

By default, standard errors are computed using the negative inverse of the
Hessian.
If the --robust flag is given, then QML or Huber-White standard errors are
calculated instead. In this case the estimated covariance matrix is a
"sandwich" of the inverse of the estimated Hessian and the outer product of
the gradient. Alternatively, the --opg option can be given, in which case
standard errors are based on the outer product of the gradient alone.

Menu path:    /Model/Limited dependent variable/Interval regression

# johansen Tests

Arguments:  order ylist [ ; xlist ] [ ; rxlist ]
Options:    --nc (no constant)
            --rc (restricted constant)
            --uc (unrestricted constant)
            --crt (constant and restricted trend)
            --ct (constant and unrestricted trend)
            --seasonals (include centered seasonal dummies)
            --asy (record asymptotic p-values)
            --quiet (print just the tests)
            --silent (don't print anything)
            --verbose (print details of auxiliary regressions)
Examples:   johansen 2 y x
            johansen 4 y x1 x2 --verbose
            johansen 3 y x1 x2 --rc
            See also hamilton.inp, denmark.inp

Carries out the Johansen test for cointegration among the variables in ylist
for the given lag order. For details of this test see chapter 33 of the
Gretl User's Guide or Hamilton (1994), Chapter 20. P-values are computed via
Doornik's gamma approximation (Doornik, 1998). Two sets of p-values are
shown for the trace test, straight asymptotic values and values adjusted for
the sample size. By default the "$pvalue" accessor gets the adjusted
variant, but the --asy flag may be used to record the asymptotic values
instead.

The inclusion of deterministic terms in the model is controlled by the
option flags. The default if no option is specified is to include an
"unrestricted constant", which allows for the presence of a non-zero
intercept in the cointegrating relations as well as a trend in the levels of
the endogenous variables.
In the literature stemming from the work of Johansen (see for example his
1995 book) this is often referred to as "case 3", and it can also be
requested explicitly via --uc. The options --nc, --rc, --crt and --ct, which
are mutually exclusive, produce cases 1, 2, 4 and 5 respectively. The
meaning of these cases and the criteria for selecting a case are explained
in chapter 33 of the Gretl User's Guide.

The optional lists xlist and rxlist allow you to control for specified
exogenous variables: these enter the system either unrestrictedly (xlist) or
restricted to the cointegration space (rxlist). These lists are separated
from ylist and from each other by semicolons.

The --seasonals option, which may be combined with any of the other options,
specifies the inclusion of a set of centered seasonal dummy variables. This
option is available only for quarterly or monthly data.

The following table is offered as a guide to the interpretation of the
results shown for the test, for the 3-variable case. H0 denotes the null
hypothesis, H1 the alternative hypothesis, and c the number of cointegrating
relations.

  Rank      Trace test         Lmax test
            H0     H1          H0     H1
  ---------------------------------------
   0        c = 0  c = 3       c = 0  c = 1
   1        c = 1  c = 3       c = 1  c = 2
   2        c = 2  c = 3       c = 2  c = 3
  ---------------------------------------

See also the "vecm" command, and "coint" if you want the Engle-Granger
cointegration test.
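
To see output matching the 3-variable table above, one possibility is to
simulate a system with a known structure. The sketch below is an assumption
for illustration only (it is not one of the sample scripts): y1 and y2 share
a common stochastic trend w, while y3 is an independent random walk, so a
single cointegrating relation is expected.

  nulldata 200
  setobs 1 1 --time-series
  set seed 7
  # common stochastic trend
  series w = cum(normal())
  series y1 = w + normal()
  series y2 = 0.5*w + normal()
  # an unrelated random walk
  series y3 = cum(normal())
  johansen 2 y1 y2 y3 --quiet

With --quiet only the trace and Lmax tests are printed, one row per rank.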

Menu path:    /Model/Multivariate time series

# join Dataset

Arguments:  filename varname
Options:    --data=column-name (see below)
            --filter=expression (see below)
            --ikey=inner-key (see below)
            --okey=outer-key (see below)
            --aggr=method (see below)
            --tkey=column-name,format-string (see below)
            --verbose (report on progress)

This command imports one or more data series from the source filename (which
must be either a delimited text data file or a "native" gretl data file)
under the name varname. For details please see chapter 7 of the Gretl User's
Guide; here we just give a brief summary of the available options. See also
"append" for simpler joining operations.

The --data option can be used to specify the column heading of the data in
the source file, if this differs from the name by which the data should be
known in gretl.

The --filter option can be used to specify a criterion for filtering the
source data (that is, selecting a subset of observations).

The --ikey and --okey options can be used to specify a mapping between
observations in the current dataset and observations in the source data (for
example, individuals can be matched against the household to which they
belong).

The --aggr option is used when the mapping between observations in the
current dataset and the source is not one-to-one.

The --tkey option is applicable only when the current dataset has a
time-series structure. It can be used to specify the name of a column
containing dates to be matched to the dataset and/or the format in which
dates are represented in that column.

Importing more than one series at once

The "join" command can handle the importation of several series at once.
This happens when (a) the varname argument is a space-separated list of
names rather than a single name, or (b) when it points to an array of
strings: the elements of this array should be the names of the series to
import.

This method has some limitations, however: the --data option is not
available. When importing multiple series you are obliged to accept their
"outer" names. The other options apply uniformly to all the series imported
via a given command.

# kpss Tests

Arguments:  order varlist
Options:    --trend (include a trend)
            --seasonals (include seasonal dummies)
            --verbose (print regression results)
            --quiet (suppress printing of results)
            --difference (use first difference of variable)
Examples:   kpss 8 y
            kpss 4 x1 --trend

For use of this command with panel data please see the final section in this
entry.

Computes the KPSS test (Kwiatkowski et al, Journal of Econometrics, 1992)
for stationarity, for each of the specified variables (or their first
difference, if the --difference option is selected). The null hypothesis is
that the variable in question is stationary, either around a level or, if
the --trend option is given, around a deterministic linear trend.

The order argument determines the size of the window used for Bartlett
smoothing. If a negative value is given this is taken as a signal to use an
automatic window size of 4(T/100)^0.25, where T is the sample size.

If the --verbose option is chosen the results of the auxiliary regression
are printed, along with the estimated variance of the random walk component
of the variable.

The critical values shown for the test statistic are based on response
surfaces estimated in the manner set out by Sephton (Economics Letters,
1995), which are more accurate for small samples than the values given in
the original KPSS article.
When the test statistic lies between the 10 percent and 1 percent critical
values a p-value is shown; this is obtained by linear interpolation and
should not be taken too literally. See the "kpsscrit" function for a means
of obtaining these critical values programmatically.

Panel data

When the kpss command is used with panel data, to produce a panel unit root
test, the applicable options and the results shown are somewhat different.
While you may give a list of variables for testing in the regular
time-series case, with panel data only one variable may be tested per
command. And the --verbose option has a different meaning: it produces a
brief account of the test for each individual time series (the default being
to show only the overall result).

When possible, the overall test (null hypothesis: the series in question is
stationary for all the panel units) is calculated using the method of Choi
(Journal of International Money and Finance, 2001). This is not always
straightforward, the difficulty being that while the Choi test is based on
the p-values of the tests on the individual series, we do not currently have
a means of calculating p-values for the KPSS test statistic; we must rely on
a few critical values.

If the test statistic for a given series falls between the 10 percent and 1
percent critical values, we are able to interpolate a p-value. But if the
test falls short of the 10 percent value, or exceeds the 1 percent value, we
cannot interpolate and can at best place a bound on the global Choi test. If
the individual test statistic falls short of the 10 percent value for some
units but exceeds the 1 percent value for others, we cannot even compute a
bound for the global test.
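
Returning to the regular time-series case, the sketch below shows the
automatic window size formula given earlier in this entry, evaluated for a
simulated sample, followed by two calls to the command. The sample size,
seed and series names are assumptions for illustration; the value printed is
the raw result of the formula, before any conversion to an integer that the
program may apply.

  nulldata 150
  setobs 4 1980:1 --time-series
  set seed 99
  # a stationary series and a random walk
  series y = normal()
  series rw = cum(normal())
  scalar T = $nobs
  printf "4(T/100)^0.25 = %g\n", 4*(T/100)^0.25
  # negative order: use the automatic window size
  kpss -1 y rw
  # fixed window of 4, with a trend under the null
  kpss 4 rw --trend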

Menu path:    /Variable/Unit root tests/KPSS test

# labels Dataset

Variants:   labels [ varlist ]
            labels --to-file=filename
            labels --from-file=filename
            labels --delete
Examples:   oprobit.inp

In the first form, prints out the informative labels (if present) for the
series in varlist, or for all series in the dataset if varlist is not
specified.

With the option --to-file, writes to the named file the labels for all
series in the dataset, one per line. If no labels are present an error is
flagged; if some series have labels and others do not, a blank line is
printed for series with no label. The output file will be written in the
currently set "workdir", unless the filename string contains a full path
specification.

With the option --from-file, reads the specified file (which should be plain
text) and assigns labels to the series in the dataset, reading one label per
line and taking blank lines to indicate blank labels.

The --delete option does what you'd expect: it removes all the series labels
from the dataset.

Menu path:    /Data/Variable labels

# lad Estimation

Arguments:  depvar indepvars
Options:    --vcv (print covariance matrix)
            --no-vcv (don't compute covariance matrix)
            --quiet (don't print anything)

Calculates a regression that minimizes the sum of the absolute deviations of
the observed from the fitted values of the dependent variable. Coefficient
estimates are derived using the Barrodale-Roberts simplex algorithm; a
warning is printed if the solution is not unique.

Standard errors are derived using the bootstrap procedure with 500 drawings.
The covariance matrix for the parameter estimates, printed when the --vcv
flag is given, is based on the same bootstrap.
Since this is quite an expensive operation, the --no-vcv option is provided
for the case where the covariance matrix is not required; when this option
is given standard errors will not be available.

Note that this method can be slow when the sample is large or there are many
regressors; in that case it may be preferable to use the "quantreg" command.
Given a dependent variable y and a list of regressors X, the following
commands are basically equivalent, except that the quantreg method uses the
faster Frisch-Newton algorithm and provides analytical rather than
bootstrapped standard errors.

   lad y const X
   quantreg 0.5 y const X

Menu path:    /Model/Robust estimation/Least Absolute Deviation

# lags Transformations

Arguments:  [ order ; ] laglist
Option:     --bylag (order terms by lag)
Examples:   lags x y
            lags 12 ; x y
            lags 4 ; x1 x2 x3 --bylag
            See also sw_ch12.inp, sw_ch14.inp

Creates new series which are lagged values of each of the series in varlist.
By default the number of lags created equals the periodicity of the data.
For example, if the periodicity is 4 (quarterly), the command "lags x"
creates

   x_1 = x(t-1)
   x_2 = x(t-2)
   x_3 = x(t-3)
   x_4 = x(t-4)

The number of lags created can be controlled by the optional first parameter
(which, if present, must be followed by a semicolon).

The --bylag option is meaningful only if varlist contains more than one
series and the maximum lag order is greater than 1. By default the lagged
terms are added to the dataset by variable: first all lags of the first
series, then all lags of the second series, and so on. But if --bylag is
given, the ordering is by lags: first lag 1 of all the listed series, then
lag 2 of all the listed series, and so on.
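
The two orderings can be compared with a small script such as the one
below. It is a sketch on artificial quarterly data; the series names x, y
and z, the sample size and the seed are assumptions for illustration.

```hansl
nulldata 40
setobs 4 2010:1 --time-series
set seed 42
series x = normal()
series y = normal()
series z = normal()

# default order: as many lags as the periodicity (4 here)
lags x

# two lags of y and z, added to the dataset lag by lag
lags 2 ; y z --bylag

# list the series to inspect the order in which they were added
varlist
```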

Menu path:    /Add/Lags of selected variables

# ldiff Transformations

Argument:   varlist

The first difference of the natural log of each series in varlist is
obtained and the result stored in a new series with the prefix ld_. Thus
"ldiff x y" creates the new variables

   ld_x = log(x) - log(x(-1))
   ld_y = log(y) - log(y(-1))

Menu path:    /Add/Log differences of selected variables

# leverage Tests

Options:    --save (save the resulting series)
            --overwrite (OK to overwrite existing series)
            --quiet (don't print results)
            --plot=mode-or-filename (see below)
Examples:   leverage.inp

Must follow an "ols" command. Calculates the leverage (h, which must lie in
the range 0 to 1) for each data point in the sample on which the previous
model was estimated. Displays the residual (u) for each observation along
with its leverage and a measure of its influence on the estimates, uh/(1 -
h). "Leverage points" for which the value of h exceeds 2k/n (where k is the
number of parameters being estimated and n is the sample size) are flagged
with an asterisk. For details on the concepts of leverage and influence see
Davidson and MacKinnon (1993), Chapter 2.

DFFITS values are also computed: these are "studentized residuals"
(predicted residuals divided by their standard errors) multiplied by
sqrt[h/(1 - h)]. For discussions of studentized residuals and DFFITS see
chapter 12 of Maddala's Introduction to Econometrics or Belsley, Kuh and
Welsch (1980).
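
As a rough sketch of how the 2k/n rule might be applied in a script, the
following saves the leverage series and flags high-leverage observations by
hand. It assumes the sample file data4-1 (with series price, sqft, bedrms
and baths) is available in your gretl installation, and that no series named
lever, influ or dffits exist beforehand; the flag series high is a
hypothetical name.

```hansl
open data4-1
ols price const sqft bedrms baths --quiet

# record k and n from the model before calling leverage
scalar k = $ncoeff
scalar n = $nobs

# add lever, influ and dffits to the dataset without printing
leverage --save --quiet

# 1 where leverage exceeds 2k/n, 0 otherwise
series high = lever > 2*k/n
print lever influ dffits high --byobs
```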

Briefly, a "predicted residual" is the difference between the observed value
of the dependent variable at observation t, and the fitted value for
observation t obtained from a regression in which that observation is
omitted (or a dummy variable with value 1 for observation t alone has been
added); the studentized residual is obtained by dividing the predicted
residual by its standard error.

If the --save flag is given with this command, the leverage, influence and
DFFITS values are added to the current data set; in this context the --quiet
flag may be used to suppress the printing of results. The default names of
the saved series are, respectively, lever, influ and dffits. If series of
these names already exist, what happens depends on whether the --overwrite
option is given. If so, the existing series are overwritten; if not, the
names will be adjusted to ensure uniqueness. In the latter case the newly
generated series will always be the highest-numbered three series in the
dataset.

After execution, the "$test" accessor returns the cross-validation
criterion, which is defined as the sum of squared deviations of the
dependent variable from its forecast value, the forecast for each
observation being based on a sample from which that observation is excluded.
(This is known as the leave-one-out estimator.) For a broader discussion of
the cross-validation criterion, see Davidson and MacKinnon's Econometric
Theory and Methods, pages 685-686, and the references therein.

By default, if this command is invoked interactively a plot of the leverage
and influence values is shown. This can be adjusted via the --plot option.
The acceptable parameters to this option are none (to suppress the plot);
display (to display a plot even when in script mode); or a file name.
The effect of providing a file name is as described for the --output option
of the "gnuplot" command.

Menu path:    Model window, /Analysis/Influential observations

# levinlin Tests

Arguments:  order series
Options:    --nc (test without a constant)
            --ct (with constant and trend)
            --quiet (suppress printing of results)
            --verbose (print per-unit results)
Examples:   levinlin 0 y
            levinlin 2 y --ct
            levinlin {2,2,3,3,4,4} y

Carries out the panel unit-root test described by Levin, Lin and Chu (2002).
The null hypothesis is that all of the individual time series exhibit a unit
root, and the alternative is that none of the series has a unit root. (That
is, a common AR(1) coefficient is assumed, although in other respects the
statistical properties of the series are allowed to vary across
individuals.)

By default the test ADF regressions include a constant; to suppress the
constant use the --nc option, or to add a linear trend use the --ct option.
(See the "adf" command for explanation of ADF regressions.)

The (non-negative) order for the test (governing the number of lags of the
dependent variable to include in the ADF regressions) may be given in either
of two forms. If a scalar value is given, this is applied to all the
individuals in the panel. The alternative is to provide a matrix containing
a specific lag order for each individual; this must be a vector with as many
elements as there are individuals in the current sample range. Such a matrix
can be specified by name, or constructed using braces as illustrated in the
last example above.
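
The two ways of giving the order can be sketched as follows, on an
artificial balanced panel. The panel dimensions, the seed, and the names y
and p are assumptions chosen for the example; note that the vector has six
elements because the panel has six units.

```hansl
# artificial panel: 6 units, 40 periods each (illustrative sizes)
nulldata 240
setobs 40 1:1 --stacked-time-series
set seed 2024
series y = normal()

# scalar order: one lag for every unit, with constant and trend
levinlin 1 y --ct

# unit-specific orders: a named vector with one element per unit
matrix p = {2,2,3,3,4,4}
levinlin p y --verbose
```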

When the --verbose option is given, the following results are printed for
each unit in the panel: delta, the coefficient on the lagged level in each
ADF regression; s2e, the estimated variance of the innovations; and s2y, the
estimated long-run variance of the differenced series.

Note that panel unit-root tests can also be conducted using the "adf" and
"kpss" commands.

Menu path:    /Variable/Unit root tests/Levin-Lin-Chu test

# logistic Estimation

Arguments:  depvar indepvars
Options:    --ymax=value (specify maximum of dependent variable)
            --robust (robust standard errors)
            --cluster=clustvar (see "logit" for explanation)
            --vcv (print covariance matrix)
            --fixed-effects (see below)
            --quiet (don't print anything)
Examples:   logistic y const x
            logistic y const x --ymax=50

Logistic regression: carries out an OLS regression using the logistic
transformation of the dependent variable,

   log(y/(y* - y))

In the case of panel data the specification may include individual fixed
effects.

The dependent variable must be strictly positive. If all its values lie
between 0 and 1, the default is to use a y* value (the asymptotic maximum
of the dependent variable) of 1; if its values lie between 0 and 100, the
default y* is 100.

If you wish to set a different maximum, use the --ymax option. Note that the
supplied value must be greater than all of the observed values of the
dependent variable.

The fitted values and residuals from the regression are automatically
adjusted using the inverse of the logistic transformation:

   y =~ E(y* / (1 + exp(-x)))

where x represents either a fitted value or a residual from the OLS
regression using the logistic dependent variable. The reported values are
therefore comparable with the original dependent variable.
The need for approximation arises because the inverse transformation is
nonlinear and therefore does not conserve expectation.

The --fixed-effects option is applicable only if the dataset takes the form
of a panel. In that case we subtract the group means from the logistic
transform of the dependent variable and estimation proceeds as usual for
fixed effects.

Note that if the dependent variable is binary, you should use the "logit"
command instead.

Menu path:    /Model/Limited dependent variable/Logistic
Menu path:    /Model/Panel/FE logistic

# logit Estimation

Arguments:  depvar indepvars
Options:    --robust (robust standard errors)
            --cluster=clustvar (clustered standard errors)
            --multinomial (estimate multinomial logit)
            --vcv (print covariance matrix)
            --verbose (print details of iterations)
            --quiet (don't print results)
            --p-values (show p-values instead of slopes)
            --estrella (select pseudo-R-squared variant)
Examples:   keane.inp, oprobit.inp

If the dependent variable is a binary variable (all values are 0 or 1)
maximum likelihood estimates of the coefficients on indepvars are obtained
via the Newton-Raphson method. As the model is nonlinear the slopes depend
on the values of the independent variables. By default the slopes with
respect to each of the independent variables are calculated (at the means of
those variables) and these slopes replace the usual p-values in the
regression output. This behavior can be suppressed by giving the --p-values
option. The chi-square statistic tests the null hypothesis that all
coefficients are zero apart from the constant.

By default, standard errors are computed using the negative inverse of the
Hessian. If the --robust flag is given, then QML or Huber-White standard
errors are calculated instead.
In this case the estimated covariance matrix
is a "sandwich" of the inverse of the estimated Hessian and the outer
product of the gradient; see chapter 10 of Davidson and MacKinnon (2004).
But if the --cluster option is given, then "cluster-robust" standard errors
are produced; see chapter 22 of the Gretl User's Guide for details.

By default the pseudo-R-squared statistic suggested by McFadden (1974) is
shown, but in the binary case if the --estrella option is given, the variant
recommended by Estrella (1998) is shown instead. This variant arguably
mimics more closely the properties of the regular R^2 in the context of
least-squares estimation.

If the dependent variable is not binary but is discrete, then by default it
is interpreted as an ordinal response, and Ordered Logit estimates are
obtained. However, if the --multinomial option is given, the dependent
variable is interpreted as an unordered response, and Multinomial Logit
estimates are produced. (In either case, if the variable selected as
dependent is not discrete an error is flagged.) In the multinomial case, the
accessor $mnlprobs is available after estimation, to get a matrix containing
the estimated probabilities of the outcomes at each observation
(observations in rows, outcomes in columns).

If you want to use logit for analysis of proportions (where the dependent
variable is the proportion of cases having a certain characteristic, at each
observation, rather than a 1 or 0 variable indicating whether the
characteristic is present or not) you should not use the "logit" command,
but rather construct the logit variable, as in

   series lgt_p = log(p/(1 - p))

and use this as the dependent variable in an OLS regression. See chapter 12
of Ramanathan (2002).
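
The proportions case just described might look like this in a complete
script. The data are artificial: the series x and p, the coefficients used
to generate p, and the seed are all assumptions for illustration.

```hansl
nulldata 50
set seed 7
series x = normal()
# p is constructed to lie strictly between 0 and 1
series p = 1 / (1 + exp(-(0.5 + 0.8*x + 0.3*normal())))

# logit transform of the proportion, then plain OLS
series lgt_p = log(p/(1 - p))
ols lgt_p const x
```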

Menu path:    /Model/Limited dependent variable/Logit

# logs Transformations

Argument:   varlist

The natural log of each of the series in varlist is obtained and the result
stored in a new series with the prefix l_ ("el" underscore). For example,
"logs x y" creates the new variables l_x = ln(x) and l_y = ln(y).

Menu path:    /Add/Logs of selected variables

# loop Programming

Argument:   control
Options:    --progressive (enable special forms of certain commands)
            --verbose (echo commands and show confirmatory messages)
Examples:   loop 1000
            loop 1000 --progressive
            loop while essdiff > .00001
            loop i=1991..2000 --verbose
            loop for (r=-.99; r<=.99; r+=.01)
            loop foreach i xlist
            See also armaloop.inp, keane.inp

This command opens a special mode in which the program accepts commands to
be executed repeatedly. You exit the mode of entering loop commands with
"endloop": at this point the stacked commands are executed.

The parameter "control" may take any of five forms, as shown in the
examples: an integer number of times to repeat the commands within the loop;
"while" plus a boolean condition; a range of integer values for an index
variable; "for" plus three expressions in parentheses, separated by
semicolons (which emulates the for statement in the C programming language);
or "foreach" plus an index variable and a list.

See chapter 13 of the Gretl User's Guide for further details and examples.
The effect of the --progressive option (which is designed for use in Monte
Carlo simulations) is explained there. Not all gretl commands may be used
within a loop; the commands available in this context are also set out
there.

By default, execution of commands proceeds more quietly within loops than in
other contexts. If you want more feedback on what's going on in a loop, give
the --verbose option.
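
For orientation, here is a minimal sketch exercising four of the control
forms. The variable names (i, r, total, k, j) and the items a, b and c are
arbitrary choices made for the example.

```hansl
# index loop over a range of integers
loop i=1..3
    printf "i = %d\n", i
endloop

# C-style for loop
scalar total = 0
loop for (r=1; r<=3; r+=1)
    total += r
endloop
printf "total = %g\n", total

# while loop: runs until the condition fails
scalar k = 0
loop while k < 3
    k += 1
endloop
printf "k = %d\n", k

# foreach over a set of names; $j is replaced by each name in turn
loop foreach j a b c
    printf "item: $j\n"
endloop
```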

# mahal Statistics

Argument:   varlist
Options:    --quiet (don't print anything)
            --save (add distances to the dataset)
            --vcv (print covariance matrix)

Computes the Mahalanobis distances between the series in varlist. The
Mahalanobis distance is the distance between two points in a k-dimensional
space, scaled by the statistical variation in each dimension of the space.
For example, if p and q are two observations on a set of k variables with
covariance matrix C, then the Mahalanobis distance between the observations
is given by

   sqrt((p - q)' * C-inverse * (p - q))

where (p - q) is a k-vector. This reduces to Euclidean distance if the
covariance matrix is the identity matrix.

The space for which distances are computed is defined by the selected
variables. For each observation in the current sample range, the distance is
computed between the observation and the centroid of the selected variables.
This distance is the multidimensional counterpart of a standard z-score, and
can be used to judge whether a given observation "belongs" with a group of
other observations.

If the --vcv option is given, the covariance matrix and its inverse are
printed. If the --save option is given, the distances are saved to the
dataset under the name mdist (or mdist1, mdist2 and so on if there is
already a variable of that name).

Menu path:    /View/Mahalanobis distances

# makepkg Programming

Argument:   filename
Options:    --index (write auxiliary index file)
            --translations (write auxiliary strings file)
            --quiet (operate quietly)

Supports creation of a gretl function package via the command line. The mode
of operation of this command depends on the extension of filename, which
must be either .gfn or .zip.

Gfn mode

Writes a gfn file.
It is assumed that a package specification file, with the
same basename as filename but with the extension .spec, is accessible, along
with any auxiliary files that it references. It is also assumed that all the
functions to be packaged have been read into memory.

Zip mode

Writes a zip package file (gfn plus other materials). If a gfn file of the
same basename as filename is found, gretl checks for corresponding inp and
spec files: if these are both found and at least one of them is newer than
the gfn file then the gfn is rebuilt, otherwise the existing gfn is used. If
no such file is found, gretl first attempts to build the gfn.

Gfn options

The option flags support the writing of auxiliary files, intended for use
with gretl "addons". The index file is a short XML document containing basic
information about the package; it has the same basename as the package and
the extension .xml. The translations file contains strings from the package
that may be suitable for translation, in C format; for package foo this file
is named foo-i18n.c. These files are not produced if the command is
operating in zip mode and a pre-existing gfn file is used.

For details on all of this, see the gretl Function Package Guide.

Menu path:    /File/Function packages/New package

# markers Dataset

Variants:   markers --to-file=filename
            markers --to-array=name
            markers --from-file=filename
            markers --delete

The options --to-file and --to-array provide means of saving the observation
marker strings from the current dataset, either to a named file or a named
array. If no such strings are present an error is flagged. In the file case
output will be written in the current "workdir" unless the filename string
contains a full path specification. The markers are written one per line.
In the array case, if name is the identifier of an existing array of strings
it will be overwritten, otherwise a new array will be created.

With the option --from-file, reads the specified file (which should be plain
text) and assigns observation markers to the rows in the dataset, reading
one marker per line. In general there should be at least as many markers in
the file as observations in the dataset, but if the dataset is a panel it is
also acceptable if the number of markers in the file matches the number of
cross-sectional units (in which case the markers are repeated for each time
period).

The --delete option does what you'd expect: it removes the observation
marker strings from the dataset.

Menu path:    /Data/Observation markers

# meantest Tests

Arguments:  series1 series2
Option:     --unequal-vars (assume variances are unequal)

Calculates the t statistic for the null hypothesis that the population means
are equal for the variables series1 and series2, and shows its p-value.

By default the test statistic is calculated on the assumption that the
variances are equal for the two variables. With the --unequal-vars option
the variances are assumed to be different; in this case the degrees of
freedom for the test statistic are approximated as per Satterthwaite (1946).
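
A short sketch contrasting the two variants on artificial data follows. The
series a and b, their generating parameters and the seed are assumptions
for the example; b is given a larger variance so that the two variants can
be expected to differ.

```hansl
nulldata 60
set seed 99
series a = normal()
series b = 0.5 + 2*normal()

# default: variances assumed equal
meantest a b

# variances assumed unequal; approximate degrees of freedom
meantest a b --unequal-vars
```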

Menu path:    /Tools/Test statistic calculator

# midasreg Estimation

Arguments:  depvar indepvars ; MIDAS-terms
Options:    --vcv (print covariance matrix)
            --robust (robust standard errors)
            --quiet (suppress printing of results)
            --levenberg (see below)
Examples:   midasreg y 0 y(-1) ; mds(X, 1, 9, 1, theta)
            midasreg y 0 y(-1) ; mds(X, 1, 9, 0)
            midasreg y 0 y(-1) ; mdsl(XL, 2, theta)
            See also gdp_midas.inp

Carries out least-squares estimation (either NLS or OLS, depending on the
specification) of a MIDAS (Mixed Data Sampling) model. Such models include
one or more independent variables that are observed at a higher frequency
than the dependent variable; for a good brief introduction see Armesto,
Engemann and Owyang (2010).

The variables in indepvars should be of the same frequency as the dependent
variable. This list should usually include const or 0 (intercept) and
typically includes one or more lags of the dependent variable. The
high-frequency terms are given after a semicolon; each one takes the form of
a number of comma-separated arguments within parentheses, prefixed by either
mds or mdsl.

mds: this variant generally requires 5 arguments, as follows: the name of a
"MIDAS list", two integers giving the minimum and maximum high-frequency
lags, an integer between 0 and 5 (or string, see below) specifying the type
of parameterization to use, and the name of a vector holding initial values
of the parameters. The example below calls for lags 3 to 11 of the
high-frequency series represented by the list X, using parameterization type
1 (exponential Almon, see below) with initializer theta.

   mds(X, 3, 11, 1, theta)

mdsl: generally requires 3 arguments: the name of a list of MIDAS lags, an
integer (or string, see below) to specify the type of parameterization and
the name of an initialization vector.
In this case the minimum and maximum
lags are implicit in the initial list argument. In the example below XLags
should be a list which already holds all the required lags; such a list can
be constructed using the "hflags" function.

   mdsl(XLags, 1, theta)

The supported types of parameterization are shown below; in the context of
mds and mdsl specifications these may be given in the form of numeric codes
or the double-quoted strings shown after the numbers.

0 or "umidas": unrestricted MIDAS or U-MIDAS (each lag has its own
coefficient)

1 or "nealmon": normalized exponential Almon; requires at least one
parameter, commonly uses two

2 or "beta0": normalized beta with a zero last lag; requires exactly two
parameters

3 or "betan": normalized beta with non-zero last lag; requires exactly three
parameters

4 or "almonp": (non-normalized) Almon polynomial; requires at least one
parameter

5 or "beta1": as beta0, but with the first parameter fixed at 1, leaving a
single free parameter.

When the parameterization is U-MIDAS, the final initializer argument is not
required. In other cases you can request an automatic initialization by
substituting one or other of these two forms for the name of an initial
parameter vector:

   The keyword null: this is accepted if the parameterization has a fixed
   number of terms (the beta cases, with 2 or 3 parameters). It's also
   accepted for the exponential Almon case, implying the default of 2
   parameters.

   An integer value giving the required number of parameters.

The estimation method used by this command depends on the specification of
the high-frequency terms. In the case of U-MIDAS the method is OLS,
otherwise it is nonlinear least squares (NLS).
When the normalized
exponential Almon or normalized beta parameterization is specified, the
default NLS method is a combination of constrained BFGS and OLS, but the
--levenberg option can be given to force use of the Levenberg-Marquardt
algorithm.

Menu path:    /Model/Univariate time series/MIDAS

# mle Estimation

Arguments:  log-likelihood function [ derivatives ]
Options:    --quiet (don't show estimated model)
            --vcv (print covariance matrix)
            --hessian (base covariance matrix on the Hessian)
            --robust[=hac] (QML or HAC covariance matrix)
            --cluster=clustvar (cluster-robust covariance matrix)
            --verbose (print details of iterations)
            --no-gradient-check (see below)
            --auxiliary (see below)
            --lbfgs (use L-BFGS-B instead of regular BFGS)
Examples:   weibull.inp, biprobit_via_ghk.inp, frontier.inp, keane.inp

Performs Maximum Likelihood (ML) estimation using either the BFGS (Broyden,
Fletcher, Goldfarb, Shanno) algorithm or Newton's method. The user must
specify the log-likelihood function. The parameters of this function must be
declared and given starting values prior to estimation. Optionally, the user
may specify the derivatives of the log-likelihood function with respect to
each of the parameters; if analytical derivatives are not supplied, a
numerical approximation is computed.

This help text assumes use of the default BFGS maximizer. For information on
using Newton's method please see chapter 26 of the Gretl User's Guide.

Simple example: Suppose we have a series X with values 0 or 1 and we wish to
obtain the maximum likelihood estimate of the probability, p, that X = 1.
(In this simple case we can guess in advance that the ML estimate of p will
simply equal the proportion of Xs equal to 1 in the sample.)

The parameter p must first be added to the dataset and given an initial
value.
For example, scalar p = 0.5.

We then construct the MLE command block:

   mle loglik = X*log(p) + (1-X)*log(1-p)
     deriv p = X/p - (1-X)/(1-p)
   end mle

The first line above specifies the log-likelihood function. It starts with
the keyword mle, then a dependent variable is specified and an expression
for the log-likelihood is given (using the same syntax as in the "genr"
command). The next line (which is optional) starts with the keyword deriv
and supplies the derivative of the log-likelihood function with respect to
the parameter p. If no derivatives are given, you should include a statement
using the keyword params which identifies the free parameters: these are
listed on one line, separated by spaces and can be either scalars, or
vectors, or any combination of the two. For example, the above could be
changed to:

   mle loglik = X*log(p) + (1-X)*log(1-p)
     params p
   end mle

in which case numerical derivatives would be used.

Note that any option flags should be appended to the ending line of the MLE
block. For example:

   mle loglik = X*log(p) + (1-X)*log(1-p)
     params p
   end mle --quiet

Covariance matrix and standard errors

If the log-likelihood function returns a series or vector giving
per-observation values then estimated standard errors are by default based
on the Outer Product of the Gradient (OPG), while if the --hessian option is
given they are instead based on the negative inverse of the Hessian, which
is approximated numerically. If the --robust option is given, a QML
estimator is used (namely, a sandwich of the negative inverse of the Hessian
and the OPG). If the hac parameter is added to this option the OPG is
augmented in the manner of Newey and West to allow for serial correlation of
the gradient. (This only makes sense with time-series data.)
However, if the
log-likelihood function just returns a scalar value, the OPG is not
available (and so neither is the QML estimator), and standard errors are of
necessity computed using the numerical Hessian.

In the event that you just want the primary parameter estimates you can give
the --auxiliary option, which suppresses computation of the covariance
matrix and standard errors; this will save some CPU cycles and memory usage.

Checking analytical derivatives

If you supply analytical derivatives, by default gretl runs a numerical
check on their plausibility. Occasionally this may produce false positives,
instances where correct derivatives appear to be wrong and estimation is
refused. To counter this, or to achieve a little extra speed, you can give
the option --no-gradient-check. Obviously, you should do this only if you
are confident that the gradient you have specified is right.

Parameter names

In estimating a nonlinear model it is often convenient to name the
parameters tersely. In printing the results, however, it may be desirable to
use more informative labels. This can be achieved via the additional keyword
param_names within the command block. For a model with k parameters the
argument following this keyword should be a double-quoted string literal
holding k space-separated names, the name of a string variable that holds k
such names, or the name of an array of k strings.

For an in-depth description of "mle" please refer to chapter 26 of the Gretl
User's Guide.

Menu path:    /Model/Maximum likelihood

# modeltab Utilities

Variants:   modeltab add
            modeltab show
            modeltab free
            modeltab --output=filename

Manipulates the gretl "model table". See chapter 3 of the Gretl User's Guide
for details.
The sub-commands have the following effects: "add" adds the
last model estimated to the model table, if possible; "show" displays the
model table in a window; and "free" clears the table.

To call for printing of the model table, use the flag --output= plus a
filename. If the filename has the suffix ".tex", the output will be in TeX
format; if the suffix is ".rtf" the output will be RTF; otherwise it will be
plain text. In the case of TeX output the default is to produce a
"fragment", suitable for inclusion in a document; if you want a stand-alone
document instead, use the --complete option, for example

   modeltab --output="myfile.tex" --complete

Menu path:    Session icon window, Model table icon

# modprint Printing

Arguments:  coeffmat names [ addstats ]
Option:     --output=filename (send output to specified file)

Prints the coefficient table and optional additional statistics for a model
estimated "by hand". Mainly useful for user-written functions.

The argument coeffmat should be a k by 2 matrix containing k coefficients
and k associated standard errors. The names argument should supply at least
k names for labeling the coefficients; it can take the form of a string
literal (in double quotes) or string variable, in which case the names
should be separated by commas or spaces, or it may be given as a named array
of strings.

The optional argument addstats is a vector containing p additional
statistics to be printed under the coefficient table. If this argument is
given, then names should contain k + p names, the additional p names to be
associated with the extra statistics.

If addstats is not provided and the coeffmat matrix has row names attached,
then the names argument can be omitted.

To put the output into a file, use the flag --output= plus a filename.
If the filename has the suffix ".tex", the output will be in TeX format; if
the suffix is ".rtf" the output will be RTF; otherwise it will be plain
text. In the case of TeX output the default is to produce a "fragment",
suitable for inclusion in a document; if you want a stand-alone document
instead, use the --complete option.

The output file will be written in the currently set "workdir", unless the
filename string contains a full path specification.

# modtest Tests

Argument:   [ order ]
Options:    --normality (normality of residual)
            --logs (nonlinearity, logs)
            --squares (nonlinearity, squares)
            --autocorr (serial correlation)
            --arch (ARCH)
            --white (heteroskedasticity, White's test)
            --white-nocross (White's test, squares only)
            --breusch-pagan (heteroskedasticity, Breusch-Pagan)
            --robust (robust variance estimate for Breusch-Pagan)
            --panel (heteroskedasticity, groupwise)
            --comfac (common factor restriction, AR1 models only)
            --xdepend (cross-sectional dependence, panel data only)
            --quiet (don't print details)
            --silent (don't print anything)
Examples:   credscore.inp

Must immediately follow an estimation command. The discussion below applies
to usage of the command following estimation of a single-equation model; see
chapter 32 of the Gretl User's Guide for an account of how "modtest"
operates after estimation of a VAR.

Depending on the option given, this command carries out one of the
following: the Doornik-Hansen test for the normality of the error term; a
Lagrange Multiplier test for nonlinearity (logs or squares); White's test
(with or without cross-products) or the Breusch-Pagan test (Breusch and
Pagan, 1979) for heteroskedasticity; the LMF test for serial correlation
(Kiviet, 1986); a test for ARCH (Autoregressive Conditional
Heteroskedasticity; see also the "arch" command); a test of the common
factor restriction implied by AR(1) estimation; or a test for
cross-sectional dependence in panel-data models. With the exception of the
normality, common factor and cross-sectional dependence tests, most of the
options are only available for models estimated via OLS, but see below for
details regarding two-stage least squares.

The optional order argument is relevant only in case the --autocorr or
--arch options are selected. The default is to run these tests using a lag
order equal to the periodicity of the data, but this can be adjusted by
supplying a specific lag order.

The --robust option applies only when the Breusch-Pagan test is selected;
its effect is to use the robust variance estimator proposed by Koenker
(1981), making the test less sensitive to the assumption of normality.

The --panel option is available only when the model is estimated on panel
data: in this case a test for groupwise heteroskedasticity is performed
(that is, for a differing error variance across the cross-sectional units).

The --comfac option is available only when the model is estimated via an
AR(1) method such as Hildreth-Lu. The auxiliary regression takes the form of
a relatively unrestricted dynamic model, which is used to test the common
factor restriction implicit in the AR(1) specification.
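
As a minimal sketch of typical usage (the choice of the data4-10 sample
file and of this particular regression is an assumption made for
illustration), the following runs White's test and then the Breusch-Pagan
test with Koenker's robust variance estimator after OLS, printing each
result via the accessors "$test" and "$pvalue":

  open data4-10
  ols ENROLL 0 CATHOL INCOME COLLEGE --quiet
  # White's test, printing only the summary result
  modtest --white --quiet
  printf "White: stat = %g, p-value = %g\n", $test, $pvalue
  # Breusch-Pagan test with Koenker's robust variance estimator
  modtest --breusch-pagan --robust --quiet
  printf "Koenker: stat = %g, p-value = %g\n", $test, $pvalue

With time-series data, a command such as "modtest 4 --autocorr" would
request the serial correlation test with a lag order of 4 in place of the
default.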

The --xdepend option is available only for models estimated on panel data.
The test statistic is that developed by Pesaran (2004). The null hypothesis
is that the error term is independently distributed across the
cross-sectional units or individuals.

By default, the program prints the auxiliary regression on which the test
statistic is based, where applicable. This may be suppressed by using the
--quiet flag (minimal printed output) or the --silent flag (no printed
output). The test statistic and its p-value may be retrieved using the
accessors "$test" and "$pvalue" respectively.

When a model has been estimated by two-stage least squares (see "tsls"), the
LM principle breaks down and gretl offers some equivalents: the --autocorr
option computes Godfrey's test for autocorrelation (Godfrey, 1994) while the
--white option yields the HET1 heteroskedasticity test (Pesaran and Taylor,
1999).

For additional diagnostic tests on models, see "chow", "cusum", "reset" and
"qlrtest".

Menu path:    Model window, /Tests

# mpols Estimation

Arguments:  depvar indepvars
Options:    --vcv (print covariance matrix)
            --simple-print (do not print auxiliary statistics)
            --quiet (suppress printing of results)

Computes OLS estimates for the specified model using multiple precision
floating-point arithmetic, with the help of the GNU Multiple Precision (GMP)
library. By default 256 bits of precision are used for the calculations, but
this can be increased via the environment variable GRETL_MP_BITS. For
example, when using the bash shell one could issue the following command,
before starting gretl, to set a precision of 1024 bits.

  export GRETL_MP_BITS=1024

A rather arcane option is available for this command (primarily for testing
purposes): if the indepvars list is followed by a semicolon and a further
list of numbers, those numbers are taken as powers of x to be added to the
regression, where x is the last variable in indepvars. These additional
terms are computed and stored in multiple precision. In the following
example y is regressed on x and the second, third and fourth powers of x:

  mpols y 0 x ; 2 3 4

Menu path:    /Model/Other linear models/High precision OLS

# negbin Estimation

Arguments:  depvar indepvars [ ; offset ]
Options:    --model1 (use NegBin 1 model)
            --robust (QML covariance matrix)
            --cluster=clustvar (see "logit" for explanation)
            --opg (see below)
            --vcv (print covariance matrix)
            --verbose (print details of iterations)
            --quiet (don't print results)
Examples:   camtriv.inp

Estimates a Negative Binomial model. The dependent variable is taken to
represent a count of the occurrence of events of some sort, and must have
only non-negative integer values. By default the model NegBin 2 is used, in
which the conditional variance of the count is given by mu(1 + αmu), where
mu denotes the conditional mean. But if the --model1 option is given the
conditional variance is mu(1 + α).

The optional offset series works in the same way as for the "poisson"
command. The Poisson model is a restricted form of the Negative Binomial in
which α = 0 by construction.

By default, standard errors are computed using a numerical approximation to
the Hessian at convergence. But if the --opg option is given the covariance
matrix is based on the Outer Product of the Gradient (OPG), or if the
--robust option is given QML standard errors are calculated, using a
"sandwich" of the inverse of the Hessian and the OPG.
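
The following sketch illustrates the two variance specifications on
simulated data. The sample size, the parameter values and all variable
names are assumptions made for the example. The counts are drawn as a
Poisson-gamma mixture, which gives a conditional variance of the NegBin 2
form, mu(1 + αmu).

  nulldata 500
  set seed 20240101
  series x = normal()
  series mu = exp(0.5 + 0.3*x)
  scalar alpha = 0.8
  # gamma heterogeneity with mean 1 and variance alpha
  series h = randgen(G, 1/alpha, alpha)
  # Poisson counts with mean mu*h
  series y = randgen(P, mu*h)
  # NegBin 2 (the default), Hessian-based standard errors
  negbin y 0 x
  # NegBin 1 with QML standard errors
  negbin y 0 x --model1 --robust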

Menu path:    /Model/Limited dependent variable/Count data

# nls Estimation

Arguments:  function [ derivatives ]
Options:    --quiet (don't show estimated model)
            --robust (robust standard errors)
            --vcv (print covariance matrix)
            --verbose (print details of iterations)
            --no-gradient-check (see below)
Examples:   wg_nls.inp, ects_nls.inp

Performs Nonlinear Least Squares (NLS) estimation using a modified version
of the Levenberg-Marquardt algorithm. You must supply a function
specification. The parameters of this function must be declared and given
starting values prior to estimation. Optionally, you may specify the
derivatives of the regression function with respect to each of the
parameters. If you do not supply derivatives you should instead give a list
of the parameters to be estimated (separated by spaces or commas), preceded
by the keyword params. In the latter case a numerical approximation to the
Jacobian is computed.

It is easiest to show what is required by example. The following is a
complete script to estimate the nonlinear consumption function set out in
William Greene's Econometric Analysis (Chapter 11 of the 4th edition, or
Chapter 9 of the 5th). The numbers to the left of the lines are for
reference and are not part of the commands. Note that any option flags, such
as --vcv for printing the covariance matrix of the parameter estimates,
should be appended to the final command, end nls.

   1   open greene11_3.gdt
   2   ols C 0 Y
   3   scalar a = $coeff(0)
   4   scalar b = $coeff(Y)
   5   scalar g = 1.0
   6   nls C = a + b * Y^g
   7    deriv a = 1
   8    deriv b = Y^g
   9    deriv g = b * Y^g * log(Y)
  10   end nls --vcv

It is often convenient to initialize the parameters by reference to a
related linear model; that is accomplished here on lines 2 to 5.
The parameters a, b and g could be set to any initial values (not
necessarily based on a model estimated with OLS), although convergence of
the NLS procedure is not guaranteed for an arbitrary starting point.

The actual NLS commands occupy lines 6 to 10. On line 6 the "nls" command is
given: a dependent variable is specified, followed by an equals sign,
followed by a function specification. The syntax for the expression on the
right is the same as that for the "genr" command. The next three lines
specify the derivatives of the regression function with respect to each of
the parameters in turn. Each line begins with the keyword "deriv", gives the
name of a parameter, an equals sign, and an expression whereby the
derivative can be calculated. As an alternative to supplying analytical
derivatives, you could substitute the following for lines 7 to 9:

  params a b g

Line 10, "end nls", completes the command and calls for estimation. Any
options should be appended to this line.

If you supply analytical derivatives, by default gretl runs a numerical
check on their plausibility. Occasionally this may produce false positives,
instances where correct derivatives appear to be wrong and estimation is
refused. To counter this, or to achieve a little extra speed, you can give
the option --no-gradient-check. Obviously, you should do this only if you
are confident that the gradient you have specified is right.

Parameter names

In estimating a nonlinear model it is often convenient to name the
parameters tersely. In printing the results, however, it may be desirable to
use more informative labels. This can be achieved via the additional keyword
param_names within the command block.
For a model with k parameters the
argument following this keyword should be a double-quoted string literal
holding k space-separated names, the name of a string variable that holds k
such names, or the name of an array of k strings.

For further details on NLS estimation please see chapter 25 of the Gretl
User's Guide.

Menu path:    /Model/Nonlinear Least Squares

# normtest Tests

Argument:   series
Options:    --dhansen (Doornik-Hansen test, the default)
            --swilk (Shapiro-Wilk test)
            --lillie (Lilliefors test)
            --jbera (Jarque-Bera test)
            --all (do all tests)
            --quiet (suppress printed output)

Carries out a test for normality for the given series. The specific test is
controlled by the option flags (but if no flag is given, the Doornik-Hansen
test is performed). Note: the Doornik-Hansen and Shapiro-Wilk tests are
recommended over the others, on account of their superior small-sample
properties.

The test statistic and its p-value may be retrieved using the accessors
"$test" and "$pvalue". Please note that if the --all option is given, the
result recorded is that from the Doornik-Hansen test.

Menu path:    /Variable/Normality test

# nulldata Dataset

Argument:   series_length
Option:     --preserve (preserve variables other than series)
Example:    nulldata 500

Establishes a "blank" data set, containing only a constant and an index
variable, with periodicity 1 and the specified number of observations. This
may be used for simulation purposes: functions such as "uniform()" and
"normal()" will generate artificial series from scratch to fill out the data
set. This command may be useful in conjunction with "loop". See also the
"seed" option to the "set" command.

By default, this command cleans out all data in gretl's current workspace:
not only series but also matrices, scalars, strings, etc.
If you give the
--preserve option, however, any currently defined variables other than
series are retained.

Menu path:    /File/New data set

# ols Estimation

Arguments:  depvar indepvars
Options:    --vcv (print covariance matrix)
            --robust (robust standard errors)
            --cluster=clustvar (clustered standard errors)
            --jackknife (see below)
            --simple-print (do not print auxiliary statistics)
            --quiet (suppress printing of results)
            --anova (print an ANOVA table)
            --no-df-corr (suppress degrees of freedom correction)
            --print-final (see below)
Examples:   ols 1 0 2 4 6 7
            ols y 0 x1 x2 x3 --vcv
            ols y 0 x1 x2 x3 --quiet

Computes ordinary least squares (OLS) estimates with depvar as the dependent
variable and indepvars as the list of independent variables. Variables may
be specified by name or number; use the number zero for a constant term.

Besides coefficient estimates and standard errors, the program also prints
p-values for t (two-tailed) and F-statistics. A p-value below 0.01 indicates
statistical significance at the 1 percent level and is marked with ***. **
indicates significance between 1 and 5 percent and * indicates significance
between the 5 and 10 percent levels. Model selection statistics (the Akaike
Information Criterion or AIC and Schwarz's Bayesian Information Criterion)
are also printed. The formula used for the AIC is that given by Akaike
(1974), namely minus two times the maximized log-likelihood plus two times
the number of parameters estimated.

If the option --no-df-corr is given, the usual degrees of freedom correction
is not applied when calculating the estimated error variance (and hence also
the standard errors of the parameter estimates).

The option --print-final is applicable only in the context of a "loop".
It arranges for the regression to be run silently on all but the final
iteration of the loop. See chapter 13 of the Gretl User's Guide for details.

Various internal variables may be retrieved following estimation. For
example

  series uh = $uhat

saves the residuals under the name uh. See the "accessors" section of the
gretl function reference for details.

The specific formula ("HC" version) used for generating robust standard
errors when the --robust option is given can be adjusted via the "set"
command. The --jackknife option has the effect of selecting an hc_version of
3a. The --cluster option overrides the selection of HC version, and produces
robust standard errors by grouping the observations by the distinct values
of clustvar; see chapter 22 of the Gretl User's Guide for details.

Menu path:    /Model/Ordinary Least Squares
Other access: Beta-hat button on toolbar

# omit Tests

Argument:   varlist
Options:    --test-only (don't replace the current model)
            --chi-square (give chi-square form of Wald test)
            --quiet (print only the basic test result)
            --silent (don't print anything)
            --vcv (print covariance matrix for reduced model)
            --auto[=alpha] (sequential elimination, see below)
Examples:   omit 5 7 9
            omit seasonals --quiet
            omit --auto
            omit --auto=0.05
            See also restrict.inp, sw_ch12.inp, sw_ch14.inp

This command must follow an estimation command. In its primary form, it
calculates a Wald test for the joint significance of the variables in
varlist, which should be a subset (though not necessarily a proper subset)
of the independent variables in the model last estimated. The results of the
test may be retrieved using the accessors "$test" and "$pvalue".
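
A minimal sketch of the basic usage follows; the data4-10 sample file and
the variables named are chosen purely for illustration. Here the
--test-only option is given so that the original model remains the current
model, and the Wald test result is read back via the accessors.

  open data4-10
  ols ENROLL 0 CATHOL INCOME COLLEGE --quiet
  # joint Wald test on two regressors; the full model stays current
  omit INCOME COLLEGE --test-only --quiet
  printf "Wald F = %g, p-value = %g\n", $test, $pvalue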

Unless the restriction removes all the original regressors, by default the
restricted model is estimated and it replaces the original as the "current
model" for the purposes of, for example, retrieving the residuals as $uhat
or doing further tests. This behavior may be suppressed via the --test-only
option.

By default the F-form of the Wald test is recorded; the --chi-square option
may be used to record the chi-square form instead.

If the restricted model is both estimated and printed, the --vcv option has
the effect of printing its covariance matrix, otherwise this option is
ignored.

Alternatively, if the --auto flag is given, sequential elimination is
performed: at each step the variable with the highest p-value is omitted,
until all remaining variables have a p-value no greater than some cutoff.
The default cutoff is 10 percent (two-sided); this can be adjusted by
appending "=" and a value between 0 and 1 (with no spaces), as in the fourth
example above. If varlist is given this process is confined to the listed
variables, otherwise all regressors aside from the constant are treated as
candidates for omission. Note that the --auto and --test-only options cannot
be combined.

Menu path:    Model window, /Tests/Omit variables

# open Dataset

Argument:   filename
Options:    --quiet (don't print list of series)
            --preserve (preserve variables other than series)
            --select=selection (read only the specified series, see below)
            --frompkg=pkgname (see below)
            --all-cols (see below)
            --www (use a database on the gretl server)
            --odbc (use an ODBC database)
            See below for additional specialized options
Examples:   open data4-1
            open voter.dta
            open fedbog.bin --www
            open dbnomics

Opens a data file or database -- see chapter 4 of the Gretl User's Guide for
an explanation of this distinction.
The effect is somewhat different in the
two cases. When a data file is opened, its content is read into gretl's
workspace, replacing the current dataset (if any). To add data to the
current dataset instead of replacing it, see "append" or (for greater
flexibility) "join". When a database is opened this does not immediately
load any data; rather, it sets the source for subsequent invocations of the
"data" command, which is used to import selected series. For specifics
regarding databases see the section headed "Opening a database" below.

If filename is not given as a full path, gretl will search some relevant
paths to try to find the file, with "workdir" as a first choice. If no
filename suffix is given (as in the first example above), gretl assumes a
native datafile with suffix .gdt. Based on the name of the file and various
heuristics, gretl will try to detect the format of the data file (native,
plain text, CSV, MS Excel, Stata, SPSS, etc.).

If the --frompkg option is used, gretl will look for the specified data file
in the subdirectory associated with the function package specified by
pkgname.

If the filename argument takes the form of a URI starting with http:// or
https://, then gretl will attempt to download the indicated data file before
opening it.

By default, opening a new data file clears the current gretl session, which
includes deletion of all named variables, including matrices, scalars and
strings. If you wish to keep your currently defined variables (other than
series, which are necessarily cleared out), use the --preserve option.

Spreadsheet files

When opening a data file in a spreadsheet format (Gnumeric, Open Document or
MS Excel), you may give up to three additional parameters following the
filename. First, you can select a particular worksheet within the file.
This is done either by giving its (1-based) number, using the syntax, e.g.,
--sheet=2, or, if you know the name of the sheet, by giving the name in
double quotes, as in --sheet="MacroData". The default is to read the first
worksheet. You can also specify a column and/or row offset into the
worksheet via, e.g.,

  --coloffset=3 --rowoffset=2

which would cause gretl to ignore the first 3 columns and the first 2 rows.
The default is an offset of 0 in both dimensions, that is, to start reading
at the top-left cell.

Delimited text files

With plain text files, gretl generally expects to find the data columns
delimited in some standard manner (generally via comma, tab, space or
semicolon). By default gretl looks for observation labels or dates in the
first column if its heading is empty or is a suggestive string such as
"year", "date" or "obs". You can prevent gretl from treating the first
column specially by giving the --all-cols option.

Fixed format text

A "fixed format" text data file is one without column delimiters, but in
which the data are laid out according to a known set of specifications such
as "variable k occupies 8 columns starting at column 24". To read such
files, you should append a string --fixed-cols=colspec, where colspec is
composed of comma-separated integers. These integers are interpreted as a
set of pairs. The first element of each pair denotes a starting column,
measured in bytes from the beginning of the line with 1 indicating the first
byte; and the second element indicates how many bytes should be read for the
given field. So, for example, if you say

  open fixed.txt --fixed-cols=1,6,20,3

then for variable 1 gretl will read 6 bytes starting at column 1; and for
variable 2, 3 bytes starting at column 20.
Lines that are blank, or that
begin with #, are ignored, but otherwise the column-reading template is
applied, and if anything other than a valid numerical value is found an
error is flagged. If the data are read successfully, the variables will be
named v1, v2, etc. It's up to the user to provide meaningful names and/or
descriptions using the commands "rename" and/or "setinfo".

By default, when you import a file that contains string-valued series, a
text box will open showing you the contents of string_table.txt, a file
which contains the mapping between strings and their numeric coding. You can
suppress this behavior via the --quiet option.

Loading selected series

Use of open with a data file argument (as opposed to the database case, see
below) generally implies loading all series from the specified file.
However, in the case of native gretl files (gdt and gdtb) only, it is
possible to specify by name a subset of series to load. This is done via the
--select option, which requires an accompanying argument in one of three
forms: the name of a single series; a list of names, separated by spaces and
enclosed in double quotes; or the name of an array of strings. Examples:

  # single series
  open somefile.gdt --select=x1
  # more than one series
  open somefile.gdt --select="x1 x5 x27"
  # alternative method
  strings Sel = defarray("x1", "x5", "x27")
  open somefile.gdt --select=Sel

Opening a database

As mentioned above, the open command can be used to open a database file for
subsequent reading via the "data" command. Supported file-types are native
gretl databases, RATS 4.0 and PcGive.

Besides reading a file of one of these types on the local machine, three
further cases are supported.
First, if the --www option is given, gretl will
try to access a native gretl database of the given name on the gretl server
-- for instance the Federal Reserve interest rates database fedbog.bin in
the third example shown above. Second, the command "open dbnomics" can be
used to set DB.NOMICS as the source for database reads; on this see dbnomics
for gretl. Third, if the --odbc option is given gretl will try to access an
ODBC database. This option is explained at length in chapter 42 of the Gretl
User's Guide.

Menu path:    /File/Open data
Other access: Drag a data file onto gretl's main window

# orthdev Transformations

Argument:   varlist

Applicable with panel data only. A series of forward orthogonal deviations
is obtained for each variable in varlist and stored in a new variable with
the prefix o_. Thus "orthdev x y" creates the new variables o_x and o_y.

The values are stored one step ahead of their true temporal location (that
is, o_x at observation t holds the deviation that, strictly speaking,
belongs at t - 1). This is for compatibility with first differences: one
loses the first observation in each time series, not the last.

# outfile Printing

Variants:   outfile filename
            outfile --buffer=strvar
            outfile --tempfile=strvar
Options:    --append (append to file, first variant only)
            --quiet (see below)
            --buffer (see below)
            --tempfile (see below)

The outfile command starts a block in which any printed output is diverted
to a file or buffer (or just discarded, if you wish). Such a block is
terminated by the command "end outfile", after which output reverts to the
default stream.

Diversion to a named file

The first variant shown above sends output to a file named by the filename
argument. By default a new file is created (or an existing one is
overwritten).
The output file will be written in the currently set
"workdir", unless the filename string contains a full path specification to
the contrary. If you wish to append output to an existing file instead, use
the --append flag.

Some special variations on this theme are available. If you give the keyword
null in place of a real filename the effect is to suppress all printed
output until redirection is ended. If either of the keywords stdout or
stderr is given in place of a regular filename the effect is to redirect
output to standard output or standard error output respectively.

A simple example follows, where the output from a particular regression is
written to a named file.

  open data4-10
  outfile regress.txt
    ols ENROLL 0 CATHOL INCOME COLLEGE
  end outfile

Diversion to a string buffer

The --buffer option is used to store output in a string variable. The
required parameter for this option must be the name of an existing string
variable, whose content will be over-written. We show below the example
given above, revised to write to a string. In this case printing model_out
will display the redirected output.

  open data4-10
  string model_out = ""
  outfile --buffer=model_out
    ols ENROLL 0 CATHOL INCOME COLLEGE
  end outfile
  print model_out

Diversion to a temporary file

The --tempfile option is used to direct output to a temporary file, with an
automatically constructed name that is guaranteed to be unique, in the
user's "dot" directory. As in the redirection to buffer case, the option
parameter should be the name of a string variable: in this case its content
is over-written with the name of the temporary file. Please note: files
written to the dot directory are cleaned up on exit from the program, so
don't use this form if you want the output to be preserved after your gretl
session.

We repeat the simple example from above, with a couple of extra lines to
illustrate the points that strvar tells you where the output went, and that
you can retrieve it using the "readfile" function.

  open data4-10
  string mytemp
  outfile --tempfile=mytemp
    ols ENROLL 0 CATHOL INCOME COLLEGE
  end outfile
  printf "Output went to %s\n", mytemp
  printf "The output was:\n%s\n", readfile(mytemp)

Quietness

The effect of the --quiet option is to turn off the echoing of commands and
the printing of auxiliary messages while output is redirected. It is
equivalent to doing

  set echo off
  set messages off

except that when redirection is ended the original values of the echo and
messages variables are restored. This option is available in all cases.

Levels of redirection

In general only one file can be opened in this way at any given time, so
calls to this command cannot be nested. However, use of this command is
permitted inside user-defined functions (provided the output file is also
closed from inside the same function) such that output can be temporarily
diverted and then given back to an original output file, in case outfile is
currently in use by the caller.
For example, the code

  function void f (string s)
      outfile inner.txt
        print s
      end outfile
  end function

  outfile outer.txt --quiet
      print "Outside"
      f("Inside")
      print "Outside again"
  end outfile

will produce a file called "outer.txt" containing the two lines

  Outside
  Outside again

and a file called "inner.txt" containing the line

  Inside

# panel Estimation

Arguments:  depvar indepvars
Options:    --vcv (print covariance matrix)
            --fixed-effects (estimate with group fixed effects)
            --random-effects (random effects or GLS model)
            --nerlove (use the Nerlove transformation)
            --pooled (estimate via pooled OLS)
            --between (estimate the between-groups model)
            --robust (robust standard errors; see below)
            --time-dummies (include time dummy variables)
            --unit-weights (weighted least squares)
            --iterate (iterative estimation)
            --matrix-diff (compute Hausman test via matrix difference)
            --unbalanced=method (random effects only, see below)
            --quiet (less verbose output)
            --verbose (more verbose output)
Examples:   penngrow.inp

Estimates a panel model. By default the fixed effects estimator is used;
this is implemented by subtracting the group or unit means from the original
data.

If the --random-effects flag is given, random effects estimates are
computed, by default using the method of Swamy and Arora (1972). In this
case (only) the option --matrix-diff forces use of the matrix-difference
method (as opposed to the regression method) for carrying out the Hausman
test for the consistency of the random effects estimator. Also specific to
the random effects estimator is the --nerlove flag, which selects the method
of Nerlove (1971) as opposed to Swamy and Arora.

Alternatively, if the --unit-weights flag is given, the model is estimated
via weighted least squares, with the weights based on the residual variance
for the respective cross-sectional units in the sample. In this case (only)
the --iterate flag may be added to produce iterative estimates: if the
iteration converges, the resulting estimates are Maximum Likelihood.

As a further alternative, if the --between flag is given, the between-groups
model is estimated (that is, an OLS regression using the group means).

The default means of calculating robust standard errors in panel-data models
is the Arellano HAC estimator, but Beck-Katz "Panel Corrected Standard
Errors" can be selected via the command "set pcse on". When the --robust
option is specified the joint F test on the fixed effects is performed using
the robust method of Welch (1951).

The --unbalanced option is available only for random effects models: it can
be used to choose an ANOVA method for use with an unbalanced panel. By
default gretl uses the Swamy-Arora method as for balanced panels, except
that the harmonic mean of the individual time-series lengths is used in
place of a common T. Under this option you can specify either bc, to use the
method of Baltagi and Chang (1994), or stata, to emulate the sa option to
the xtreg command in Stata.

For more details on panel estimation, please see chapter 23 of the Gretl
User's Guide.
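
The following sketch runs the main estimator variants on simulated data.
The panel dimensions, the data-generating process and all variable names
are assumptions made for the example. The regressor is constructed to be
correlated with the unit-specific effect, which is the situation in which
fixed effects would be preferred to random effects.

  # 20 units observed over 10 periods
  nulldata 200
  setobs 10 1:1 --stacked-time-series
  set seed 1234
  # unit-specific effect: constant within each unit
  series e = normal()
  series a = 3 * pmean(e)
  series x = normal() + a
  series y = 1 + 0.5*x + a + normal()
  # fixed effects (the default)
  panel y 0 x
  # random effects via Swamy-Arora; the Hausman test is also reported
  panel y 0 x --random-effects
  # between-groups model
  panel y 0 x --between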

Menu path:    /Model/Panel

# panplot Graphs

Argument:   plotvar
Options:    --means (time series, group means)
            --overlay (plot per group, overlaid, N <= 130)
            --sequence (plot per group, in sequence, N <= 130)
            --grid (plot per group, in grid, N <= 16)
            --stack (plot per group, stacked, N <= 6)
            --boxplots (boxplot per group, in sequence, N <= 150)
            --boxplot (single boxplot, all groups)
            --output=filename (send output to specified file)
Examples:   panplot x --overlay
            panplot x --means --output=display

Graphing command specific to panel data: the series plotvar is plotted in a
mode specified by one or other of the options.

Apart from the --means and --boxplot options the plot explicitly represents
variation in both the time-series and cross-sectional dimensions. Such plots
are limited in respect of the number of groups (also known as individuals or
units) in the current sample range of the panel. For example, the --overlay
option, which shows a time series for each group in a single plot, is
available only when the number of groups, N, is 130 or less. (Otherwise the
graphic becomes too dense to be informative.) If a panel is too large to
permit the desired plot specification one can select a reduced range of
groups or units temporarily, as in

  smpl 1 100 --unit
  panplot x --overlay
  smpl full

The --output=filename option can be used to control the form and destination
of the output; see the "gnuplot" command for details.

Other access: Main window pop-up menu (single selection)

# panspec Tests

Options:    --nerlove (use Nerlove method for random effects)
            --matrix-diff (use matrix-difference method for Hausman test)
            --quiet (suppress printed output)

This command is available only after estimating a panel-data model via OLS.
It tests the simple pooled specification against the most common
alternatives, fixed effects and random effects.

The fixed effects specification allows the intercept of the regression to
vary across the cross-sectional units. A Wald F-test is reported for the
null hypothesis that the intercepts do not differ. The random effects
specification decomposes the residual variance into two parts, one part
specific to the cross-sectional unit and the other specific to the
particular observation. (This estimator can be computed only if the number
of cross-sectional units in the data set exceeds the number of parameters to
be estimated.) The Breusch-Pagan LM statistic tests the null hypothesis that
pooled OLS is adequate against the random effects alternative.

Pooled OLS may be rejected against both of the alternatives. Provided the
unit- or group-specific error is uncorrelated with the independent
variables, the random effects estimator is more efficient than fixed
effects; otherwise the random effects estimator is inconsistent and fixed
effects are to be preferred. The null hypothesis for the Hausman test is
that the group-specific error is not so correlated (and therefore the random
effects estimator is preferable). A low p-value for this test counts against
random effects and in favor of fixed effects.

The first two options for this command pertain to random effects estimation.
By default the method of Swamy and Arora is used, and the Hausman test
statistic is calculated using the regression method. The options enable the
use of Nerlove's alternative variance estimator, and/or the
matrix-difference approach to the Hausman statistic.

On successful completion the accessors "$test" and "$pvalue" retrieve
3-vectors holding test statistics and p-values for the three tests noted
above: poolability (Wald), poolability (Breusch-Pagan), and Hausman.
If you just want the results in this form you can give the --quiet option to
skip printed output.

Note that after estimating the random effects specification via the "panel"
command, the Hausman test is automatically carried out and the results can
be retrieved via the "$hausman" accessor.

Menu path:    Model window, /Tests/Panel specification

# pca Statistics

Argument:   varlist
Options:    --covariance (use the covariance matrix)
            --save[=n] (save major components)
            --save-all (save all components)
            --quiet (don't print results)

Principal Components Analysis. Unless the --quiet option is given, prints
the eigenvalues of the correlation matrix (or the covariance matrix if the
--covariance option is given) for the variables in varlist, along with the
proportion of the joint variance accounted for by each component. Also
prints the corresponding eigenvectors or "component loadings".

If you give the --save-all option then all components are saved to the
dataset as series, with names PC1, PC2 and so on. These artificial variables
are formed as the sum of (component loading) times (standardized X_i), where
X_i denotes the ith variable in varlist.

If you give the --save option without a parameter value, components with
eigenvalues greater than the mean (which means greater than 1.0 if the
analysis is based on the correlation matrix) are saved to the dataset as
described above. If you provide a value for n with this option then the most
important n components are saved.

See also the "princomp" function.
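
The saving options can be sketched as follows. The series names x1 to x4 and
the list name xlist are hypothetical; a dataset containing such series is
assumed to be open.

   list xlist = x1 x2 x3 x4
   # save the two most important components, without printing
   pca xlist --save=2 --quiet
   # base the analysis on the covariance matrix and save all components
   pca xlist --covariance --save-all

After the first call the dataset should contain the series PC1 and PC2, per
the naming rule given above.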

Menu path:    /View/Principal components

# pergm Statistics

Arguments:  series [ bandwidth ]
Options:    --bartlett (use Bartlett lag window)
            --log (use log scale)
            --radians (show frequency in radians)
            --degrees (show frequency in degrees)
            --plot=mode-or-filename (see below)

Computes and displays the spectrum of the specified series. By default the
sample periodogram is given, but optionally a Bartlett lag window is used in
estimating the spectrum (see, for example, Greene's Econometric Analysis for
a discussion of this). The default width of the Bartlett window is twice the
square root of the sample size but this can be set manually using the
bandwidth parameter, up to a maximum of half the sample size.

If the --log option is given the spectrum is represented on a logarithmic
scale.

The (mutually exclusive) options --radians and --degrees influence the
appearance of the frequency axis when the periodogram is graphed. By default
the frequency is scaled by the number of periods in the sample, but these
options cause the axis to be labeled from 0 to pi radians or from 0 to
180 degrees, respectively.

By default, if the program is not in batch mode a plot of the periodogram is
shown. This can be adjusted via the --plot option. The acceptable parameters
to this option are none (to suppress the plot); display (to display a plot
even when in batch mode); or a file name. The effect of providing a file
name is as described for the --output option of the "gnuplot" command.
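
A minimal sketch of these options in use, assuming a time-series dataset is
open and contains a series named y (a hypothetical name):

   # sample periodogram, printed only
   pergm y --plot=none
   # Bartlett window of width 8, log scale, frequency axis in radians,
   # with the plot shown even in batch mode
   pergm y 8 --bartlett --log --radians --plot=display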

Menu path:    /Variable/Periodogram
Other access: Main window pop-up menu (single selection)

# pkg Utilities

Arguments:  action pkgname
Options:    --local (install from local file)
            --quiet (see below)
            --verbose (see below)
Examples:   pkg install armax
            pkg install /path/to/myfile.gfn --local
            pkg query ghosts
            pkg unload armax

This command provides a means of installing, unloading, querying or deleting
gretl function packages. The action argument must be one of install, query,
unload, remove or index.

install: In the most basic form, with no option flag and the pkgname
argument given as the "plain" name of a gretl function package (as in the
first example above), the effect is to download the specified package from
the gretl server (unless pkgname starts with http://) and install it on the
local machine. In this case it is not necessary to supply a filename
extension. If the --local option is given, however, pkgname should be the
path to an uninstalled package file on the local machine, with the correct
extension (.gfn or .zip). In this case the effect is to copy the file into
place (gfn), or unzip it into place (zip), "into place" meaning where the
"include" command will find it.

query: The default effect is to print basic information about the specified
package (author, version, etc.). But if the --quiet option is appended
nothing is printed; the package information is instead stored in the form of
a gretl bundle, which can be accessed via "$result". If no information can
be found this bundle will be empty.

unload: pkgname should be given in plain form, without path or suffix as in
the last example above. The effect is to unload the package in question from
gretl's memory, if it is currently loaded, and also to remove it from the
GUI menu to which it is attached, if any.

remove: performs the actions noted for unload and in addition deletes the
file(s) associated with the package from disk.

index: is a special case in which pkgname must be replaced by the keyword
"addons": the effect is to update the index of the standard packages known
as addons. Such updating is performed automatically from time to time but in
some cases a manual update may be useful. In this case the --verbose flag
produces a printout of where gretl has searched and what it has found. To be
clear, here's the way to get full indexing output:

   pkg index addons --verbose

Menu path:    /File/Function packages/On server

# plot Graphs

Argument:   [ data ]
Options:    --with-lines[=varspec] (use lines, not points)
            --with-lp[=varspec] (use lines and points)
            --with-impulses[=varspec] (use vertical lines)
            --with-steps[=varspec] (use horizontal and vertical line segments)
            --time-series (plot against time)
            --single-yaxis (force use of just one y-axis)
            --ylogscale[=base] (use log scale for vertical axis)
            --dummy (see below)
            --fit=fitspec (see below)
            --band=bandspec (see below)
            --band-style=style (see below)
            --output=filename (send output to specified file)
Examples:   nile.inp

The plot block provides an alternative to the "gnuplot" command which may be
more convenient when you are producing an elaborate plot (with several
options and/or gnuplot commands to be inserted into the plot file). In
addition to the following explanation, please also refer to chapter 6 of the
Gretl User's Guide for some further examples.

A plot block starts with the command-word plot. This is commonly followed by
a data argument, which specifies data to be plotted: this should be the name
of a list, a matrix, or a single series.
If no input data are specified the block must contain at least one directive
to plot a formula instead; such directives may be given via literal or
printf lines (see below).

If a list or matrix is given, the last element (list) or column (matrix) is
assumed to be the x-axis variable and the other(s) the y-axis variable(s),
unless the --time-series option is given in which case all the specified
data go on the y axis.

The option of supplying a single series name is restricted to time-series
data, in which case it is assumed that a time-series plot is wanted;
otherwise an error is flagged.

The starting line may be prefixed with the "savename <-" apparatus to save a
plot as an icon in the GUI program. The block ends with end plot.

Inside the block you have zero or more lines of these types, identified by
an initial keyword:

  option: specify a single option.

  options: specify multiple options on a single line, separated by spaces.

  literal: a command to be passed to gnuplot literally.

  printf: a printf statement whose result will be passed to gnuplot
  literally.

Note that when you specify an option using the option or options keywords,
it is not necessary to supply the customary double-dash before the option
specifier. For details on the effects of the various options please see
"gnuplot" (but see below for some specifics on using the --band option in
the plot context).

The intended use of the plot block is best illustrated by example:

   string title = "My title"
   string xname = "My x-variable"
   plot plotmat
       options with-lines fit=none
       literal set linetype 3 lc rgb "#0000ff"
       literal set nokey
       printf "set title \"%s\"", title
       printf "set xlabel \"%s\"", xname
   end plot --output=display

This example assumes that plotmat is the name of a matrix with at least 2
columns (or a list with at least two members). Note that it is considered
good practice to place the --output option (only) on the last line of the
block; other options should be placed within the block.

Plotting a band with matrix data

The --band and --band-style options mostly work as described in the help for
"gnuplot", with the following exception: when the data to be plotted are
given in the form of a matrix, the first parameter to --band must be given
as the name of a matrix with two columns (holding, respectively, the center
and the width of the band). This parameter takes the place of the two values
(series names or ID numbers, or matrix columns) required by the gnuplot
version of this option. An illustration follows:

   scalar n = 100
   matrix x = seq(1,n)'
   matrix y = x + filter(mnormal(n,1), 1, {1.8, -0.9})
   matrix B = y ~ muniform(n,1)
   plot y
       options time-series with-lines
       options band=B,10 band-style=fill
   end plot --output=display

Plotting without data

The following example shows a simple case of specifying a plot without a
data source.

   plot
       literal set title 'CRRA utility'
       literal set xlabel 'c'
       literal set ylabel 'u(c)'
       literal set xrange[1:3]
       literal set key top left
       literal crra(x,s) = (x**(1-s) - 1)/(1-s)
       printf "plot crra(x, 0) t 'sigma=0', \\"
       printf " log(x) t 'sigma=1', \\"
       printf " crra(x,3) t 'sigma=3'"
   end plot --output=display

# poisson Estimation

Arguments:  depvar indepvars [ ; offset ]
Options:    --robust (robust standard errors)
            --cluster=clustvar (see "logit" for explanation)
            --vcv (print covariance matrix)
            --verbose (print details of iterations)
            --quiet (don't print results)
Examples:   poisson y 0 x1 x2
            poisson y 0 x1 x2 ; S
            See also camtriv.inp, greene19_3.inp

Estimates a Poisson regression. The dependent variable is taken to represent
the occurrence of events of some sort, and must take on only non-negative
integer values.

If a discrete random variable Y follows the Poisson distribution, then

  Pr(Y = y) = exp(-v) * v^y / y!

for y = 0, 1, 2,.... The mean and variance of the distribution are both
equal to v. In the Poisson regression model, the parameter v is represented
as a function of one or more independent variables. The most common version
(and the only one supported by gretl) has

  v = exp(b0 + b1*x1 + b2*x2 + ...)

or in other words the log of v is a linear function of the independent
variables.

Optionally, you may add an "offset" variable to the specification. This is a
scale variable, the log of which is added to the linear regression function
(implicitly, with a coefficient of 1.0). This makes sense if you expect the
number of occurrences of the event in question to be proportional, other
things equal, to some known factor.
For example, the number of traffic accidents might be supposed to be
proportional to traffic volume, other things equal, and in that case traffic
volume could be specified as an "offset" in a Poisson model of the accident
rate. The offset variable must be strictly positive.

By default, standard errors are computed using the negative inverse of the
Hessian. If the --robust flag is given, then QML or Huber-White standard
errors are calculated instead. In this case the estimated covariance matrix
is a "sandwich" of the inverse of the estimated Hessian and the outer
product of the gradient.

See also "negbin".

Menu path:    /Model/Limited dependent variable/Count data

# print Printing

Variants:   print varlist
            print
            print object-names
            print string-literal
Options:    --byobs (by observations)
            --no-dates (use simple observation numbers)
            --range=start:stop (see below)
            --midas (see below)
            --tree (specific to bundles; see below)
Examples:   print x1 x2 --byobs
            print my_matrix
            print "This is a string"
            print my_array --range=3:6
            print hflist --midas

Please note that print is a rather "basic" command (primarily intended for
printing the values of series); see "printf" and "eval" for more advanced,
and less restrictive, alternatives.

In the first variant shown above (also see the first example), varlist
should be a list of series (either a named list or a list specified via the
names or ID numbers of series, separated by spaces). In that case this
command prints the values of the listed series. By default the data are
printed "by variable", but if the --byobs flag is added they are printed by
observation. When printing by observation, the default is to show the date
(with time-series data) or the observation marker string (if any) at the
start of each line.
The --no-dates option suppresses the printing of dates or markers; a simple
observation number is shown instead. See the final paragraph of this entry
for the effect of the --midas option (which applies only to a named list of
series).

If no argument is given (the second variant shown above) then the action is
similar to the first case except that all series in the current dataset are
printed. The supported options are as described above.

The third variant (with the object-names argument; see the second example)
expects a space-separated list of names of primary gretl objects other than
series (scalars, matrices, strings, bundles, arrays). The value(s) of these
objects are displayed. In the case of bundles, their members are sorted by
type and alphabetically.

In the fourth form (third example), string-literal should be a string
enclosed in double-quotes (and there should be nothing else following on the
command line). The string in question is printed, followed by a newline
character.

The --range option can be used to control the amount of information printed.
The start and stop (integer) values refer to observations for series and
lists, rows for matrices, elements for arrays, and lines of text for
strings. In all cases the minimum start value is 1 and the maximum stop
value is the "row-wise size" of the object in question. Negative values for
these indices are taken to indicate a count back from the end. The indices
may be given in numeric form or as the names of predefined scalar variables.
If start is omitted that is taken as an implicit 1 and if stop is omitted
that means go all the way to the end. Note that with series and lists the
indices are relative to the current sample range.
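
The forms of the --range parameter described above might be exercised as in
the following sketch, which uses a matrix so that no dataset is required.
The names m and r1 are hypothetical.

   matrix m = mnormal(10, 3)
   # rows 3 to 6
   print m --range=3:6
   # start omitted: from row 1 to row 4
   print m --range=:4
   # stop omitted: from row 8 to the last row
   print m --range=8:
   # start given as a count back from the end
   print m --range=-3:
   # start given as the name of a predefined scalar
   scalar r1 = 2
   print m --range=r1:5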

The --tree option is specific to the printing of a gretl bundle: the effect
is that if the specified bundle contains further bundles, or arrays of
bundles, their contents are listed. Otherwise only the top-level members of
the bundle are listed.

The --midas option is specific to the printing of a list of series, and
moreover it is specific to datasets that contain one or more high-frequency
series, each represented by a "MIDAS list". If one such list is given as
argument and this option is appended, the series is printed by observation
at its "native" frequency.

Menu path:    /Data/Display values

# printf Printing

Arguments:  format , args

Prints scalar values, series, matrices, or strings under the control of a
format string (providing a subset of the printf function in the C
programming language). Recognized numeric formats are %e, %E, %f, %g, %G, %d
and %x, in each case with the various modifiers available in C. Examples:
the format %.10g prints a value to 10 significant figures; %12.6f prints a
value to 6 decimal places, with a width of 12 characters. Note, however,
that in gretl the format %g is a good default choice for all numerical
values; you don't need to get too complicated. The format %s should be used
for strings.

The format string itself must be enclosed in double quotes. The values to be
printed must follow the format string, separated by commas. These values
should take the form of either (a) the names of variables, (b) expressions
that yield some sort of printable result, or (c) the special functions
varname() or date().
The following example prints the values of two variables plus that of a
calculated expression:

   ols 1 0 2 3
   scalar b = $coeff[2]
   scalar se_b = $stderr[2]
   printf "b = %.8g, standard error %.8g, t = %.4f\n",
     b, se_b, b/se_b

The next lines illustrate the use of the varname and date functions, which
respectively print the name of a variable, given its ID number, and a date
string, given a 1-based observation number.

   printf "The name of variable %d is %s\n", i, varname(i)
   printf "The date of observation %d is %s\n", j, date(j)

If a matrix argument is given in association with a numeric format, the
entire matrix is printed using the specified format for each element. The
same applies to series, except that the range of values printed is governed
by the current sample setting.

The maximum length of a format string is 127 characters. The escape
sequences \n (newline), \t (tab), \v (vertical tab) and \\ (literal
backslash) are recognized. To print a literal percent sign, use %%.

As in C, numerical values that form part of the format (width and/or
precision) may be given directly as numbers, as in %10.4f, or they may be
given as variables. In the latter case, one puts asterisks into the format
string and supplies corresponding arguments in order.
For example,

   scalar width = 12
   scalar precision = 6
   printf "x = %*.*f\n", width, precision, x

# probit Estimation

Arguments:  depvar indepvars
Options:    --robust (robust standard errors)
            --cluster=clustvar (see "logit" for explanation)
            --vcv (print covariance matrix)
            --verbose (print details of iterations)
            --quiet (don't print results)
            --p-values (show p-values instead of slopes)
            --estrella (select pseudo-R-squared variant)
            --random-effects (estimates a random effects panel probit model)
            --quadpoints=k (number of quadrature points for RE estimation)
Examples:   ooballot.inp, oprobit.inp, reprobit.inp

If the dependent variable is a binary variable (all values are 0 or 1)
maximum likelihood estimates of the coefficients on indepvars are obtained
via the Newton-Raphson method. As the model is nonlinear the slopes depend
on the values of the independent variables. By default the slopes with
respect to each of the independent variables are calculated (at the means of
those variables) and these slopes replace the usual p-values in the
regression output. This behavior can be suppressed by giving the --p-values
option. The chi-square statistic tests the null hypothesis that all
coefficients are zero apart from the constant.

By default, standard errors are computed using the negative inverse of the
Hessian. If the --robust flag is given, then QML or Huber-White standard
errors are calculated instead. In this case the estimated covariance matrix
is a "sandwich" of the inverse of the estimated Hessian and the outer
product of the gradient. See chapter 10 of Davidson and MacKinnon for
details.

By default the pseudo-R-squared statistic suggested by McFadden (1974) is
shown, but in the binary case if the --estrella option is given, the variant
recommended by Estrella (1998) is shown instead.
This variant arguably mimics more closely the properties of the regular R^2
in the context of least-squares estimation.

If the dependent variable is not binary but is discrete, then Ordered Probit
estimates are obtained. (If the variable selected as dependent is not
discrete, an error is flagged.)

Probit for panel data

With the --random-effects option, the error term is assumed to be composed
of two normally distributed components: one time-invariant term that is
specific to the cross-sectional unit or "individual" (and is known as the
individual effect); and one term that is specific to the particular
observation.

Evaluation of the likelihood for this model involves the use of
Gauss-Hermite quadrature for approximating the value of expectations of
functions of normal variates. The number of quadrature points used can be
chosen through the --quadpoints option (the default is 32). Using more
points will increase the accuracy of the results, but at the cost of longer
compute time; with many quadrature points and a large dataset estimation may
be quite time consuming.

Besides the usual parameter estimates (and associated statistics) relating
to the included regressors, certain additional information is presented on
estimation of this sort of model:

  lnsigma2: the maximum likelihood estimate of the log of the variance of
  the individual effect;

  sigma_u: the estimated standard deviation of the individual effect; and

  rho: the estimated share of the individual effect in the composite error
  variance (also known as the intra-class correlation).

The Likelihood Ratio test of the null hypothesis that rho equals zero
provides a means of assessing whether the random effects specification is
needed. If the null is not rejected that suggests that a simple pooled
probit specification is adequate.
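
A minimal sketch of the panel variant, assuming that a panel dataset is open
and that it contains a binary series y and regressors x1 and x2 (all
hypothetical names):

   # pooled probit for comparison, showing p-values rather than slopes
   probit y 0 x1 x2 --p-values
   # random effects probit with more quadrature points than the default
   probit y 0 x1 x2 --random-effects --quadpoints=48

Raising the number of quadrature points and checking that the estimates
barely change is one informal way to judge whether the default of 32 is
adequate for a given dataset.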

Menu path:    /Model/Limited dependent variable/Probit

# pvalue Statistics

Arguments:  dist [ params ] xval
Examples:   pvalue z zscore
            pvalue t 25 3.0
            pvalue X 3 5.6
            pvalue F 4 58 fval
            pvalue G shape scale x
            pvalue B bprob 10 6
            pvalue P lambda x
            pvalue W shape scale x
            See also mrw.inp, restrict.inp

Computes the area to the right of xval in the specified distribution (z for
Gaussian, t for Student's t, X for chi-square, F for F, G for gamma, B for
binomial, P for Poisson, exp for Exponential, W for Weibull).

Depending on the distribution, the following information must be given,
before the xval: for the t and chi-square distributions, the degrees of
freedom; for F, the numerator and denominator degrees of freedom; for gamma,
the shape and scale parameters; for the binomial distribution, the "success"
probability and the number of trials; for the Poisson distribution, the
parameter lambda (which is both the mean and the variance); for the
Exponential, a scale parameter; and for the Weibull, shape and scale
parameters. As shown in the examples above, the numerical parameters may be
given in numeric form or as the names of variables.

The parameters for the gamma distribution are sometimes given as mean and
variance rather than shape and scale. The mean is the product of the shape
and the scale; the variance is the product of the shape and the square of
the scale. So the scale may be found as the variance divided by the mean,
and the shape as the mean divided by the scale.
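
As a worked case of this conversion, suppose a gamma distribution is
described by a mean of 6 and a variance of 12 (illustrative values only).
Then the scale is 12/6 = 2 and the shape is 6/2 = 3; as a check, 3 times 2
gives the mean and 3 times 2 squared gives the variance. In script form,
with hypothetical variable names:

   scalar gmean = 6
   scalar gvar = 12
   scalar scale = gvar / gmean
   scalar shape = gmean / scale
   # area to the right of 10 in the gamma(3, 2) distribution
   pvalue G shape scale 10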

Menu path:    /Tools/P-value finder

# qlrtest Tests

Options:    --limit-to=list (limit test to subset of regressors)
            --plot=mode-or-filename (see below)
            --quiet (suppress printed output)

For a model estimated on time-series data via OLS, performs the Quandt
likelihood ratio (QLR) test for a structural break at an unknown point in
time, with 15 percent trimming at the beginning and end of the sample
period.

For each potential break point within the central 70 percent of the
observations, a Chow test is performed. See "chow" for details; as with the
regular Chow test, this is a robust Wald test if the original model was
estimated with the --robust option, an F-test otherwise. The QLR statistic
is then the maximum of the individual test statistics.

An asymptotic p-value is obtained using the method of Bruce Hansen (1997).

Besides the standard hypothesis test accessors "$test" and "$pvalue",
"$qlrbreak" can be used to retrieve the index of the observation at which
the test statistic is maximized.

The --limit-to option can be used to limit the set of interactions with the
split dummy variable in the Chow tests to a subset of the original
regressors. The parameter for this option must be a named list, all of whose
members are among the original regressors. The list should not include the
constant.

When this command is run interactively (only), a plot of the Chow test
statistic is displayed by default. This can be adjusted via the --plot
option. The acceptable parameters to this option are none (to suppress the
plot); display (to display a plot even when not in interactive mode); or a
file name. The effect of providing a file name is as described for the
--output option of the "gnuplot" command.
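
The pieces described above might be combined as in this sketch, which
assumes a time-series dataset is open and contains series y, x1 and x2
(hypothetical names, as is the list name L):

   ols y 0 x1 x2 --quiet
   # interact only x1 with the split dummy; no plot
   list L = x1
   qlrtest --limit-to=L --plot=none
   printf "QLR = %g, p-value = %g\n", $test, $pvalue
   printf "statistic maximized at observation %d\n", $qlrbreak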

Menu path:    Model window, /Tests/QLR test

# qqplot Graphs

Variants:   qqplot y
            qqplot y x
Options:    --z-scores (see below)
            --raw (see below)
            --output=filename (send plot to specified file)

Given just one series argument, displays a plot of the empirical quantiles
of the selected series (given by name or ID number) against the quantiles of
the normal distribution. The series must include at least 20 valid
observations in the current sample range. By default the empirical quantiles
are plotted against quantiles of the normal distribution having the same
mean and variance as the sample data, but two alternatives are available: if
the --z-scores option is given the data are standardized, while if the --raw
option is given the "raw" empirical quantiles are plotted against the
quantiles of the standard normal distribution.

The option --output has the effect of sending the output to the specified
file; use "display" to force output to the screen. See the "gnuplot" command
for more detail on this option.

Given two series arguments, y and x, displays a plot of the empirical
quantiles of y against those of x. The data values are not standardized.

Menu path:    /Variable/Normal Q-Q plot
Menu path:    /View/Graph specified vars/Q-Q plot

# quantreg Estimation

Arguments:  tau depvar indepvars
Options:    --robust (robust standard errors)
            --intervals[=level] (compute confidence intervals)
            --vcv (print covariance matrix)
            --quiet (suppress printing of results)
Examples:   quantreg 0.25 y 0 xlist
            quantreg 0.5 y 0 xlist --intervals
            quantreg 0.5 y 0 xlist --intervals=.95
            quantreg tauvec y 0 xlist --robust
            See also mrw_qr.inp

Quantile regression. The first argument, tau, is the conditional quantile
for which estimates are wanted.
It may be given either as a numerical value or as the name of a pre-defined
scalar variable; the value must be in the range 0.01 to 0.99.
(Alternatively, a vector of values may be given for tau; see below for
details.) The second and subsequent arguments compose a regression list on
the same pattern as "ols".

Without the --intervals option, standard errors are printed for the quantile
estimates. By default, these are computed according to the asymptotic
formula given by Koenker and Bassett (1978), but if the --robust option is
given, standard errors that are robust with respect to heteroskedasticity
are calculated using the method of Koenker and Zhao (1994).

When the --intervals option is chosen, confidence intervals are given for
the parameter estimates instead of standard errors. These intervals are
computed using the rank inversion method, and in general they are
asymmetrical about the point estimates. The specifics of the calculation are
inflected by the --robust option: without this, the intervals are computed
on the assumption of IID errors (Koenker, 1994); with it, they use the
robust estimator developed by Koenker and Machado (1999).

By default, 90 percent confidence intervals are produced. You can change
this by appending a confidence level (expressed as a decimal fraction) to
the intervals option, as in --intervals=0.95.

Vector-valued tau: instead of supplying a scalar, you may give the name of a
pre-defined matrix. In this case estimates are computed for all the given
tau values and the results are printed in a special format, showing the
sequence of quantile estimates for each regressor in turn.

Menu path:    /Model/Robust estimation/Quantile regression

# quit Utilities

Exits from gretl's current modality.

  When called from a script, execution of the script is terminated.
    If the context is gretlcli in batch mode, gretlcli itself exits,
    otherwise the program reverts to interactive mode.

    When called from the GUI console, the console window is closed.

    When called from gretlcli in interactive mode the program exits.

Note that this command cannot be called within functions or loops.

In no case does the quit command cause the gretl GUI program to exit. That
is done via the Quit item under the File menu, or Ctrl+Q, or by clicking the
close control on the title-bar of the main gretl window.

# rename Dataset

Arguments:  series newname
Option:     --quiet (suppress printed output)

Changes the name of series (identified by name or ID number) to newname. The
new name may be at most 31 characters long, must start with a letter, and
must be composed of only letters, digits, and the underscore character. In
addition, it must not be the name of an existing object of any kind.

Menu path:    /Variable/Edit attributes
Other access: Main window pop-up menu (single selection)

# reset Tests

Options:    --quiet (don't print the auxiliary regression)
            --silent (don't print anything)
            --squares-only (compute the test using only the squares)
            --cubes-only (compute the test using only the cubes)

Must follow the estimation of a model via OLS. Carries out Ramsey's RESET
test for model specification (nonlinearity) by adding the squares and/or the
cubes of the fitted values to the regression and calculating the F statistic
for the null hypothesis that the coefficients on the added terms are zero.

Both the squares and the cubes are added unless one of the options
--squares-only or --cubes-only is given.

The --silent option may be used if one plans to make use of the "$test"
and/or "$pvalue" accessors to grab the results of the test.
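
As an illustration, the following sketch runs the test silently after an OLS
regression and then reads the stored results. The series names y, x1 and x2
are hypothetical placeholders for series in an open dataset.

          # y, x1 and x2 are assumed to exist in the current dataset
          ols y 0 x1 x2
          reset --squares-only --silent
          printf "RESET F = %g, p-value = %g\n", $test, $pvalue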

Menu path:    Model window, /Tests/Ramsey's RESET

# restrict Tests

Options:    --quiet (don't print restricted estimates)
            --silent (don't print anything)
            --wald (system estimators only - see below)
            --bootstrap (bootstrap the test if possible)
            --full (OLS and VECMs only, see below)
Examples:   hamilton.inp, restrict.inp

Imposes a set of (usually linear) restrictions on either (a) the model last
estimated or (b) a system of equations previously defined and named. In all
cases the set of restrictions should be started with the keyword "restrict"
and terminated with "end restrict".

In the single equation case the restrictions are always implicitly to be
applied to the last model, and they are evaluated as soon as the restrict
block is closed.

In the case of a system of equations (defined via the "system" command), the
initial "restrict" may be followed by the name of a previously defined
system of equations. If this is omitted and the last model was a system then
the restrictions are applied to the last model. By default the restrictions
are evaluated when the system is next estimated, using the "estimate"
command. But if the --wald option is given the restriction is tested right
away, via a Wald chi-square test on the covariance matrix. Note that this
option will produce an error if a system has been defined but not yet
estimated.

Depending on the context, the restrictions to be tested may be expressed in
various ways. The simplest form is as follows: each restriction is given as
an equation, with a linear combination of parameters on the left and a
scalar value to the right of the equals sign (either a numerical constant or
the name of a scalar variable).

In the single-equation case, parameters may be referenced in the form b[i],
where i represents the position in the list of regressors (starting at 1),
or b[varname], where varname is the name of the regressor in question. In
the system case, parameters are referenced using b plus two numbers in
square brackets. The leading number represents the position of the equation
within the system and the second number indicates position in the list of
regressors. For example b[2,1] denotes the first parameter in the second
equation, and b[3,2] the second parameter in the third equation. The b terms
in the equation representing a restriction may be prefixed with a numeric
multiplier, for example 3.5*b[4].

Here is an example of a set of restrictions for a previously estimated
model:

          restrict
            b[1] = 0
            b[2] - b[3] = 0
            b[4] + 2*b[5] = 1
          end restrict

And here is an example of a set of restrictions to be applied to a named
system. (If the name of the system does not contain spaces, the surrounding
quotes are not required.)

          restrict "System 1"
            b[1,1] = 0
            b[1,2] - b[2,2] = 0
            b[3,4] + 2*b[3,5] = 1
          end restrict

In the single-equation case the restrictions are by default evaluated via a
Wald test, using the covariance matrix of the model in question. If the
original model was estimated via OLS then the restricted coefficient
estimates are printed; to suppress this, append the --quiet option flag to
the initial restrict command. As an alternative to the Wald test, for models
estimated via OLS or WLS only, you can give the --bootstrap option to
perform a bootstrapped test of the restriction.

In the system case, the test statistic depends on the estimator chosen: a
Likelihood Ratio test if the system is estimated using a Maximum Likelihood
method, or an asymptotic F-test otherwise.
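
To illustrate the single-equation case with regressors referenced by name,
here is a minimal sketch that tests whether two slope coefficients sum to
one and then reports the result. The series names y, x1 and x2 are
hypothetical; the accessors "$test" and "$pvalue" hold the test statistic
and its p-value once the restrict block has completed.

          # y, x1 and x2 are assumed to exist in the current dataset
          ols y 0 x1 x2
          restrict --quiet
            b[x1] + b[x2] = 1
          end restrict
          printf "test = %g, p-value = %g\n", $test, $pvalue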

There are three alternatives to the method of expressing restrictions
described above. First, a set of g linear restrictions on a k-vector of
parameters, beta, may be written compactly as Rbeta - q = 0, where R is a g
x k matrix and q is a g-vector. You can specify a restriction by giving the
names of pre-defined, conformable matrices to be used as R and q, as in

          restrict
            R = Rmat
            q = qvec
          end restrict

Second, in a variant that may be useful when restrict is used within a
function, you can construct the set of restriction statements in the form of
an array of strings. You then use the inject keyword with the name of the
array. Here's a simple example:

          strings RS = array(2)
          RS[1] = "b[1,2] = 0"
          RS[2] = "b[2,1] = 0"
          restrict
            inject RS
          end restrict

In actual usage of this method one would likely use "sprintf" to construct
the strings, based on input to a function.

Lastly, if you wish to test a nonlinear restriction (this is currently
available for single-equation models only) you should give the restriction
as the name of a function, preceded by "rfunc = ", as in

          restrict
            rfunc = myfunction
          end restrict

The constraint function should take a single const matrix argument; this
will be automatically filled out with the parameter vector. And it should
return a vector which is zero under the null hypothesis, non-zero otherwise.
The length of the vector is the number of restrictions. This function is
used as a "callback" by gretl's numerical Jacobian routine, which calculates
a Wald test statistic via the delta method.

Here is a simple example of a function suitable for testing one nonlinear
restriction, namely that two pairs of parameter values have a common ratio.

          function matrix restr (const matrix b)
            matrix v = b[1]/b[2] - b[4]/b[5]
            return v
          end function

On successful completion of the restrict command the accessors "$test" and
"$pvalue" give the test statistic and its p-value.

When testing restrictions on a single-equation model estimated via OLS, or
on a VECM, the --full option can be used to set the restricted estimates as
the "last model" for the purposes of further testing or the use of accessors
such as $coeff and $vcv. Note that some special considerations apply in the
case of testing restrictions on Vector Error Correction Models. Please see
chapter 33 of the Gretl User's Guide for details.

Menu path:    Model window, /Tests/Linear restrictions

# rmplot Graphs

Argument:   series
Options:    --trim (see below)
            --quiet (suppress printed output)
            --output=filename (see below)

Range-mean plot: this command creates a simple graph to help in deciding
whether a time series, y(t), has constant variance or not. We take the full
sample t=1,...,T and divide it into small subsamples of arbitrary size k.
The first subsample is formed by y(1),...,y(k), the second is y(k+1), ...,
y(2k), and so on. For each subsample we calculate the sample mean and range
(= maximum minus minimum), and we construct a graph with the means on the
horizontal axis and the ranges on the vertical. So each subsample is
represented by a point in this plane. If the variance of the series is
constant we would expect the subsample range to be independent of the
subsample mean; if we see the points approximate an upward-sloping line this
suggests the variance of the series is increasing in its mean; and if the
points approximate a downward sloping line this suggests the variance is
decreasing in the mean.
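
The subsample calculation described above can be sketched by hand as
follows. This is only an illustration of the idea, not the code gretl runs:
the series name y is hypothetical, and the subsample size k is fixed
arbitrarily at 10 here, whereas rmplot selects its own value.

          # y is assumed to be a series in the current dataset
          matrix v = {y}                 # observations as a column vector
          scalar k = 10                  # subsample size (arbitrary choice)
          scalar m = floor(rows(v) / k)  # number of complete subsamples
          matrix RM = zeros(m, 2)
          loop i = 1..m
            matrix sub = v[(i-1)*k+1 : i*k]
            RM[i,1] = meanc(sub)               # subsample mean
            RM[i,2] = maxc(sub) - minc(sub)    # subsample range
          endloop
          print RM

Each row of RM then corresponds to one point in the range-mean plane, with
the mean in the first column and the range in the second.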

Besides the graph, gretl displays the means and ranges for each subsample,
along with the slope coefficient for an OLS regression of the range on the
mean and the p-value for the null hypothesis that this slope is zero. If the
slope coefficient is significant at the 10 percent significance level then
the fitted line from the regression of range on mean is shown on the graph.
The t-statistic for the null, and the corresponding p-value, are recorded
and may be retrieved using the accessors "$test" and "$pvalue" respectively.

If the --trim option is given, the minimum and maximum values in each
sub-sample are discarded before calculating the mean and range. This makes
it less likely that outliers will distort the analysis.

If the --quiet option is given, no graph is shown and no output is printed;
only the t-statistic and p-value are recorded. Otherwise the form of the
plot can be controlled via the --output option; this works as described in
connection with the "gnuplot" command.

Menu path:    /Variable/Range-mean graph

# run Programming

Argument:   filename

Executes the commands in filename then returns control to the interactive
prompt. This command is intended for use with the command-line program
gretlcli, or at the "gretl console" in the GUI program.

See also "include".

Menu path:    Run icon in script window

# runs Tests

Argument:   series
Options:    --difference (use first difference of variable)
            --equal (positive and negative values are equiprobable)

Carries out the nonparametric "runs" test for randomness of the specified
series, where runs are defined as sequences of consecutive positive or
negative values.
If you want to test for randomness of deviations from the
median, for a variable named x1 with a non-zero median, you can do the
following:

          series signx1 = x1 - median(x1)
          runs signx1

If the --difference option is given, the variable is differenced prior to
the analysis, hence the runs are interpreted as sequences of consecutive
increases or decreases in the value of the variable.

If the --equal option is given, the null hypothesis incorporates the
assumption that positive and negative values are equiprobable, otherwise the
test statistic is invariant with respect to the "fairness" of the process
generating the sequence, and the test focuses on independence alone.

Menu path:    /Tools/Nonparametric tests

# scatters Graphs

Arguments:  yvar ; xvars or yvars ; xvar
Options:    --with-lines (create line graphs)
            --matrix=name (plot columns of named matrix)
            --output=filename (send output to specified file)
Examples:   scatters 1 ; 2 3 4 5
            scatters 1 2 3 4 5 6 ; 7
            scatters y1 y2 y3 ; x --with-lines

Generates pairwise graphs of yvar against all the variables in xvars, or of
all the variables in yvars against xvar. The first example above puts
variable 1 on the y-axis and draws four graphs, the first having variable 2
on the x-axis, the second variable 3 on the x-axis, and so on. The second
example plots each of variables 1 through 6 against variable 7 on the
x-axis. Scanning a set of such plots can be a useful step in exploratory
data analysis. The maximum number of plots is 16; any extra variable in the
list will be ignored.

By default the graphs are scatterplots, but if you give the --with-lines
flag they will be line graphs.

For details on usage of the --output option, please see the "gnuplot"
command.

If a named matrix is specified as the data source the x and y lists should
be given as 1-based column numbers; or alternatively, if no such numbers are
given, all the columns are plotted against time or an index variable.

If the dataset is time-series, then the second sub-list can be omitted, in
which case it will implicitly be taken as "time", so you can plot multiple
time series in separated sub-graphs.

Menu path:    /View/Multiple graphs

# sdiff Transformations

Argument:   varlist

The seasonal difference of each variable in varlist is obtained and the
result stored in a new variable with the prefix sd_. This command is
available only for seasonal time series.

Menu path:    /Add/Seasonal differences of selected variables

# set Programming

Variants:   set variable value
            set --to-file=filename
            set --from-file=filename
            set stopwatch
            set
Examples:   set svd on
            set csv_delim tab
            set horizon 10
            set --to-file=mysettings.inp

The most common use of this command is the first variant shown above, where
it is used to set the value of a selected program parameter. This is
discussed in detail below. The other uses are: with --to-file, to write a
script file containing all the current parameter settings; with --from-file
to read a script file containing parameter settings and apply them to the
current session; with stopwatch to zero the gretl "stopwatch" which can be
used to measure CPU time (see the entry for the "$stopwatch" accessor); or,
if the word set is given alone, to print the current settings.

Values set via this command remain in force for the duration of the gretl
session unless they are changed by a further call to "set". The parameters
that can be set in this way are enumerated below.
Note that the settings of
hc_version, hac_lag and hac_kernel are used when the --robust option is
given to an estimation command.

The available settings are grouped under the following categories: program
interaction and behavior, numerical methods, random number generation,
robust estimation, filtering, time series estimation, and interaction with
GNU R.

Program interaction and behavior

These settings are used for controlling various aspects of the way gretl
interacts with the user.

    workdir: path. Sets the default directory for writing and reading files,
    whenever full paths are not specified.

    use_cwd: on or off (the default). Governs the setting of workdir at
    start-up: if it's on, the working directory is inherited from the shell,
    otherwise it is set to whatever was selected in the previous gretl
    session.

    echo: off or on (the default). Suppress or resume the echoing of commands
    in gretl's output.

    messages: off or on (the default). Suppress or resume the printing of
    non-error messages associated with various commands, for example when a
    new variable is generated or when the sample range is changed.

    verbose: off, on (the default) or comments. Acts as a "master switch" for
    echo and messages (see above), turning them both off or on simultaneously.
    The comments argument turns off echo and messages but preserves printing
    of comments in a script.

    warnings: off or on (the default). Suppress or resume the printing of
    warning messages issued when arithmetical operations produce non-finite
    values.

    csv_delim: either comma (the default), space, tab or semicolon. Sets the
    column delimiter used when saving data to file in CSV format.

    csv_write_na: the string used to represent missing values when writing
    data to file in CSV format. Maximum 7 characters; the default is NA.

    csv_read_na: the string taken to represent missing values (NAs) when
    reading data in CSV format. Maximum 7 characters. The default depends on
    whether a data column is found to contain numerical data (mostly) or
    string values. For numerical data the following are taken as indicating
    NAs: an empty cell, or any of the strings NA, N.A., na, n.a., N/A, #N/A,
    NaN, .NaN, ., .., -999, and -9999. For string-valued data only a blank
    cell, or a cell containing an empty string, is counted as NA. These
    defaults can be reimposed by giving default as the value for csv_read_na.
    To specify that only empty cells are read as NAs, give a value of "". Note
    that empty cells are always read as NAs regardless of the setting of this
    variable.

    csv_digits: a positive integer specifying the number of significant digits
    to use when writing data in CSV format. By default up to 15 digits are
    used depending on the precision of the original data. Note that CSV output
    employs the C library's fprintf function with "%g" conversion, which means
    that trailing zeros are dropped.

    display_digits: an integer from 3 to 6, specifying the number of
    significant digits to use when displaying regression coefficients and
    standard errors (the default being 6). This setting can also be used to
    limit the number of digits shown by the "summary" command; in this case
    the default (and also the maximum) is 5, or 4 when the --simple option is
    given.

    mwrite_g: on or off (the default). When writing a matrix to file as text,
    gretl by default uses scientific notation with 18-digit precision, hence
    ensuring that the stored values are a faithful representation of the
    numbers in memory.
    When writing primary data with no more than 6 digits of
    precision it may be preferable to use %g format for a more compact and
    human-readable file; you can make this switch via set mwrite_g on.

    force_decpoint: on or off (the default). Force gretl to use the decimal
    point character, in a locale where another character (most likely the
    comma) is the standard decimal separator.

    loop_maxiter: one non-negative integer value (default 100000). Sets the
    maximum number of iterations that a while loop is allowed before halting
    (see "loop"). Note that this setting only affects the while variant; its
    purpose is to guard against inadvertently infinite loops. Setting this
    value to 0 has the effect of disabling the limit; use with caution.

    max_verbose: off (the default), on or full. Controls the verbosity of
    commands and functions that use numerical optimization methods. The on
    choice applies only to functions (such as "BFGSmax" and "NRmax") which
    work silently by default; the effect is to print basic iteration
    information. The full setting can be used to trigger more detailed output,
    including parameter values and their respective gradient for the objective
    function at each iteration. This choice applies both to functions of the
    above-mentioned sort and to commands that rely on numerical optimization
    such as "arima", "probit" and "mle". In the case of commands the effect is
    to make their --verbose option produce more detail. See also chapter 37 of
    the Gretl User's Guide.

    debug: 1, 2 or 0 (the default). This is for use with user-defined
    functions. Setting debug to 1 is equivalent to turning messages on within
    all such functions; setting this variable to 2 has the additional effect
    of turning on max_verbose within all functions.

    shell_ok: on or off (the default).
    Enable launching external programs from
    gretl via the system shell. This is disabled by default for security
    reasons, and can only be enabled via the graphical user interface
    (Tools/Preferences/General). However, once set to on, this setting will
    remain active for future sessions until explicitly disabled.

    bfgs_verbskip: one integer. This setting affects the behavior of the
    --verbose option to those commands that use BFGS as an optimization
    algorithm and is used to compact output. If bfgs_verbskip is set to, say,
    3, then the --verbose switch will only print iterations 3, 6, 9 and so on.

    skip_missing: on (the default) or off. Controls gretl's behavior when
    constructing a matrix from data series: the default is to skip data rows
    that contain one or more missing values but if skip_missing is set off
    missing values are converted to NaNs.

    matrix_mask: the name of a series, or the keyword null. Offers greater
    control than skip_missing when constructing matrices from series: the data
    rows selected for matrices are those with non-zero (and non-missing)
    values in the specified series. The selected mask remains in force until
    it is replaced, or removed via the null keyword.

    quantile_type: must be one of Q6 (the default), Q7 or Q8. Selects the
    specific method used by the "quantile" function. For details see Hyndman
    and Fan (1996) or the Wikipedia entry at
    https://en.wikipedia.org/wiki/Quantile.

    huge: a large positive number (by default, 1.0E100). This setting controls
    the value returned by the accessor "$huge".

    assert: off (the default), warn or stop. Controls the consequences of
    failure (return value of 0) from the "assert" function.

    datacols: an integer from 1 to 15, with default value 5. Sets the maximum
    number of series shown side-by-side when data are displayed by
    observation.

    plot_collection: on, auto or off. This setting affects the way plots are
    displayed during interactive use. If it's on, plots of the same pixel size
    are gathered in a "plot collection", that is a single output window in
    which you can browse through the various plots going back and forth. With
    the off setting, instead, a different window for each plot will be
    generated, as in older gretl versions. Finally, the auto setting has the
    effect of enabling the plot collection mode only for graphs that are
    generated within 1.25 seconds from one another (for example, as a result
    of executing plotting commands in a loop).

Numerical methods

These settings are used for controlling the numerical algorithms that gretl
uses for estimation.

    optimizer: either auto (the default), BFGS or newton. Sets the
    optimization algorithm used for various ML estimators, in cases where both
    BFGS and Newton-Raphson are applicable. The default is to use
    Newton-Raphson where an analytical Hessian is available, otherwise BFGS.

    bhhh_maxiter: one integer, the maximum number of iterations for gretl's
    internal BHHH routine, which is used in the "arma" command for conditional
    ML estimation. If convergence is not achieved after bhhh_maxiter, the
    program returns an error. The default is set at 500.

    bhhh_toler: one floating point value, or the string default. This is used
    in gretl's internal BHHH routine to check if convergence has occurred. The
    algorithm stops iterating as soon as the increment in the log-likelihood
    between iterations is smaller than bhhh_toler. The default value is
    1.0E-06; this value may be re-established by typing default in place of a
    numeric value.

    bfgs_maxiter: one integer, the maximum number of iterations for gretl's
    BFGS routine, which is used for "mle", "gmm" and several specific
    estimators.
    If convergence is not achieved in the specified number of
    iterations, the program returns an error. The default value depends on the
    context, but is typically of the order of 500.

    bfgs_toler: one floating point value, or the string default. This is used
    in gretl's BFGS routine to check if convergence has occurred. The
    algorithm stops as soon as the relative improvement in the objective
    function between iterations is smaller than bfgs_toler. The default value
    is the machine precision to the power 3/4; this value may be
    re-established by typing default in place of a numeric value.

    bfgs_maxgrad: one floating point value. This is used in gretl's BFGS
    routine to check if the norm of the gradient is reasonably close to zero
    when the bfgs_toler criterion is met. A warning is printed if the norm of
    the gradient exceeds 1; an error is flagged if the norm exceeds
    bfgs_maxgrad. At present the default is the permissive value of 5.0.

    bfgs_richardson: on or off (the default). Use Richardson extrapolation
    when computing numerical derivatives in the context of BFGS maximization.

    initvals: the name of a predefined matrix. Allows manual setting of the
    initial parameter vector for certain estimation commands that involve
    numerical optimization: arma, garch, logit and probit, tobit and intreg,
    biprobit, duration, poisson, negbin, and also when imposing certain sorts
    of restriction associated with VECMs. Unlike other settings, initvals is
    not persistent: it resets to the default initializer after its first use.
    For details in connection with ARMA estimation see chapter 31 of the Gretl
    User's Guide.

    lbfgs: on or off (the default). Use the limited-memory version of BFGS
    (L-BFGS-B) instead of the ordinary algorithm. This may be advantageous
    when the function to be maximized is not globally concave.

    lbfgs_mem: an integer value in the range 3 to 20 (with a default value of
    8). This determines the number of corrections used in the limited memory
    matrix when L-BFGS-B is employed.

    nls_toler: a floating-point value. Sets the tolerance used in judging
    whether or not convergence has occurred in nonlinear least squares
    estimation using the "nls" command. The default value is the machine
    precision to the power 3/4; this value may be re-established by typing
    default in place of a numeric value.

    svd: on or off (the default). Use SVD rather than Cholesky or QR
    decomposition in least squares calculations. This option applies to the
    mols function as well as various internal calculations, but not to the
    regular "ols" command.

    force_qr: on or off (the default). This applies to the "ols" command. By
    default this command computes OLS estimates using Cholesky decomposition
    (the fastest method), with a fallback to QR if the data seem too
    ill-conditioned. You can use force_qr to skip the Cholesky step; in
    "doubtful" cases this may ensure greater accuracy.

    fcp: on or off (the default). Use the algorithm of Fiorentini, Calzolari
    and Panattoni rather than native gretl code when computing GARCH
    estimates.

    gmm_maxiter: one integer, the maximum number of iterations for gretl's
    "gmm" command when in iterated mode (as opposed to one- or two-step). The
    default value is 250.

    nadarwat_trim: one integer, the trim parameter used in the "nadarwat"
    function.

    fdjac_quality: one integer (0, 1 or 2), the algorithm used by the "fdjac"
    function; the default is 0.

Random number generation

    seed: an unsigned integer. Sets the seed for the pseudo-random number
    generator.
    By default this is set from the system time; if you want to
    generate repeatable sequences of random numbers you must set the seed
    manually.

Robust estimation

    bootrep: an integer. Sets the number of replications for the "restrict"
    command with the --bootstrap option.

    garch_vcv: unset, hessian, im (information matrix), op (outer product
    matrix), qml (QML estimator), bw (Bollerslev-Wooldridge). Specifies the
    variant that will be used for estimating the coefficient covariance
    matrix, for GARCH models. If unset is given (the default) then the Hessian
    is used unless the "robust" option is given for the garch command, in
    which case QML is used.

    arma_vcv: hessian (the default) or op (outer product matrix). Specifies
    the variant to be used when computing the covariance matrix for ARIMA
    models.

    force_hc: off (the default) or on. By default, with time-series data and
    when the --robust option is given with ols, the HAC estimator is used. If
    you set force_hc to "on", this forces calculation of the regular
    Heteroskedasticity Consistent Covariance Matrix (HCCM), which does not
    take autocorrelation into account. Note that VARs are treated as a special
    case: when the --robust option is given the default method is regular
    HCCM, but the --robust-hac flag can be used to force the use of a HAC
    estimator.

    robust_z: off (the default) or on. This controls the distribution used
    when calculating p-values based on robust standard errors in the context
    of least-squares estimators. By default gretl uses the Student t
    distribution but if robust_z is turned on the normal distribution is used.

    hac_lag: nw1 (the default), nw2, nw3 or an integer.
    Sets the maximum lag
    value or bandwidth, p, used when calculating HAC (Heteroskedasticity and
    Autocorrelation Consistent) standard errors using the Newey-West approach,
    for time series data. nw1 and nw2 represent two variant automatic
    calculations based on the sample size, T: for nw1, p = 0.75 * T^(1/3), and
    for nw2, p = 4 * (T/100)^(2/9). nw3 calls for data-based bandwidth
    selection. See also qs_bandwidth and hac_prewhiten below.

    hac_kernel: bartlett (the default), parzen, or qs (Quadratic Spectral).
    Sets the kernel, or pattern of weights, used when calculating HAC standard
    errors.

    hac_prewhiten: on or off (the default). Use Andrews-Monahan prewhitening
    and re-coloring when computing HAC standard errors. This also implies use
    of data-based bandwidth selection.

    hc_version: 0 (the default), 1, 2, 3 or 3a. Sets the variant used when
    calculating Heteroskedasticity Consistent standard errors with
    cross-sectional data. The first four options correspond to the HC0, HC1,
    HC2 and HC3 discussed by Davidson and MacKinnon in Econometric Theory and
    Methods, chapter 5. HC0 produces what are usually called "White's standard
    errors". Variant 3a is the MacKinnon-White "jackknife" procedure.

    pcse: off (the default) or on. By default, when estimating a model using
    pooled OLS on panel data with the --robust option, the Arellano estimator
    is used for the covariance matrix. If you set pcse to "on", this forces
    use of the Beck and Katz Panel Corrected Standard Errors (which do not
    take autocorrelation into account).

    qs_bandwidth: Bandwidth for HAC estimation in the case where the Quadratic
    Spectral kernel is selected. (Unlike the Bartlett and Parzen kernels, the
    QS bandwidth need not be an integer.)

Time series

    horizon: one integer (the default is based on the frequency of the data).
  Sets the horizon for impulse responses and forecast variance
  decompositions in the context of vector autoregressions.

  vecm_norm: phillips (the default), diag, first or none. Used in the
  context of VECM estimation via the "vecm" command for identifying the
  cointegration vectors. See chapter 33 of the Gretl User's Guide for
  details.

  boot_iters: one integer, B. Sets the number of bootstrap iterations used
  when computing impulse response functions with confidence intervals. The
  default is 1999. It is recommended that B + 1 is evenly divisible by
  100α/2, so for example with α = 0.1 B + 1 should be a multiple of 5. The
  minimum acceptable value is 499.

Interaction with R

  R_lib: on (the default) or off. When sending instructions to be executed
  by R, use the R shared library by preference to the R executable, if the
  library is available.

  R_functions: off (the default) or on. Recognize functions defined in R as
  if they were native functions (the namespace prefix "R." is required). See
  chapter 44 of the Gretl User's Guide for details on this and the previous
  item.

Miscellaneous

  mpi_use_smt: on or off (the default). This switch affects the default
  number of processes launched in an mpi block within a script. If the
  switch is off the default number of processes equals the number of
  physical cores on the local machine; if it's on the default is the maximum
  number of threads, which will be twice the number of physical cores if the
  cores support SMT (Simultaneous MultiThreading, also known as
  Hyper-Threading). This applies only if the user has not specified a number
  of processes, either directly or indirectly (by specifying a hosts file
  for use with MPI).

  graph_theme: a string, one of altpoints, classic, dark2 (the current
  default), ethan, iwanthue or sober.
  This sets the "theme" used for graphs
  produced by gretl. The classic option reverts to the single theme that was
  in force prior to version 2020c of gretl.

# setinfo Dataset

Argument:   series
Options:    --description=string (set description)
            --graph-name=string (set graph name)
            --discrete (mark series as discrete)
            --continuous (mark series as continuous)
            --coded (mark as an encoding)
            --numeric (mark as not an encoding)
            --midas (mark as component of high-frequency data)
Examples:   setinfo x1 --description="Description of x1"
            setinfo y --graph-name="Some string"
            setinfo z --discrete

If the options --description or --graph-name are invoked the argument must
be a single series, otherwise it may be a list of series in which case it
operates on all members of the list. This command sets up to four attributes
as follows.

If the --description flag is given followed by a string in double quotes,
that string is used to set the variable's descriptive label. This label is
shown in response to the "labels" command, and is also shown in the main
window of the GUI program.

If the --graph-name flag is given followed by a quoted string, that string
will be used in place of the variable's name in graphs.

If one or other of the --discrete or --continuous option flags is given, the
variable's numerical character is set accordingly. The default is to treat
all series as continuous; setting a series as discrete affects the way the
variable is handled in other commands and functions, such as "freq" or
"dummify".

If one or other of the --coded or --numeric option flags is given, the
status of the given series is set accordingly.
The default is to treat all
numerical values as meaningful as such, at least in an ordinal sense;
setting a series as coded means that the numerical values are an arbitrary
encoding of qualitative characteristics.

The --midas option sets a flag indicating that a given series holds data of
a higher frequency than the base frequency of the dataset; for example, the
dataset is quarterly and the series holds values for month 1, 2 or 3 of each
quarter. (MIDAS = Mixed Data Sampling.)

Menu path: /Variable/Edit attributes
Other access: Main window pop-up menu

# setmiss Dataset

Arguments:  value [ varlist ]
Examples:   setmiss -1
            setmiss 100 x2

Get the program to interpret some specific numerical data value (the first
parameter to the command) as a code for "missing", in the case of imported
data. If this value is the only parameter, as in the first example above,
the interpretation will be applied to all series in the data set. If "value"
is followed by a list of variables, by name or number, the interpretation is
confined to the specified variable(s). Thus in the second example the data
value 100 is interpreted as a code for "missing", but only for the variable
x2.

Menu path: /Data/Set missing value code

# setobs Dataset

Variants:   setobs periodicity startobs
            setobs unitvar timevar --panel-vars
Options:    --cross-section (interpret as cross section)
            --time-series (interpret as time series)
            --special-time-series (see below)
            --stacked-cross-section (interpret as panel data)
            --stacked-time-series (interpret as panel data)
            --panel-vars (use index variables, see below)
            --panel-time (see below)
            --panel-groups (see below)
Examples:   setobs 4 1990:1 --time-series
            setobs 12 1978:03
            setobs 1 1 --cross-section
            setobs 20 1:1 --stacked-time-series
            setobs unit year --panel-vars

This command forces the program to interpret the current data set as having
a specified structure.

In the first form of the command the periodicity, which must be an integer,
represents frequency in the case of time-series data (1 = annual; 4 =
quarterly; 12 = monthly; 52 = weekly; 5, 6, or 7 = daily; 24 = hourly). In
the case of panel data the periodicity means the number of lines per data
block: this corresponds to the number of cross-sectional units in the case
of stacked cross-sections, or the number of time periods in the case of
stacked time series. In the case of simple cross-sectional data the
periodicity should be set to 1.

The starting observation represents the starting date in the case of time
series data. Years may be given with two or four digits; subperiods (for
example, quarters or months) should be separated from the year with a colon.
In the case of panel data the starting observation should be given as 1:1;
and in the case of cross-sectional data, as 1. Starting observations for
daily or weekly data should be given in the form YYYY-MM-DD (or simply as 1
for undated data).
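
As an illustration of the first form, the following sketch builds an empty
dataset and imposes first a quarterly time-series structure, then a panel
structure. The dataset sizes (40 and 60 observations) are arbitrary choices
made for this example.

  # 40 observations read as quarterly data, 1990:1 to 1999:4
  nulldata 40
  setobs 4 1990:1 --time-series

  # 60 observations read as a panel: 3 units with 20 periods each
  nulldata 60
  setobs 20 1:1 --stacked-time-series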

Certain time-series periodicities have standard interpretations -- for
example, 12 = monthly and 4 = quarterly. If you have unusual time-series
data to which the standard interpretation does not apply, you can signal
this by giving the --special-time-series option. In that case gretl will not
(for example) report your frequency-12 data as being monthly.

If no explicit option flag is given to indicate the structure of the data
the program will attempt to guess the structure from the information given.

The second form of the command (which requires the --panel-vars flag) may be
used to impose a panel interpretation when the data set contains variables
that uniquely identify the cross-sectional units and the time periods. The
data set will be sorted as stacked time series, by ascending values of the
units variable, unitvar.

Panel-specific options

The --panel-time and --panel-groups options can only be used with a dataset
which has already been defined as a panel.

The purpose of --panel-time is to set extra information regarding the time
dimension of the panel. This should be given on the pattern of the first
form of setobs noted above. For example, the following may be used to
indicate that the time dimension of a panel is quarterly, starting in the
first quarter of 1990.

  setobs 4 1990:1 --panel-time

The purpose of --panel-groups is to create a string-valued series holding
names for the groups (individuals, cross-sectional units) in the panel.
(This will be used where appropriate in panel graphs.) With this option you
supply either one or two arguments as follows.

First case: the (single) argument is the name of a string-valued series. If
the number of distinct values equals the number of groups in the panel this
series is used to define the group names.
If necessary, the numerical
content of the series will be adjusted such that the values are all 1s for
the first group, all 2s for the second, and so on. If the number of string
values doesn't match the number of groups an error is flagged.

Second case: the first argument is the name of a series and the second is a
string literal or variable holding a name for each group. The series will be
created if it does not already exist. If the second argument is a string
literal or string variable the group names should be separated by spaces; if
a name includes spaces it should be wrapped in backslash-escaped
double-quotes. Alternatively the second argument may be an array of strings.

For example, the following will create a series named country in which the
names in cstrs are each repeated T times, T being the time-series length of
the panel.

  string cstrs = sprintf("France Germany Italy \"United Kingdom\"")
  setobs country cstrs --panel-groups

Menu path: /Data/Dataset structure

# setopt Programming

Arguments:  command [ action ] options
Examples:   setopt mle --hessian
            setopt ols persist --quiet
            setopt ols clear
            See also gdp_midas.inp

This command enables the pre-setting of options for a specified command.
Ordinarily this is not required, but it may be useful for the writers of
hansl functions when they wish to make certain command options conditional
on the value of an argument supplied by the caller.

For example, suppose a function offers a boolean "quiet" switch, whose
intended effect is to suppress the printing of results from a certain
regression executed within the function. In that case one might write:

  if quiet
      setopt ols --quiet
  endif
  ols ...

The --quiet option will then be applied to the next ols command if and only
if the variable quiet has a non-zero value.
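
The pattern above might be packaged as in the following sketch. The function
name quiet_ols and its parameters are hypothetical, chosen for this example,
and data4-1 is assumed to be available as one of the sample datasets
supplied with gretl.

  function void quiet_ols (series y, list X, bool quiet[0])
      if quiet
          # pre-set --quiet for the ols command that follows
          setopt ols --quiet
      endif
      ols y X
  end function

  open data4-1
  list X = const sqft bedrms
  quiet_ols(price, X, 1)   # estimates without printing results
  quiet_ols(price, X)      # quiet defaults to 0, so results are printed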

By default, options set in this way apply only to the following instance of
command; they are not persistent. However if you give persist as the value
for action the options will continue to apply to the given command until
further notice. The antidote to the persist action is clear: this erases any
stored setting for the specified command.

It should be noted that options set via setopt are compounded with any
options attached to the target command directly. So for example one might
append the --hessian option to an mle command unconditionally but use setopt
to add --quiet conditionally.

# shell Utilities

Argument:   shellcommand
Examples:   ! ls -al
            ! notepad
            launch notepad

An exclamation mark, "!", or the keyword "launch", at the beginning of a
command line is interpreted as an escape to the user's shell. Thus arbitrary
shell commands can be executed from within gretl. When "!" is used, the
external command is executed synchronously. That is, gretl waits for it to
complete before proceeding. If you want to start another program from within
gretl and not wait for its completion (asynchronous operation), use "launch"
instead.

For reasons of security this facility is not enabled by default. To activate
it, check the box titled "Allow shell commands" under
Tools/Preferences/General in the GUI program. This also makes shell commands
available in the command-line program (and is the only way to do so).

# smpl Dataset

Variants:   smpl startobs endobs
            smpl +i -j
            smpl dumvar --dummy
            smpl condition --restrict
            smpl --no-missing [ varlist ]
            smpl --no-all-missing [ varlist ]
            smpl --contiguous [ varlist ]
            smpl n --random
            smpl full
Options:    --dummy (argument is a dummy variable)
            --restrict (apply boolean restriction)
            --replace (replace any existing boolean restriction)
            --no-missing (restrict to valid observations)
            --no-all-missing (omit empty observations (see below))
            --contiguous (see below)
            --random (form random sub-sample)
            --permanent (see below)
            --balanced (panel data: try to retain balanced panel)
            --unit (panel data: sample in cross-sectional dimension)
            --quiet (don't report sample range)
Examples:   smpl 3 10
            smpl 1960:2 1982:4
            smpl +1 -1
            smpl x > 3000 --restrict
            smpl y > 3000 --restrict --replace
            smpl 100 --random

Resets the sample range. The new range can be defined in several ways. In
the first alternate form (and the first two examples) above, startobs and
endobs must be consistent with the periodicity of the data. Either one may
be replaced by a semicolon to leave the value unchanged. In the second form,
the integers i and j (which may be positive or negative, and should be
signed) are taken as offsets relative to the existing sample range. In the
third form dumvar must be an indicator variable with values 0 or 1 at each
observation; the sample will be restricted to observations where the value
is 1. The fourth form, using --restrict, restricts the sample to
observations that satisfy the given Boolean condition (which is specified
according to the syntax of the "genr" command).

The options --no-missing and --no-all-missing may be used to exclude from
the sample observations for which data are missing.
The first variant
excludes those rows in the dataset for which at least one variable has a
missing value, while the second excludes just those rows on which all
variables have missing values. In each case the test is confined to the
variables in varlist if this argument is given, otherwise it is applied to
all series -- with the qualification that in the case of --no-all-missing
and no varlist, the generic variables index and time are ignored.

The --contiguous form of smpl is intended for use with time series data. The
effect is to trim any observations at the start and end of the current
sample range that contain missing values (either for the variables in
varlist, or for all data series if no varlist is given). Then a check is
performed to see if there are any missing values in the remaining range; if
so, an error is flagged.

With the --random flag, the specified number of cases are selected from the
current dataset at random (without replacement). If you wish to be able to
replicate this selection you should set the seed for the random number
generator first (see the "set" command).

The final form, smpl full, restores the full data range.

Note that sample restrictions are, by default, cumulative: the baseline for
any smpl command is the current sample. If you wish the command to act so as
to replace any existing restriction you can add the option flag --replace to
the end of the command. (But this option is not compatible with the
--contiguous option.)

The internal variable obs may be used with the --restrict form of smpl to
exclude particular observations from the sample. For example

  smpl obs!=4 --restrict

will drop just the fourth observation. If the data points are identified by
labels,

  smpl obs!="USA" --restrict

will drop the observation with label "USA".
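
The following sketch shows how cumulative restrictions and the --replace
flag interact. It assumes the sample dataset data4-1 (with series price,
sqft and bedrms) is available; the threshold values are arbitrary.

  open data4-1
  smpl sqft > 1500 --restrict             # keep the larger houses
  smpl bedrms == 3 --restrict             # cumulative: both conditions hold
  smpl price > 300 --restrict --replace   # discard the earlier restrictions
  smpl full                               # restore the full data range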

One point should be noted about the --dummy, --restrict and --no-missing
forms of smpl: "structural" information in the data file (regarding the time
series or panel nature of the data) is likely to be lost when this command
is issued. You may reimpose structure with the "setobs" command. A related
option, for use with panel data, is the --balanced flag: this requests that
a balanced panel is reconstituted after sub-sampling, via the insertion of
"missing rows" if need be. But note that it is not always possible to comply
with this request.

The --unit option is specific to panel data: it allows you to specify a
range of "individuals" directly. For example:

  # limit the sample to the first 50 individuals
  smpl 1 50 --unit

By default, restrictions on the current sample range can be undone: you can
restore the full dataset via smpl full. However, the --permanent flag can be
used to substitute the restricted dataset for the original. If you give the
--permanent option with no other arguments or options the effect is to
shrink the dataset to the current sample range.

Please see chapter 5 of the Gretl User's Guide for further details.

Menu path: /Sample

# spearman Statistics

Arguments:  series1 series2
Option:     --verbose (print ranked data)

Prints Spearman's rank correlation coefficient for the series series1 and
series2. The variables do not have to be ranked manually in advance; the
command takes care of this.

The automatic ranking is from largest to smallest (i.e. the largest data
value gets rank 1). If you need to invert this ranking, create a new
variable which is the negative of the original. For example:

  series altx = -x
  spearman altx y

Menu path: /Tools/Nonparametric tests/Correlation

# sprintf Printing

Obsolete command: please use the "sprintf" function instead.
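
As a minimal sketch of the replacement, the "sprintf" function returns the
formatted string, which can be assigned to a string variable. The variable
names here are arbitrary.

  scalar x = 3.14159
  string s = sprintf("x = %.2f", x)
  printf "%s\n", s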

# square Transformations

Argument:   varlist
Option:     --cross (generate cross-products as well as squares)

Generates new series which are squares of the series in varlist (plus
cross-products if the --cross option is given). For example, "square x y"
will generate sq_x = x squared, sq_y = y squared and, if the --cross option
is given, x_y = x times y. A dummy variable is not squared, since its square
is identical to the variable itself.

Menu path: /Add/Squares of selected variables

# stdize Transformations

Argument:   varlist
Options:    --no-df-corr (no degrees of freedom correction)
            --center-only (don't divide by s.d.)

By default a standardized version of each of the series in varlist is
obtained and the result stored in a new series with the prefix s_. For
example, "stdize x y" creates the new series s_x and s_y, each of which is
centered and divided by its sample standard deviation (with a degrees of
freedom correction of 1).

If the --no-df-corr option is given no degrees of freedom correction is
applied; the standard deviation used is the maximum likelihood estimator. If
--center-only is given the series just have their means subtracted, and in
that case the output names have prefix c_ rather than s_.

The functionality of this command is available in somewhat more flexible
form via the "stdize" function.
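
The default calculation can be checked by hand, as in the following sketch.
It assumes the sample dataset data4-1 is available; the series names z and d
are arbitrary.

  open data4-1
  stdize price                 # creates s_price
  # center, then divide by the sample standard deviation
  series z = (price - mean(price)) / sd(price)
  series d = abs(z - s_price)
  printf "largest absolute difference: %g\n", max(d)

The printed difference should be zero up to rounding error.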

Menu path: /Add/Standardize selected variables

# store Dataset

Arguments:  filename [ varlist ]
Options:    --omit-obs (see below, on CSV format)
            --no-header (see below, on CSV format)
            --gnu-octave (use GNU Octave format)
            --gnu-R (format friendly for read.table)
            --gzipped[=level] (apply gzip compression)
            --jmulti (use JMulti ASCII format)
            --dat (use PcGive ASCII format)
            --decimal-comma (use comma as decimal character)
            --database (use gretl database format)
            --overwrite (see below, on database format)
            --comment=string (see below)
            --matrix=matrix-name (see below)
            --compat (gdtb compatibility, see below)

Save data to filename. By default all currently defined series are saved but
the optional varlist argument can be used to select a subset of series. If
the dataset is sub-sampled, only the observations in the current sample
range are saved.

The output file will be written in the currently set "workdir", unless the
filename string contains a full path specification.

Note that the store command behaves in a special manner in the context of a
"progressive loop"; see chapter 13 of the Gretl User's Guide for details.

Native formats

If filename has extension .gdt or .gdtb this implies saving the data in one
of gretl's native formats. In addition, if no extension is given .gdt is
taken to be implicit and the suffix is added automatically. The gdt format
is XML, optionally gzip-compressed, while the gdtb format is binary. The
former is recommended for datasets of moderate size (say, up to several
hundred kilobytes of data); the binary format is much faster for very large
datasets.

The gdtb format was revised in gretl 2021a (producing a huge write/read
speed-up for super-large datasets).
But if you wish to write a binary data
file readable by earlier gretl (2018c or higher) you should append the
--compat option.

When data are saved in gdt format the --gzipped option may be used for data
compression. The optional parameter for this flag controls the level of
compression (from 0 to 9): higher levels produce a smaller file, but
compression takes longer. The default level is 1; a level of 0 means that no
compression is applied.

Other formats

The format in which the data are written may be controlled to a degree by
the extension or suffix of filename, as follows:

  .csv: comma-separated values (CSV).

  .txt or .asc: space-separated values.

  .m: GNU Octave matrix format.

  .dta: Stata dta format (version 113).

The format-related option flags shown above can be used to force the choice
of format independently of the filename (or to get gretl to write in the
formats of PcGive or JMulTi).

CSV options

The option flags --omit-obs and --no-header are specific to saving data in
CSV format. By default, if the data are time series or panel, or if the
dataset includes specific observation markers, the output file includes a
first column identifying the observations (e.g. by date). If the --omit-obs
flag is given this column is omitted. The --no-header flag suppresses the
usual printing of the names of the variables at the top of the columns.

The option flag --decimal-comma is also confined to CSV. Its effect is to
replace the decimal point with decimal comma; in addition the column
separator is forced to be a semicolon rather than a comma.

Storing to a database

The option of saving in gretl database format is intended to help with the
construction of large sets of series with mixed frequencies and ranges of
observations.
At present this option is available only for annual, quarterly
or monthly time-series data. If you save to a file that already exists, the
default action is to append the newly saved series to the existing content
of the database. In this context it is an error if one or more of the
variables to be saved has the same name as a variable that is already
present in the database. The --overwrite flag has the effect that, if there
are variable names in common, the newly saved variable replaces the variable
of the same name in the original dataset.

The --comment option is available when saving data as a database or as CSV.
The required parameter is a double-quoted one-line string, attached to the
option flag with an equals sign. The string is inserted as a comment into
the database index file or at the top of the CSV output.

Writing a matrix as a dataset

The --matrix option requires a parameter, the name of a (non-empty) matrix.
The effect of store is then, in effect, to turn the matrix into a dataset
"in the background" and write it to file as such. Matrix columns become
series; their names are taken from column-names attached to the matrix, if
any, or by default are assigned as v1, v2 and so on. If the matrix has row
names attached these are used as "observation markers" in the dataset.

Note that matrices can be written to file in their own right, see the
"mwrite" function. But in some cases it may be useful to write them in
dataset mode.
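
A sketch of this usage follows. The matrix contents, the column names and
the output filename are all arbitrary, and cnameset is the function assumed
here for attaching column names. The nulldata line is included only so that
the example does not depend on whether store can run with no dataset in
place.

  nulldata 10
  matrix M = mnormal(10, 3)
  cnameset(M, "a b c")
  # the .csv suffix selects CSV output; columns are written as a, b and c
  store mdata.csv --matrix=M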

Menu path: /File/Save data; /File/Export data

# summary Statistics

Variants:   summary [ varlist ]
            summary --matrix=matname
Options:    --simple (basic statistics only)
            --weight=wvar (weighting variable)
            --by=byvar (see below)
Examples:   frontier.inp

In its first form, this command prints summary statistics for the variables
in varlist, or for all the variables in the data set if varlist is omitted.
By default, output consists of the mean, standard deviation (sd),
coefficient of variation (= sd/mean), median, minimum, maximum, skewness
coefficient, and excess kurtosis. If the --simple option is given, output is
restricted to the mean, minimum, maximum and standard deviation.

If the --by option is given (in which case the parameter byvar should be the
name of a discrete variable), then statistics are printed for sub-samples
corresponding to the distinct values taken on by byvar. For example, if
byvar is a (binary) dummy variable, statistics are given for the cases byvar
= 0 and byvar = 1. Note: at present, this option is incompatible with the
--weight option.

If the alternative form is given, using a named matrix, then summary
statistics are printed for each column of the matrix. The --by option is not
available in this case.

The table of statistics produced by summary can be retrieved in matrix form
via the "$result" accessor.

Menu path: /View/Summary statistics
Other access: Main window pop-up menu

# system Estimation

Variants:   system method=estimator
            sysname <- system
Examples:   "Klein Model 1" <- system
            system method=sur
            system method=3sls
            See also klein.inp, kmenta.inp, greene14_2.inp

Starts a system of equations.
Either of two forms of the command may be
given, depending on whether you wish to save the system for estimation in
more than one way or just estimate the system once.

To save the system you should assign it a name, as in the first example (if
the name contains spaces it must be surrounded by double quotes). In this
case you estimate the system using the "estimate" command. With a saved
system of equations, you are able to impose restrictions (including
cross-equation restrictions) using the "restrict" command.

Alternatively you can specify an estimator for the system using method=
followed by a string identifying one of the supported estimators: "ols"
(Ordinary Least Squares), "tsls" (Two-Stage Least Squares), "sur" (Seemingly
Unrelated Regressions), "3sls" (Three-Stage Least Squares), "fiml" (Full
Information Maximum Likelihood) or "liml" (Limited Information Maximum
Likelihood). In this case the system is estimated once its definition is
complete.

An equation system is terminated by the line "end system". Within the system
four sorts of statement may be given, as follows.

  "equation": specify an equation within the system.

  "instr": for a system to be estimated via Three-Stage Least Squares, a
  list of instruments (by variable name or number). Alternatively, you can
  put this information into the "equation" line using the same syntax as in
  the "tsls" command.

  "endog": for a system of simultaneous equations, a list of endogenous
  variables. This is primarily intended for use with FIML estimation, but
  with Three-Stage Least Squares this approach may be used instead of giving
  an "instr" list; then all the variables not identified as endogenous will
  be used as instruments.

  "identity": for use with FIML, an identity linking two or more of the
  variables in the system.
  This sort of statement is ignored when an
  estimator other than FIML is used.

After estimation using the "system" or "estimate" commands the following
accessors can be used to retrieve additional information:

  $uhat: the matrix of residuals, one column per equation.

  $yhat: matrix of fitted values, one column per equation.

  $coeff: column vector of coefficients (all the coefficients from the first
  equation, followed by those from the second equation, and so on).

  $vcv: covariance matrix of the coefficients. If there are k elements in
  the $coeff vector, this matrix is k by k.

  $sigma: cross-equation residual covariance matrix.

  $sysGamma, $sysA and $sysB: structural-form coefficient matrices (see
  below).

If you want to retrieve the residuals or fitted values for a specific
equation as a data series, select a column from the $uhat or $yhat matrix
and assign it to a series, as in

  series uh1 = $uhat[,1]

The structural-form matrices correspond to the following representation of a
simultaneous equations model:

  Gamma y(t) = A y(t-1) + B x(t) + e(t)

If there are n endogenous variables and k exogenous variables, Gamma is an n
x n matrix and B is n x k. If the system contains no lags of the endogenous
variables then the A matrix is not present. If the maximum lag of an
endogenous regressor is p, the A matrix is n x np.

Menu path: /Model/Simultaneous equations

# tabprint Printing

Options:    --output=filename (send output to specified file)
            --format="f1|f2|f3|f4" (Specify custom TeX format)
            --complete (TeX-related, see below)

Must follow the estimation of a model. Prints the model in tabular form. The
format is governed by the extension of the specified filename: ".tex" for
LaTeX, ".rtf" for RTF (Microsoft's Rich Text Format), or ".csv" for
comma-separated.
The file will be written in the currently set "workdir",
unless filename contains a full path specification.

If CSV format is selected, values are comma-separated unless the decimal
comma is in force, in which case the separator is the semicolon.

Options specific to LaTeX output

If the --complete flag is given the LaTeX file is a complete document, ready
for processing; otherwise it must be included in a document.

If you wish to alter the appearance of the tabular output, you can specify a
custom row format using the --format flag. The format string must be
enclosed in double quotes and must be tied to the flag with an equals sign.
The pattern for the format string is as follows. There are four fields,
representing the coefficient, standard error, t-ratio and p-value
respectively. These fields should be separated by vertical bars; they may
contain a printf-type specification for the formatting of the numeric value
in question, or may be left blank to suppress the printing of that column
(subject to the constraint that you can't leave all the columns blank). Here
are a few examples:

  --format="%.4f|%.4f|%.4f|%.4f"
  --format="%.4f|%.4f|%.3f|"
  --format="%.5f|%.4f||%.4f"
  --format="%.8g|%.8g||%.4f"

The first of these specifications prints the values in all columns using 4
decimal places. The second suppresses the p-value and prints the t-ratio to
3 places. The third omits the t-ratio. The last one again omits the t, and
prints both coefficient and standard error to 8 significant figures.

Once you set a custom format in this way, it is remembered and used for the
duration of the gretl session. To revert to the default format you can use
the special variant --format=default.
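
Putting the pieces together, a session might look like the following sketch.
The dataset data4-1 and the output filenames are assumptions made for this
example.

  open data4-1
  ols price const sqft bedrms
  # coefficient and standard error to 4 places, t-ratio to 3, no p-value
  tabprint --output=model.tex --format="%.4f|%.4f|%.3f|"
  # go back to the default row format for subsequent output
  tabprint --output=model2.tex --format=default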

Menu path:    Model window, /LaTeX

# textplot Graphs

Argument:   varlist
Options:    --time-series (plot by observation)
            --one-scale (force a single scale)
            --tall (use 40 rows)

Quick and simple ASCII graphics. Without the --time-series flag, varlist
must contain at least two series, the last of which is taken as the variable
for the x axis, and a scatter plot is produced. In this case the --tall
option may be used to produce a graph in which the y axis is represented by
40 rows of characters (the default is 20 rows).

With the --time-series flag, a plot by observation is produced. In this case
the option --one-scale may be used to force the use of a single scale;
otherwise if varlist contains more than one series the data may be scaled.
Each line represents an observation, with the data values plotted
horizontally.

See also "gnuplot".

# tobit Estimation

Arguments:  depvar indepvars
Options:    --llimit=lval (specify left bound)
            --rlimit=rval (specify right bound)
            --vcv (print covariance matrix)
            --robust (robust standard errors)
            --opg (see below)
            --cluster=clustvar (see "logit" for explanation)
            --verbose (print details of iterations)
            --quiet (don't print results)

Estimates a Tobit model, which may be appropriate when the dependent
variable is "censored". For example, positive and zero values of purchases
of durable goods on the part of individual households are observed, and no
negative values, yet decisions on such purchases may be thought of as
outcomes of an underlying, unobserved disposition to purchase that may be
negative in some cases.

By default it is assumed that the dependent variable is censored at zero on
the left and is uncensored on the right. However you can use the options
--llimit and --rlimit to specify a different pattern of censoring.
Note that
if you specify a right bound only, the assumption is then that the dependent
variable is uncensored on the left.

The Tobit model is a special case of interval regression. Please see the
"intreg" command for further details, including an account of the --robust
and --opg options.

Menu path:    /Model/Limited dependent variable/Tobit

# tsls Estimation

Arguments:  depvar indepvars ; instruments
Options:    --no-tests (don't do diagnostic tests)
            --vcv (print covariance matrix)
            --quiet (don't print results)
            --no-df-corr (no degrees-of-freedom correction)
            --robust (robust standard errors)
            --cluster=clustvar (clustered standard errors)
            --liml (use Limited Information Maximum Likelihood)
            --gmm (use the Generalized Method of Moments)
Examples:   tsls y1 0 y2 y3 x1 x2 ; 0 x1 x2 x3 x4 x5 x6
            See also penngrow.inp

Computes Instrumental Variables (IV) estimates, by default using two-stage
least squares (TSLS) but see below for further options. The dependent
variable is depvar, indepvars is the list of regressors (which is presumed
to include at least one endogenous variable); and instruments is the list of
instruments (exogenous and/or predetermined variables). If the instruments
list is not at least as long as indepvars, the model is not identified.

In the above example, the ys are endogenous and the xs are the exogenous
variables. Note that exogenous regressors should appear in both lists.

Output for two-stage least squares estimates includes the Hausman test and,
if the model is over-identified, the Sargan over-identification test. In the
Hausman test, the null hypothesis is that OLS estimates are consistent, or
in other words estimation by means of instrumental variables is not really
required. A model of this sort is over-identified if there are more
instruments than are strictly required.
The Sargan test is based on an
auxiliary regression of the residuals from the two-stage least squares model
on the full list of instruments. The null hypothesis is that all the
instruments are valid, and suspicion is thrown on this hypothesis if the
auxiliary regression has a significant degree of explanatory power. For a
good explanation of both tests see chapter 8 of Davidson and MacKinnon
(2004).

For both TSLS and LIML estimation, an additional test result is shown
provided that the model is estimated under the assumption of i.i.d. errors
(that is, the --robust option is not selected). This is a test for weakness
of the instruments. Weak instruments can lead to serious problems in IV
regression: biased estimates and/or incorrect size of hypothesis tests based
on the covariance matrix, with rejection rates well in excess of the nominal
significance level (Stock, Wright and Yogo, 2002). The test statistic is the
first-stage F-test if the model contains just one endogenous regressor,
otherwise it is the smallest eigenvalue of the matrix counterpart of the
first stage F. Critical values based on the Monte Carlo analysis of Stock
and Yogo (2003) are shown when available.

The R-squared value printed for models estimated via two-stage least squares
is the square of the correlation between the dependent variable and the
fitted values.

For details on the effects of the --robust and --cluster options, please see
the help for "ols".

As alternatives to TSLS, the model may be estimated via Limited Information
Maximum Likelihood (the --liml option) or via the Generalized Method of
Moments (--gmm option). Note that if the model is just identified these
methods should produce the same results as TSLS, but if it is
over-identified the results will differ in general.
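
To make the comparison of estimators concrete, here is a small sketch on
artificial data. All the series names (z1, z2, u, x, y), the seed and the
coefficient values are illustrative assumptions, not part of any gretl sample
file. There is one endogenous regressor and two excluded instruments, so the
model is over-identified by one.

  nulldata 500
  set seed 12345

  # two instruments, and a disturbance that also enters the regressor
  series z1 = normal()
  series z2 = normal()
  series u = normal()
  series x = 0.8*z1 + 0.5*z2 + 0.6*u + normal()
  series y = 1 + 2*x + u

  # default TSLS: Hausman, Sargan and weak-instrument tests are shown
  tsls y const x ; const z1 z2

  # same specification via LIML and via GMM
  tsls y const x ; const z1 z2 --liml
  tsls y const x ; const z1 z2 --gmm

Note that the constant appears in both lists, in line with the rule that
exogenous regressors should also be included among the instruments.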

If GMM estimation is selected, the following additional options become
available:

  --two-step: perform two-step GMM rather than the default of one-step.

  --iterate: iterate GMM to convergence.

  --weights=Wmat: specify a square matrix of weights to be used when
  computing the GMM criterion function. The dimension of this matrix must
  equal the number of instruments. The default is an appropriately sized
  identity matrix.

Menu path:    /Model/Instrumental variables

# var Estimation

Arguments:  order ylist [ ; xlist ]
Options:    --nc (do not include a constant)
            --trend (include a linear trend)
            --seasonals (include seasonal dummy variables)
            --robust (robust standard errors)
            --robust-hac (HAC standard errors)
            --quiet (skip output of individual equations)
            --silent (don't print anything)
            --impulse-responses (print impulse responses)
            --variance-decomp (print variance decompositions)
            --lagselect (show criteria for lag selection)
            --minlag=minimum lag (lag selection only, see below)
Examples:   var 4 x1 x2 x3 ; time mydum
            var 4 x1 x2 x3 --seasonals
            var 12 x1 x2 x3 --lagselect
            See also sw_ch14.inp

Sets up and estimates (using OLS) a vector autoregression (VAR). The first
argument specifies the lag order -- or the maximum lag order in case the
--lagselect option is given (see below). The order may be given numerically,
or as the name of a pre-existing scalar variable. Then follows the setup for
the first equation. Do not include lags among the elements of ylist -- they
will be added automatically. The semicolon separates the stochastic
variables, for which order lags will be included, from any exogenous
variables in xlist.
Note that a constant is included automatically unless
you give the --nc flag, a trend can be added with the --trend flag, and
seasonal dummy variables may be added using the --seasonals flag.

While a VAR specification usually includes all lags from 1 to a given
maximum, it is possible to select a specific set of lags. To do this,
substitute for the regular (scalar) order argument either the name of a
predefined vector or a comma-separated list of lags, enclosed in braces. We
show below two ways of specifying that a VAR should include lags 1, 2 and 4
(but not lag 3):

  var {1,2,4} ylist
  matrix p = {1,2,4}
  var p ylist

A separate regression is reported for each variable in ylist. Output for
each equation includes F-tests for zero restrictions on all lags of each of
the variables, an F-test for the significance of the maximum lag, and, if
the --impulse-responses flag is given, forecast variance decompositions and
impulse responses.

Forecast variance decompositions and impulse responses are based on the
Cholesky decomposition of the contemporaneous covariance matrix, and in this
context the order in which the (stochastic) variables are given matters. The
first variable in the list is assumed to be "most exogenous" within-period.
The horizon for variance decompositions and impulse responses can be set
using the "set" command. For retrieval of a specified impulse response
function in matrix form, see the "irf" function.

If the --robust option is given, standard errors are corrected for
heteroskedasticity. Alternatively, the --robust-hac option can be given to
produce standard errors that are robust with respect to both
heteroskedasticity and autocorrelation (HAC). In general the latter
correction should not be needed if the VAR includes sufficient lags.
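
As a rough illustration of the lag-subset syntax, the robust option and the
role of variable ordering, the sketch below assumes that the sample file
denmark.gdt is available and contains the quarterly series LRM, LRY and IBO.
The horizon of 12 periods is an arbitrary choice.

  open denmark.gdt

  # VAR with lags 1, 2 and 4 only, heteroskedasticity-robust errors
  matrix p = {1, 2, 4}
  var p LRM LRY --robust

  # responses traced over 12 periods; LRM is listed first and is
  # therefore treated as "most exogenous" in the Cholesky ordering
  set horizon 12
  var 2 LRM LRY IBO --impulse-responses

Reordering the series in the second call (say, IBO LRY LRM) would in general
change the reported impulse responses, since the Cholesky factor depends on
the ordering.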

If the --lagselect option is given, the first parameter to the var command
is taken as the maximum lag order. Output consists of a table showing the
values of the Akaike (AIC), Schwarz (BIC) and Hannan-Quinn (HQC) information
criteria, by default computed from VARs of order 1 to the given maximum.
This is intended to help with the selection of the optimal lag order. The
usual VAR output is not presented. The table of information criteria may be
retrieved as a matrix via the "$test" accessor. In this context (only) the
--minlag option can be used to adjust the minimum lag order. Set this to 0
to allow for the possibility that the optimal lag order is zero, meaning
that a VAR is not really called for at all. Conversely you could set
--minlag=4 if you believe you need at least 4 lags, thereby saving a little
compute time.

Menu path:    /Model/Multivariate time series

# varlist Dataset

Option:     --type=typename (scope of listing)

By default, prints a listing of the series in the current dataset (if any);
"ls" may be used as an alias.

If the --type option is given, it should be followed (after an equals sign)
by one of the following typenames: series, scalar, matrix, list, string,
bundle, array or accessor. The effect is to print the names of all currently
defined objects of the named type.

As a special case, if the typename is accessor, the names printed are those
of the internal variables currently available as "accessors", such as
"$nobs" and "$uhat", regardless of their specific type.

# vartest Tests

Arguments:  series1 series2

Calculates the F statistic for the null hypothesis that the population
variances for the variables series1 and series2 are equal, and shows its
p-value. The test statistic and the p-value can be retrieved through the
accessors "$test" and "$pvalue", respectively.
The following code

  open AWM18.gdt
  vartest EEN EXR
  eval $test
  eval $pvalue

computes the test and shows how to retrieve the test statistic and the
corresponding p-value afterwards:

  Equality of variances test

  EEN: Number of observations = 192
  EXR: Number of observations = 188
  Ratio of sample variances = 3.70707
  Null hypothesis: The two population variances are equal
  Test statistic: F(191,187) = 3.70707
  p-value (two-tailed) = 1.94866e-18

  3.7070716
  1.9486605e-18

Menu path:    /Tools/Test statistic calculator

# vecm Estimation

Arguments:  order rank ylist [ ; xlist ] [ ; rxlist ]
Options:    --nc (no constant)
            --rc (restricted constant)
            --uc (unrestricted constant)
            --crt (constant and restricted trend)
            --ct (constant and unrestricted trend)
            --seasonals (include centered seasonal dummies)
            --quiet (skip output of individual equations)
            --silent (don't print anything)
            --impulse-responses (print impulse responses)
            --variance-decomp (print variance decompositions)
Examples:   vecm 4 1 Y1 Y2 Y3
            vecm 3 2 Y1 Y2 Y3 --rc
            vecm 3 2 Y1 Y2 Y3 ; X1 --rc
            See also denmark.inp, hamilton.inp

A VECM is a form of vector autoregression or VAR (see "var"), applicable
where the variables in the model are individually integrated of order 1
(that is, are random walks, with or without drift), but exhibit
cointegration. This command is closely related to the Johansen test for
cointegration (see "johansen").

The order parameter to this command represents the lag order of the VAR
system. The number of lags in the VECM itself (where the dependent variable
is given as a first difference) is one less than order.

The rank parameter represents the cointegration rank, or in other words the
number of cointegrating vectors.
This must be greater than zero and less
than or equal to (generally, less than) the number of endogenous variables
given in ylist.

ylist supplies the list of endogenous variables, in levels. The inclusion of
deterministic terms in the model is controlled by the option flags. The
default if no option is specified is to include an "unrestricted constant",
which allows for the presence of a non-zero intercept in the cointegrating
relations as well as a trend in the levels of the endogenous variables. In
the literature stemming from the work of Johansen (see for example his 1995
book) this is often referred to as "case 3"; the --uc option selects it
explicitly. The options --nc, --rc, --crt and --ct produce cases 1, 2, 4 and
5 respectively. These option flags are mutually exclusive. The meaning of
these cases and the criteria for selecting a case are explained in chapter
33 of the Gretl User's Guide.

The optional lists xlist and rxlist allow you to specify sets of exogenous
variables which enter the model either unrestrictedly (xlist) or restricted
to the cointegration space (rxlist). These lists are separated from ylist
and from each other by semicolons.

The --seasonals option, which may be combined with any of the other options,
specifies the inclusion of a set of centered seasonal dummy variables. This
option is available only for quarterly or monthly data.

The first example above specifies a VECM with lag order 4 and a single
cointegrating vector. The endogenous variables are Y1, Y2 and Y3. The second
example uses the same variables but specifies a lag order of 3 and two
cointegrating vectors; it also specifies a "restricted constant", which is
appropriate if the cointegrating vectors may have a non-zero intercept but
the Y variables have no trend.
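
For a version of these examples that can be run on real data, the following
sketch assumes that the sample file denmark.gdt is installed and holds the
quarterly series LRM, LRY, IBO and IDE. The choice of lag order 2 and rank 1
is for illustration only.

  open denmark.gdt

  # Johansen test first, to get an idea of the cointegration rank
  johansen 2 LRM LRY IBO IDE --rc --seasonals

  # VAR order 2 in levels, so one lagged difference in the VECM;
  # rank 1, restricted constant (case 2), centered seasonal dummies
  vecm 2 1 LRM LRY IBO IDE --rc --seasonals

Giving the same deterministic-term option to both commands keeps the test and
the estimated model on the same footing.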

Following estimation of a VECM some special accessors are available:
$jalpha, $jbeta and $jvbeta retrieve, respectively, the alpha and beta
matrices and the estimated variance of beta. For retrieval of a specified
impulse response function in matrix form, see the "irf" function.

Menu path:    /Model/Multivariate time series

# vif Tests

Option:     --quiet (don't print anything)
Examples:   longley.inp

Must follow the estimation of a model which includes at least two
independent variables. Calculates and displays diagnostic information
pertaining to collinearity.

The Variance Inflation Factor or VIF for regressor j is defined as

  1/(1 - R_j^2)

where R_j is the coefficient of multiple correlation between regressor j and
the other regressors. The factor has a minimum value of 1.0 when the
variable in question is orthogonal to the other independent variables.
Neter, Wasserman, and Kutner (1990) suggest inspecting the largest VIF as a
diagnostic for collinearity; a value greater than 10 is sometimes taken as
indicating a problematic degree of collinearity.

Following this command the "$result" accessor may be used to retrieve a
column vector holding the VIFs. For a more sophisticated approach to
diagnosing collinearity, see the "bkw" command.

Menu path:    Model window, /Analysis/Collinearity

# wls Estimation

Arguments:  wtvar depvar indepvars
Options:    --vcv (print covariance matrix)
            --robust (robust standard errors)
            --quiet (suppress printing of results)
            --allow-zeros (see below)

Computes weighted least squares (WLS) estimates using wtvar as the weight,
depvar as the dependent variable, and indepvars as the list of independent
variables. Let w denote the positive square root of wtvar; then WLS is
basically equivalent to an OLS regression of w * depvar on w * indepvars.
The R-squared, however, is calculated in a special manner, namely as

  R^2 = 1 - ESS / WTSS

where ESS is the error sum of squares (sum of squared residuals) from the
weighted regression and WTSS denotes the "weighted total sum of squares",
which equals the sum of squared residuals from a regression of the weighted
dependent variable on the weighted constant alone.

As a special case, if wtvar is a 0/1 dummy variable, WLS estimation is
equivalent to OLS on a sample that excludes all observations with value zero
for wtvar. Otherwise including weights of zero is considered an error, but
if you really want to mix zero weights with positive ones you can append the
--allow-zeros option.

For weighted least squares estimation applied to panel data and based on the
unit-specific error variances please see the "panel" command with the
--unit-weights option.

Menu path:    /Model/Other linear models/Weighted Least Squares

# xcorrgm Statistics

Arguments:  series1 series2 [ order ]
Options:    --plot=mode-or-filename (see below)
            --quiet (suppress plot)
Example:    xcorrgm x y 12

Prints and graphs the cross-correlogram for series1 and series2, which may
be specified by name or number. The values are the sample correlation
coefficients between the current value of series1 and successive leads and
lags of series2.

If an order value is specified the length of the cross-correlogram is
limited to at most that number of leads and lags, otherwise the length is
determined automatically, as a function of the frequency of the data and the
number of observations.

By default, a plot of the cross-correlogram is produced: a gnuplot graph in
interactive mode or an ASCII graphic in batch mode. This can be adjusted via
the --plot option.
The acceptable parameters to this option are none (to
suppress the plot); ascii (to produce a text graphic even when in
interactive mode); display (to produce a gnuplot graph even when in batch
mode); or a file name. The effect of providing a file name is as described
for the --output option of the "gnuplot" command.

Menu path:    /View/Cross-correlogram
Other access: Main window pop-up menu (multiple selection)

# xtab Statistics

Arguments:  ylist [ ; xlist ]
Options:    --row (display row percentages)
            --column (display column percentages)
            --zeros (display zero entries)
            --no-totals (suppress printing of marginal counts)
            --matrix=matname (use frequencies from named matrix)
            --quiet (suppress printed output)
            --tex[=filename] (output as LaTeX)
            --equal (see the LaTeX case below)
Examples:   xtab 1 2
            xtab 1 ; 2 3 4
            xtab --matrix=A
            xtab 1 2 --tex="xtab.tex"
            See also ooballot.inp

Given just the ylist argument, computes (and by default prints) a
contingency table or cross-tabulation for each combination of the variables
included in the list. If a second list xlist is given, each variable in
ylist is cross-tabulated by row against each variable in xlist (by column).
Variables in these lists can be referenced by name or by number. Note that
all the variables must have been marked as discrete. Alternatively, if the
--matrix option is given, the named matrix is treated as a precomputed set
of frequencies, to be displayed as a cross-tabulation (see also the "mxtab"
function). In this case the list argument(s) should be omitted.

By default the cell entries are given as frequency counts. The --row and
--column options (which are mutually exclusive) replace the counts with the
percentages for each row or column, respectively.
By default, cells with a
zero count are left blank but the --zeros option has the effect of showing
zero counts explicitly, which may be useful for importing the table into
another program, such as a spreadsheet.

Pearson's chi-square test for independence is shown if the expected
frequency under independence is at least 1.0e-7 for all cells. A common rule
of thumb for the validity of this statistic is that at least 80 percent of
cells should have expected frequencies of 5 or greater; if this criterion is
not met a warning is printed.

If the contingency table is 2 by 2, Fisher's Exact Test for independence is
shown. Note that this test is based on the assumption that the row and
column totals are fixed, which may or may not be appropriate depending on
how the data were generated. The left p-value should be used when the
alternative to independence is negative association (values tend to cluster
in the lower left and upper right cells), the right p-value when the
alternative is positive association. The two-tailed p-value for this test is
calculated by method (b) in section 2.1 of Agresti (1992): it is the sum of
the probabilities of all possible tables with the given row and column
totals and a probability no greater than that of the observed table.

The bivariate case

In the case of a bivariate cross-tabulation (only one list is given, and it
has two members) certain results are stored. The contingency table may be
retrieved in matrix form via the "$result" accessor. In addition, if the
minimum expected value condition is met, the Pearson chi-square test and its
p-value may be retrieved via the "$test" and "$pvalue" accessors. If it's
these results that are of interest, the --quiet option can be used to
suppress the usual printout.
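
The retrieval of stored results in the bivariate case might look as follows.
This is a sketch on artificial data: the series names a and b, the seed and
the way the two 0/1 variables are generated are all illustrative assumptions.

  nulldata 200
  set seed 4321

  # two binary series, with b made loosely dependent on a
  series a = uniform() > 0.5
  series b = (uniform() + 0.3*a) > 0.6
  discrete a b

  # one list with two members: results are stored, printout suppressed
  xtab a b --quiet
  matrix T = $result
  scalar chi2 = $test
  scalar pv = $pvalue

  print T
  printf "Pearson chi-square = %g, p-value = %g\n", chi2, pv

The accessors are copied into named variables straight after the xtab call,
so that they are not lost if a later command overwrites them.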

LaTeX output

If the --tex option is given the cross-tabulation is printed in the form of
a LaTeX tabular environment, either inline (from where it may be copied and
pasted) or, if the filename parameter is appended, to the specified file.
(If filename does not specify a full path the file is written in the
currently set "workdir".) No test statistic is computed. The additional
option --equal can be used to flag, by printing in boldface, the count or
percentage for cells in which the row and column variables have the same
numerical value. This option is ignored unless the --tex option is given,
and also when one or both of the cross-tabulated variables are
string-valued.
